GCVSPL

From Software Product Documentation
Jump to navigation Jump to search
Language:  English  • français • italiano • português • español 

GCVSPL was implemented based on the following article:


H.J. Woltring (1986), A FORTRAN package for generalized, cross-validatory spline smoothing and differentiation. Advances in Engineering Software 8(2):104-113 (U.K.).


The command is used like all other filter commands in Visual3D.

For example, to filter all TARGET signals in the ACTIVE FILES.

GCVSPL
/SIGNAL_TYPES=TARGET
! /SIGNAL_NAMES=
! /SIGNAL_FOLDER=ORIGINAL
! /SIGNAL_COMPONENTS=
! /RESULT_FOLDER=PROCESSED
! /EVENT_SEQUENCE=
! /EXCLUDE_EVENTS=
! /NUM_SPLINE_ORDER=2
! /ERROR_VARIANCE=0.01
! /OPTIMIZATION_MODE=3
! /DERIVATIVE_ORDER=0
! /MAX_GAP=0
! /FILL_GAPS=FALSE
;

Num_Spline_Order

1 = linear
2 = cubic
3 = quintic
4 = heptic splines.

Error_Variance

0 = an interpolating spline is calculated.
<0 & Optimization Mode= 2 the smoothing parameter is determined by minimizing the Generalized Cross-Validation function
>0 & Optimization Mode= 1 the smoothing parameter is specified by the variance
>0 & Optimization Mode= 3 the smoothing parameter is determined so as to minimize an estimate of the true mean squared error, which depends on the variance.

Woltring's optimization mode= 4 has not been implemented.

The default value of the error variance is 0.01

This default value was set many years ago and in retrospect was not the ideal choice. In order to maintain backwards compatibility, we cannot modify the default.

The recommended value of the error variance is 0.0001 m^2

Cutoff Frequency

The relationship between the variance and the cutoff frequency was declared here:

[[1] and further described here:
From Biomch-L


First account for a pass butterworth filter (e.g. a fourth order butterworth filter)
Modified_cut_off_freq= cut_off_freq / 0.802
Variance = sampling_freq/(2*PI*Modified_cut_off_freq)^(2*order)

Cutoff Frequency Test

Create a SIN wave
Filter the SIN wave with a cutoff frequency equal to the SIN wave frequency
The resulting signal should have a magnitude (1/sqrt(2)) times the original magnitude.
And should yield the same result as a LOWPASS filter with a 12 Hz cutoff
An example test script in Visual3D
! create a 12 Hz SIN wave at the ANALOG frequency of the file opened in the Visual3D workspace.
Set_Pipeline_Parameter_From_Expression
/PARAMETER_NAME=CUTOFF
/EXPRESSION= 12/0.802
/AS_INTEGER=FALSE
;
Set_Pipeline_Parameter_From_Expression
/PARAMETER_NAME=VARIANCE
/EXPRESSION=PARAMETERS::ANALOG::RATE/(2*pi()*&::CUTOFF&)^4
/AS_INTEGER=FALSE
;
Evaluate_Expression
/EXPRESSION=SIN(2*PI()*12*FRAME_NUMBERS::ORIGINAL::ANALOGTIME)
! /SIGNAL_TYPES=
! /SIGNAL_FOLDER=ORIGINAL
! /SIGNAL_NAMES=
/RESULT_TYPES=DERIVED
/RESULT_FOLDERS=PROCESSED
/RESULT_NAME=SCOTT
! /APPLY_AS_SUFFIX_TO_SIGNAL_NAME=FALSE
;
GCVSPL
/SIGNAL_TYPES=DERIVED
/SIGNAL_FOLDER=PROCESSED
/SIGNAL_NAMES=SCOTT
! /SIGNAL_COMPONENTS=
/RESULT_FOLDERS=FILTERED
! /EVENT_SEQUENCE=
! /EXCLUDE_EVENTS=
! /NUM_SPLINE_ORDER=2
/ERROR_VARIANCE=::VARIANCE
! /OPTIMIZATION_MODE=1
! /DERIVATIVE_ORDER=0
! /MAX_GAP=0
! /FILL_GAPS=FALSE
;
Lowpass_Filter
/SIGNAL_TYPES=DERIVED
/SIGNAL_FOLDER=PROCESSED
/SIGNAL_NAMES=SCOTT
/RESULT_FOLDER=LOWPASS
! /RESULT_SUFFIX=
! /FILTER_CLASS=BUTTERWORTH
/FREQUENCY_CUTOFF=12
/NUM_REFLECTED=0
! /NUM_EXTRAPOLATED=0
/TOTAL_BUFFER_SIZE=0
! /NUM_BIDIRECTIONAL_PASSES=1
;

ISB Information

The ISB website contains the following information on the algorithm


For large datasets (N >> 0) and negligible boundary artefacts, the behaviour of a natural spline approximates that of a periodic spline. For the latter case, the frequency characteristic in the equidistantly sampled, uniformly weighted case is that of a double, phase-symmetric Butterworth filter, with transfer function H(w) = [1 + (w/wo)^2M]^-1, where w is the frequency, wo = (p*T)^(-0.5/M) the filter's cut-off frequency, p the smoothing parameter, T the sampling interval, and 2M the order of the spline. If T is expressed in seconds, the frequen- cies are expressed in radians/second.


It has been found empirically, that the effective number of estimated spline parameters Np is related to the Butterworth cut-off frequency wo as Np ~ M/2 + KM * wo * N * T, where Np ranges between M and N, and where KM is the integral over x from 0 to infinity of (1 + x^2M)^-1 divided by PI. For large M, KM approaches 1/PI from above; values for small M are: K1 = 1/2, K2 = 1/V8, K3 = 1/3. This relation has also been found to apply for uniformly weighted data which are sampled slightly anequidistantly, with T taken as the average sampling inter- val. For large Np, the relation with wo * N * T becomes nonlinear.

Remarks from the original Woltring code

  1. GVCSPL calculates a natural spline of order 2*M (degree 2*M-1) which smooths or interpolates a given set of data points, using statistical considerations to determine the amount of smoothing required (Craven & Wahba, 1979). If the error variance is a priori known, it should be supplied to the routine in VAR , for |MD|=3. The degree of smoothing is then determined to minimize an unbiased estimate of the true mean squared error. On the other hand, if the error variance is not known, , one may select |MD|=2, and VAR should be set to a negative number. The routine then determines the degree of smoothing to minimize the generalized cross validation function. This is asymptotically the same as minimizing the true mean squared error (Craven & Wahba, 1979). If the estimates from |MD|=2 or 3 do not appear suitable to the user (as apparent from the smoothness of the M-th derivative or from the effective number of degrees of freedom returned in WK(3) ), the user may select an other value for the noise variance if |MD|=3, or a reasonably large number of degrees of freedom if |MD|=4. If |MD|=1, the procedure is non-iterative, and returns a spline for the given value of the smoothing parameter p as entered in VAL.
  2. The number of arithmetic operations and the amount of storage required are both proportional to n, so very large datasets may be accomodated. The data points do not have to be equidistant in the independant variable X or uniformly weighted in the dependant variable Y.
  3. If |MD|=3 (a priori known noise variance), any value of N.ge.2*M is acceptable. However, it is advisable for N-2*M be rather large (at least 20) if |MD|=2 (GCV).
  4. For |MD| > 1, GCVSPL tries to iteratively minimize the selected criterion function. This minimum is unique for |MD| = 4, but not necessarily for |MD| = 2 or 3. Consequently, local optima rather that the global optimum might be found, and some actual findings suggest that local optima might yield more meaningful results than the global optimum if N is small. Therefore, the user has some control over the search procedure. If MD > 1, the iterative search starts from a value which yields a number of degrees of freedom which is approximately equal to N/2, until the first (local) minimum is found via a golden section search procedure (Utreras, 1980). If MD < -1, the value for p contained in WK(4) is used instead. Thus, if MD = 2 or 3 yield too noisy an estimate, the user might try |MD| = 1 or 4, for suitably selected values for p or for the number of degrees of freedom, and then run GCVSPL with MD = -2 or -3. The contents of N, M, K, X, WX, WY, and WK are assumed unchanged since the last call to GCVSPL if MD < 0.
  5. When VAR is a priori known, any value of N.ge.2*M is acceptable. It is advisable, however, for N to be rather large (if M.eq.2, about 20) when VAR is unknown. If the degree of smoothing done by GCVSPL when VAR is unknown is not satisfactory, the user should try specifying the degree of smoothing by setting VAR to a reasonable value.
  6. GCVSPL calculates the spline coefficient array C(N); this array can be used to calculate the spline function value and any of its derivatives up to the degree 2*M-1 at any argument T within the knot range, using subroutines SPLDER and SEARCH, and the knot array X(N). Since the spline is constrained at its Mth derivative, only the lower spline derivatives will tend to be reliable estimates of the underlying signal's true derivatives.
  7. GCVSPL combines elements of subroutine CRVO5 by Utreras (1980), subroutine SMOOTH by Lyche et al. (1983), and subroutine CUBGCV by Hutchinson (1985). The trace of the influence matrix is assessed in a similar way as described by Hutchinson & de Hoog (1985). The major difference is that the present approach utilizes non-symmetrical B-spline design matrices as described by Lyche et al. (1983); therefore, the original algorithm by Erisman & Tinney (1975) has been used, rather than the symmetrical version adopted by Hutchinson & de Hoog.
Retrieved from ""