Next: Iterative multivariate surrogates Up: Fourier based surrogates Previous: Example: Southern oscillation index

Periodicity artefacts

 

The randomisation schemes discussed so far all base the quantification of linear correlations on the Fourier amplitudes of the data. Unfortunately, this is not exactly what we want. Remember that the autocorrelation structure given by
$$ C(\tau) = \frac{1}{N-\tau} \sum_{n=\tau+1}^{N} s_n s_{n-\tau} $$
corresponds to the Fourier amplitudes only if the time series is one period of a sequence that repeats itself every N time steps. This is, however, not what we believe to be the case. Neither is it compatible with the null hypothesis. Conserving the Fourier amplitudes of the data means that the periodic auto-covariance function
$$ C_p(\tau) = \frac{1}{N} \sum_{n=1}^{N} s_n s_{\mathrm{mod}(n-\tau-1,N)+1} $$
is reproduced, rather than $C(\tau)$. This seemingly harmless difference can lead to serious artefacts in the surrogates and, consequently, to spurious rejections in a test. In particular, any mismatch between the beginning and the end of a time series poses problems, as discussed e.g. in Ref. [7]. In spectral estimation, problems caused by edge effects are dealt with by windowing and zero padding. Neither of these techniques has been successfully implemented for the phase randomisation of surrogates, since they destroy the invertibility of the transform.
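The difference between the two estimators can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the test signal (a linear ramp with a large end point mismatch) is chosen here only for illustration. It shows that the Fourier amplitudes alone determine the periodic auto-covariance, while the non-periodic estimate differs from it.

```python
import numpy as np

def autocov(s, tau):
    """Non-periodic estimate: products s_n * s_{n-tau} over the N - tau available pairs."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    return np.dot(s[tau:], s[:n - tau]) / (n - tau)

def periodic_autocov(s, tau):
    """Periodic estimate: the index n - tau wraps around modulo N."""
    s = np.asarray(s, dtype=float)
    return np.dot(s, np.roll(s, tau)) / len(s)

def periodic_autocov_from_amplitudes(s):
    """All lags of the periodic estimate, computed from the Fourier amplitudes only."""
    s = np.asarray(s, dtype=float)
    power = np.abs(np.fft.fft(s)) ** 2          # phases are discarded here
    return np.fft.ifft(power).real / len(s)

# Illustrative signal (an assumption of this sketch): a ramp whose ends do not match.
ramp = np.linspace(-1.0, 1.0, 64)
from_amplitudes = periodic_autocov_from_amplitudes(ramp)
for tau in (1, 2, 5):
    print(tau, autocov(ramp, tau), periodic_autocov(ramp, tau), from_amplitudes[tau])
```

For each lag, the last two printed columns should agree to rounding error, whereas the first column differs because the wrapped products across the end point mismatch enter only the periodic estimate.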

Figure 6: Effect of end point mismatch on Fourier based surrogates. Upper trace: 1500 iterates of the almost unstable AR(2) process discussed in the text. Lower trace: a surrogate sequence with the same Fourier amplitudes. Observe the additional "crinkliness" of the surrogate.

Let us illustrate the artefact generated by an end point mismatch with an example. In order to generate an effect that is large enough to be detected visually, consider 1500 iterates of the almost unstable AR(2) process, tex2html_wrap_inline2056 (upper trace of Fig. 6). The sequence is highly correlated and there is a rather big difference between the first and the last points. Upon periodic continuation, we see a jump between $s_N$ and $s_1$. Such a jump has spectral power at all frequencies but with delicately tuned phases. In surrogate time series conserving the Fourier amplitudes, the phases are randomised and the spectral content of the jump is spread in time. In the surrogate sequence shown as the lower trace in Fig. 6, the additional spectral power is mainly visible as a high frequency component. It is quite clear that the difference between the data and such surrogates will easily be picked up by, say, a nonlinear predictor, and can lead to spurious rejections of the null hypothesis.
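The mechanism can be reproduced with a short numerical experiment. The sketch below is illustrative only: the function names are hypothetical, and the AR(2) coefficients (1.3 and -0.31, giving one characteristic root close to the unit circle), the seed, and the burn-in length are assumptions of this example rather than values taken from the text. It relies on the fact that the sum of squared increments over the periodically continued series depends only on the Fourier amplitudes, so whatever power sits in the wrap-around jump of the data must reappear as extra roughness inside the surrogate.

```python
import numpy as np

def ar2(n, a1=1.3, a2=-0.31, seed=0, burn_in=1000):
    """Iterate s_i = a1*s_{i-1} + a2*s_{i-2} + noise (coefficients are assumed values)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n + burn_in)
    s = np.zeros(n + burn_in)
    for i in range(2, n + burn_in):
        s[i] = a1 * s[i - 1] + a2 * s[i - 2] + noise[i]
    return s[burn_in:]                          # drop the transient

def phase_randomised_surrogate(s, rng):
    """Surrogate with the same Fourier amplitudes as s and randomised phases."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    spectrum = np.fft.rfft(s)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    phases[0] = 0.0                             # leave the mean untouched
    if n % 2 == 0:
        phases[-1] = 0.0                        # Nyquist term has to stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def interior_roughness(x):
    """Sum of squared increments inside the series (no wrap-around term)."""
    return float(np.sum(np.diff(x) ** 2))

def wrap_jump(x):
    """Squared jump between last and first point upon periodic continuation."""
    return float((x[0] - x[-1]) ** 2)

data = ar2(1500)
surrogate = phase_randomised_surrogate(data, np.random.default_rng(1))
print("data:      interior", interior_roughness(data), "jump", wrap_jump(data))
print("surrogate: interior", interior_roughness(surrogate), "jump", wrap_jump(surrogate))
```

The sum of the two printed numbers is the same for both series. When the jump of the data is large and that of the surrogate is small, the interior roughness of the surrogate is correspondingly larger, which is the "crinkliness" referred to in the text.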

Figure 7: Repair of end point mismatch by selecting a sub-sequence of length 1350 of the signal shown in Fig. 6 that has an almost perfect match of end points. The surrogate shows no spurious high frequency structure.

The problem of non-matching ends can often be overcome by choosing a sub-interval of the recording such that the end points do match as closely as possible [33]. The possibly remaining finite phase slip at the matching points usually is of lesser importance. It can become dominant, though, if the signal is otherwise rather smooth. As a systematic strategy, let us propose to measure the end point mismatch by
$$ \gamma_{\rm jump} = \frac{(s_1 - s_N)^2}{\sum_{n=1}^{N} (s_n - \langle s \rangle)^2} $$
and the mismatch in the first derivative by
$$ \gamma_{\rm slip} = \frac{[(s_2 - s_1) - (s_N - s_{N-1})]^2}{\sum_{n=1}^{N} (s_n - \langle s \rangle)^2} $$
The fractions $\gamma_{\rm jump}$ and $\gamma_{\rm slip}$ give the contributions to the total power of the series of the mismatch of the end points and of the first derivatives, respectively. For the series shown in Fig. 6, tex2html_wrap_inline2072 and the end effect dominates the high frequency end of the spectrum. By systematically going through shorter and shorter sub-sequences of the data, we find that a segment of 1350 points starting at sample 102 yields tex2html_wrap_inline2074 or an almost perfect match. That sequence is shown as the upper trace of Fig. 7, together with a surrogate (lower trace). The spurious "crinkliness" is removed.
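A systematic search of this kind might be sketched as follows. This is a brute-force illustration under stated assumptions, not the procedure actually used for the figures: the function names, the equal weighting of the two mismatch measures (`w_jump`, `w_slip`), and the lower bound `min_len` on the segment length are all choices made for this example. Both measures are normalised by the sum of squared deviations from the segment mean, which is evaluated with cumulative sums so that every start point and length can be scanned.

```python
import numpy as np

def mismatch(s):
    """Return (jump, slip): end point and first-derivative mismatch relative to total power."""
    s = np.asarray(s, dtype=float)
    power = np.sum((s - s.mean()) ** 2)
    jump = (s[0] - s[-1]) ** 2 / power
    slip = ((s[1] - s[0]) - (s[-1] - s[-2])) ** 2 / power
    return float(jump), float(slip)

def best_segment(s, min_len, w_jump=1.0, w_slip=1.0):
    """Scan all segments with at least min_len points; return (cost, start, length).

    The cost is w_jump*jump + w_slip*slip; the weights are an assumption of this sketch.
    On ties, the longest segment found first is kept.
    """
    s = np.asarray(s, dtype=float)
    n = len(s)
    if not 3 <= min_len <= n:
        raise ValueError("min_len must lie between 3 and len(s)")
    c1 = np.concatenate(([0.0], np.cumsum(s)))        # prefix sums of s
    c2 = np.concatenate(([0.0], np.cumsum(s * s)))    # prefix sums of s^2
    best = (np.inf, 0, n)
    for length in range(n, min_len - 1, -1):
        start = np.arange(n - length + 1)
        last = start + length - 1
        total = c1[last + 1] - c1[start]
        power = (c2[last + 1] - c2[start]) - total ** 2 / length
        jump = (s[start] - s[last]) ** 2
        slip = ((s[start + 1] - s[start]) - (s[last] - s[last - 1])) ** 2
        cost = np.full(len(start), np.inf)
        ok = power > 0                                # skip constant segments
        cost[ok] = (w_jump * jump[ok] + w_slip * slip[ok]) / power[ok]
        i = int(np.argmin(cost))
        if cost[i] < best[0]:
            best = (float(cost[i]), int(start[i]), length)
    return best

# Illustrative input (an assumption): an oscillation on top of a slow drift.
n = np.arange(220)
series = np.sin(2.0 * np.pi * n / 50.0) + 0.01 * n
print("full series:", mismatch(series))
print("best segment (cost, start, length):", best_segment(series, min_len=150))
```

In practice one would then generate surrogates from `series[start:start + length]` rather than from the full recording. A different weighting, or a preference for longer segments, may be more appropriate depending on how smooth the signal is.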

In practical situations, the matching of end points is a simple and mostly sufficient precaution that should not be neglected. Let us mention that the SOI data discussed before is rather well behaved with little end-to-end mismatch (tex2html_wrap_inline2076). Therefore we didn't have to worry about the periodicity artefact.

The only method proposed so far that strictly implements $C(\tau)$ rather than $C_p(\tau)$ is given in Ref. [26] and will be discussed in detail in Sec. 5 below. The method is very accurate but also rather costly in terms of computer time. It should be used in cases of doubt and whenever a suitable sub-sequence cannot be found.



Thomas Schreiber
Mon Aug 30 17:31:48 CEST 1999