Rainer Hegger | Holger Kantz | Thomas Schreiber
Exercise 2 using TISEAN Nonlinear Time Series
Routines
Exercises using TISEAN
Part II: Linear models and simple prediction
Download the data set
amplitude.dat to
your local directory for use in this exercise (Press the "Shift"-key
and the left mouse button).
Visual analysis of data, time scales, and correlations
- Inspect the time series visually, e.g. by gnuplot (amount of data, obvious
artefacts, typical time scales, qualitative behaviour on short times)
- Compute the autocorrelation function (corr)
- Which is a reasonable order for an AR-model?
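As a rough illustration of what an autocorrelation estimate looks like, here is a minimal Python sketch. It is not the corr program: the normalisation and the synthetic sine signal are assumptions made for this example, and the function names are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Normalised sample autocorrelation C(lag)/C(0) for lag = 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        acf.append(cov / var)
    return acf

def first_zero_crossing(acf):
    """Smallest lag at which the autocorrelation is no longer positive."""
    for lag, c in enumerate(acf):
        if c <= 0.0:
            return lag
    return None

# Synthetic stand-in for a measured series: a sine wave with period 32.
signal = [math.sin(2.0 * math.pi * t / 32.0) for t in range(2000)]
acf = autocorrelation(signal, 40)
print(first_zero_crossing(acf))   # close to a quarter of the period
```

The lag at which the correlations have decayed, or first cross zero, gives a time scale against which the order of an AR-model can be judged.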
Use ar-model to fit AR-models
to the data.
Study the residuals, i.e. the differences between the deterministic part
of the AR-model and the next observations. Inside gnuplot:
plot [0:1000] '< ar-model amplitude.dat -p10' u ($0+10):1, '< ar-model amplitude.dat -p50' u ($0+50):1
Plot the data also in reversed order
(since one curve partly hides the
other), and together with amplitude.dat.
Read the description of ar-model to understand what you
see in the plot, and reduce and increase
the order of the model (controlled by the
-p option) as far as your patience allows
you to go (the computation time increases quadratically in
p).
- Result: the residuals have pronounced spikes at certain points of the time
series even for very large order of the model.
This demonstrates that the data do not stem from a linear
stochastic process. Nonetheless, the magnitude of the residuals compared to the
amplitude of the signal is small. Hence,
if one wants to use a linear model,
p=10 is a reasonable compromise between
model complexity and performance.
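To make the notion of residuals concrete, the following Python sketch fits an AR(p) model and computes the one-step errors. It is an assumption-laden illustration, not the method of ar-model: the coefficients are obtained from the Yule-Walker equations by Levinson-Durbin recursion, the data are a synthetic AR(2) series, and all function names are hypothetical.

```python
import random

def autocovariance(x, max_lag):
    """Biased sample autocovariance r(0..max_lag) of the mean-subtracted series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return [sum(dev[i] * dev[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]

def fit_ar(x, p):
    """Solve the Yule-Walker equations for a_1..a_p by Levinson-Durbin recursion."""
    r = autocovariance(x, p)
    a = [0.0] * (p + 1)          # a[k] multiplies x[n-k]; a[0] is unused
    err = r[0]                   # prediction error variance at order 0
    for m in range(1, p + 1):
        reflection = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / err
        updated = a[:]
        updated[m] = reflection
        for k in range(1, m):
            updated[k] = a[k] - reflection * a[m - k]
        a = updated
        err *= 1.0 - reflection * reflection
    return a[1:]

def residuals(x, coeffs):
    """One-step errors x[n] - sum_k a_k x[n-k] on the mean-subtracted series."""
    p = len(coeffs)
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    return [dev[n] - sum(coeffs[k] * dev[n - 1 - k] for k in range(p))
            for n in range(p, len(dev))]

# Synthetic AR(2) test data (not the amplitude.dat measurements).
random.seed(1)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(1.5 * x[-1] - 0.8 * x[-2] + random.gauss(0.0, 1.0))
x = x[200:]                      # drop the initial transient

coeffs = fit_ar(x, 2)
res = residuals(x, coeffs)
res_var = sum(e * e for e in res) / len(res)
print(coeffs, res_var)
```

For data that really come from a linear stochastic process, as in this synthetic case, the residuals are structureless noise. Spikes in the residuals that survive at high order are the signature discussed above.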
- Now use ar-model to
produce a new time series:
ar-model -s5000 amplitude.dat -p10 -o
With the -s5000 option, the output in amplitude.dat.ar is now
the iterated model time series of length 5000.
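Iterating an AR-model means feeding its own output back in while adding fresh noise at every step. A minimal Python sketch, assuming Gaussian driving noise and hypothetical coefficients (the function name and its parameters are illustrative, not part of TISEAN):

```python
import random

def iterate_ar(coeffs, noise_std, length, seed=0, transient=500):
    """Generate `length` values of x[n] = sum_k a_k x[n-k] + noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    history = [0.0] * p                      # most recent value is last
    out = []
    for step in range(transient + length):
        value = sum(coeffs[k] * history[-1 - k] for k in range(p))
        value += rng.gauss(0.0, noise_std)
        history.append(value)
        history.pop(0)
        if step >= transient:                # skip the start-up transient
            out.append(value)
    return out

# Hypothetical AR(2) coefficients, chosen only so that the model is stable.
series = iterate_ar([1.5, -0.8], noise_std=1.0, length=5000)
mean = sum(series) / len(series)
print(len(series), mean)
```

Since the model has no constant term and the noise has zero mean, the generated series fluctuates around zero.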
- Compare the two time series in the time domain.
Also, compute the histograms using the
routine histogram:
mycomputer> histogram amplitude.dat -b0
Using amplitude.dat as datafile, reading column 1
Use 5000 lines.
Writing to stdout
#interval of data: [-1.463000e+01:1.727000e+01]
#average= 1.463300e-01
#standard deviation= 7.994755e+00
The ar-data have zero mean by construction. If you wish to superimpose
the two histograms, you thus should shift the one with respect to the
other by the mean value of the data:
set data style histep
plot '< histogram amplitude.dat' u ($1-.146):2,'< histogram ar.dat'
Result: The data sets are different: the distribution of ar.dat is closer to a Gaussian (and converges to a
Gaussian for longer time series; try plot '< ar-run -l100000 amplitude.dat.ar | histogram').
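The shift of one histogram by the mean of the data can be sketched in Python as follows. This is only an illustration: the binning convention is an assumption and need not match the histogram routine, and the data are synthetic Gaussian numbers with a nonzero mean.

```python
import random

def histogram(values, bins):
    """Return bin centers and relative frequencies over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)   # maximum goes into the last bin
        counts[index] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    freqs = [c / len(values) for c in counts]
    return centers, freqs

rng = random.Random(2)
measured = [rng.gauss(2.0, 1.0) for _ in range(5000)]   # synthetic, nonzero mean
mean = sum(measured) / len(measured)

centers, freqs = histogram(measured, 50)
shifted_centers = [c - mean for c in centers]           # same role as ($1-.146) in gnuplot
weighted_mean = sum(c * f for c, f in zip(shifted_centers, freqs))
print(weighted_mean)                                    # close to zero after the shift
```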
- Compute the auto-correlation functions and the power
spectra (by either mem_spec or
spectrum) of both of them:
corr amplitude.dat -D500 -o
corr ar.dat -D500 -o
set data style lines
plot 'ar.dat.cor','amplitude.dat.cor'
spectrum amplitude.dat -o
spectrum ar.dat -o
set logscale y
plot 'amplitude.dat_sp','ar.dat_sp'
Result: The AR-data contain the same temporal correlations, but they
decay much faster than in amplitude.dat.
The spectra have to be compared with both linear and logarithmic
y-scale. The frequency around 0.03 is dominant in both data sets; its
harmonics, visible in amplitude.dat_sp,
are suppressed in ar.dat_sp. This reflects
that the AR-model contains the relevant time scales, but has shortcomings
in a quantitative comparison. However, these are not too dramatic when
only viewed with second order statistics. The differences will be more
evident in the higher order correlations and other nonlinear concepts.
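Why both a linear and a logarithmic y-scale are needed can be seen from a small periodogram example in Python. It is a sketch under assumptions: a plain discrete Fourier transform without windowing or averaging (which the spectrum routine may well do differently), applied to a synthetic signal with a fundamental near 0.03 and a weak second harmonic.

```python
import cmath
import math

def periodogram(x):
    """Power |DFT|^2 / n at the frequencies k/n, k = 1..n/2, after mean removal."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        s = sum(dev[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k / n)
        power.append(abs(s) ** 2 / n)
    return freqs, power

# Synthetic signal: fundamental at 8/256 = 0.03125 plus a weak second harmonic.
n = 256
signal = [math.sin(2.0 * math.pi * 8 * t / n) + 0.2 * math.sin(2.0 * math.pi * 16 * t / n)
          for t in range(n)]
freqs, power = periodogram(signal)

peak = freqs[power.index(max(power))]
print(peak, power[7], power[15])   # fundamental and second-harmonic power
```

The harmonic carries only a few percent of the power of the fundamental, so it is hard to see on a linear scale and obvious on a logarithmic one.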
- Repeat the exercise starting from the ar-data you generated (file
ar.dat). You should observe that fitting an AR-model to AR-data
yields residuals with a Gaussian distribution, and that the
histograms, auto-correlation
functions and power spectra of the model data are identical to those
of the input data, if the order of the fit (-p) is not smaller than the order of the model by
which the data were produced.
Embedding and time lags
- Visualize both amplitude.dat and ar.dat in a delay embedding, using
delay (do not forget to reset gnuplot first, e.g., set nologs).
Start with -d1 and increase it, at least up
to 50.
What is optimal a) by visual impression, and what should be
optimal b) when considering the auto-correlation function?
Answers:
amplitude.dat: a)
About 8, when unfolding is good but overlap is still small.
b) about 8: the first zero of the autocorrelation function
would be optimal for a harmonic, periodic signal embedded in 2
dimensions.
ar.dat: a) for delay 8, the shape of the blob of lines comes close to
circular, hence indicating sufficient decorrelation of the components
of the delay vectors. b) The auto-correlation function yields about
the same as for amplitude.dat.
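The statement about a harmonic signal can be checked with a few lines of Python. This is a sketch with a synthetic sine of period 32; the function name and the ordering of the components are assumptions and need not agree with the output of delay.

```python
import math

def delay_vectors(x, dim, delay):
    """Delay vectors (x[n], x[n-delay], ..., x[n-(dim-1)*delay])."""
    start = (dim - 1) * delay
    return [tuple(x[n - j * delay] for j in range(dim))
            for n in range(start, len(x))]

# Sine with period 32: the first zero of its autocorrelation is at lag 8.
signal = [math.sin(2.0 * math.pi * t / 32.0) for t in range(200)]

points = delay_vectors(signal, 2, 8)
radii = [math.hypot(a, b) for a, b in points]
print(min(radii), max(radii))      # both close to 1: the points lie on a circle

narrow = delay_vectors(signal, 2, 1)
spread = max(abs(a - b) for a, b in narrow)
print(spread)                      # small: delay 1 leaves the points near the diagonal
```

With a delay of a quarter period the two components are a sine and a cosine, so the embedding is a circle. For small delays the components are almost equal and the picture collapses onto the diagonal.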
Determinism and predictability
- Compute the false nearest neighbour statistics
(false_nearest):
false_nearest amplitude.dat -M8 -d8 -o -t200 -f5
Study the output, amplitude.dat.fnn, and
observe the invariance of the result (namely that the embedding dimension
3 is insufficient but 4 is o.k.) under change of the time lag.
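The idea behind the false nearest neighbour statistics can be sketched in Python. The code below is a simplified illustration, not the algorithm of false_nearest: it uses a brute-force neighbour search, the maximum norm, and no normalisation by the size of the data. The parameters ratio and theiler only play roles loosely analogous to -f and -t, and the Hénon map serves as synthetic test data.

```python
def henon_series(length, transient=100):
    """x-component of the Henon map, x[n+1] = 1 - 1.4 x[n]^2 + 0.3 x[n-1]."""
    x_prev, x = 0.1, 0.1
    out = []
    for step in range(transient + length):
        x_prev, x = x, 1.0 - 1.4 * x * x + 0.3 * x_prev
        if step >= transient:
            out.append(x)
    return out

def false_nearest_fraction(x, dim, delay, ratio=5.0, theiler=0):
    """Fraction of nearest neighbours in `dim` dimensions that separate by more
    than `ratio` times their distance when one more delay coordinate is added."""
    n_vec = len(x) - dim * delay             # the extra coordinate must exist
    vecs = [[x[i + j * delay] for j in range(dim)] for i in range(n_vec)]
    false_count = 0
    total = 0
    for i in range(n_vec):
        best_j, best_d = None, float("inf")
        for j in range(n_vec):
            if abs(i - j) <= theiler:        # skip the point itself and close-in-time points
                continue
            d = max(abs(a - b) for a, b in zip(vecs[i], vecs[j]))
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None or best_d == 0.0:
            continue
        total += 1
        extra = abs(x[i + dim * delay] - x[best_j + dim * delay])
        if extra / best_d > ratio:
            false_count += 1
    return false_count / total

series = henon_series(400)
frac_dim1 = false_nearest_fraction(series, dim=1, delay=1)
frac_dim2 = false_nearest_fraction(series, dim=2, delay=1)
print(frac_dim1, frac_dim2)
```

For the Hénon map two delay coordinates determine the next value, so the fraction should drop sharply from dimension 1 to dimension 2. For amplitude.dat the analogous drop is the one between dimensions 3 and 4.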
- Use the zeroth-order predictor
(zeroth)
on amplitude.dat and on ar.dat:
zeroth amplitude.dat -m1,4 -d8 -o -s250
zeroth ar.dat -m1,4 -d8 -o -s250
plot [][0:1.5] 'amplitude.dat.zer','ar.dat.zer',.05*exp(.02*x)
You should be able to verify the following observations:
For increasing prediction horizon, the prediction errors of amplitude.dat
show two regimes: an exponential increase of the error due to chaos
(the regime of nonlinear deterministic dynamics), followed by a slow linear
increase due to loss of phase locking (the regime of linear
correlations due to the rather constant period of the oscillations).
The error becomes constant
when the predictions lose all correlations to the actual
values: in this limit of unpredictability, reached only for a prediction horizon of more
time steps than can be computed with this data set,
the relative prediction error saturates at 1. In order to
arrive at prediction horizons larger than one half of the data set,
you must switch off the causality window by the -C0 option in zeroth.
No successful prediction for ar.dat beyond the linear correlations.
Since ar.dat is a linear stochastic data set,
it does not contain phase space information.
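A locally constant (zeroth-order) predictor can be sketched in Python as follows. The sketch makes several assumptions that differ from zeroth: it averages over a fixed number k of nearest neighbours instead of a neighbourhood of given size, it excludes neighbours closer in time than the largest horizon as a crude stand-in for the causality window, and it uses the Hénon map and white noise as synthetic data. All names are hypothetical.

```python
import math
import random

def henon_series(length, transient=100):
    """x-component of the Henon map, x[n+1] = 1 - 1.4 x[n]^2 + 0.3 x[n-1]."""
    x_prev, x = 0.1, 0.1
    out = []
    for step in range(transient + length):
        x_prev, x = x, 1.0 - 1.4 * x * x + 0.3 * x_prev
        if step >= transient:
            out.append(x)
    return out

def zeroth_order_errors(x, dim, delay, max_steps, k=5):
    """Relative RMS forecast error for horizons 1..max_steps."""
    n = len(x)
    idx = range((dim - 1) * delay, n - max_steps)     # x[i + max_steps] must exist
    vecs = {i: [x[i - j * delay] for j in range(dim)] for i in idx}
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    sq_err = [0.0] * max_steps
    for i in idx:
        # Candidate neighbours, excluding points close in time to i.
        cands = sorted((max(abs(a - b) for a, b in zip(vecs[i], vecs[j])), j)
                       for j in idx if abs(i - j) > max_steps)
        neigh = [j for _, j in cands[:k]]
        for s in range(1, max_steps + 1):
            pred = sum(x[j + s] for j in neigh) / len(neigh)   # average of neighbours' futures
            sq_err[s - 1] += (pred - x[i + s]) ** 2
    return [math.sqrt(e / len(idx) / var) for e in sq_err]

det_err = zeroth_order_errors(henon_series(500), dim=2, delay=1, max_steps=10)

rng = random.Random(3)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
noise_err = zeroth_order_errors(noise, dim=2, delay=1, max_steps=10)

print(det_err[0], det_err[-1], noise_err[0])
```

For the deterministic series the error starts small and grows with the horizon. For white noise, which has no correlations at all, the relative error is of order 1 from the first step on.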