I originally planned to write a long post about signals and noise, but I guess it is better to focus on tiny details and write a book later. The article New Tools for Analyzing Time Series Relationships and Trends by J. C. Moore, A. Grinsted, and S. Jevrejeva (Eos, Vol. 86, No. 24, 14 June 2005) got some attention on David Stockwell’s blog, http://landshape.org/enm/rahmstorf-et-al-2007-ipcc-error/ . It is a very interesting article, but I’m afraid there’s something wrong with statements such as
A wise choice of embedding dimension can be made with a priori insight or perhaps more commonly may be found by simply playing with the data.
In particular, Figure 3 of that article caught my eye:
Original caption: Fig. 3. Nonlinear and linear trends in time series of mean sea level at Brest, France, for an embedding dimension equivalent to 30 years and an individual measurement standard error of 10 mm. The 95% confidence interval for the nonlinear fit is shaded and marked by the curved lines for the linear fit.
I found the data set at http://www.pol.ac.uk/psmsl/pubi/rlr.annual.data/190091.rlrdata ; the only problem is that this one has some missing values. If anyone finds the full series, as used in Fig. 3, please let me know. Even so, I can replicate the figure quite closely. Linear trend:
...and ssatrend with a 30-year embedding dimension:
With a 10 mm measurement error, ssatrend outputs an error of approx. 3 mm at the endpoints and 1.5 mm in the middle. As you can see from the original figure, these errors do indeed seem to be smaller than the linear trend errors. Moore et al.:
The confidence interval of the nonlinear trend is usually much smaller than for a least squares fit, as the data are not forced to fit any specified set of basis functions.
But there’s one problem. When I got a good match with the linear trend confidence limits, I used the residuals to estimate the noise variance: residual sum of squares divided by the degrees of freedom, you know that stuff. I then simply took that as a good estimate of the true variance, assuming additive iid Gaussian noise over a trend. That’s how I got the match. If I had used the 10 mm measurement noise instead, the confidence limits would be much narrower:
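To make the two choices of noise variance concrete, here is a minimal sketch in Python with NumPy. It is not the code behind the figures: the series is synthetic (a made-up trend of 1.4 mm/year with 40 mm iid Gaussian scatter, not the Brest record), `linear_trend_ci` is a hypothetical helper, and the 95% limits use the normal quantile 1.96 rather than a Student t quantile.

```python
import numpy as np

def linear_trend_ci(t, y, sigma=None, z=1.96):
    """Least squares straight line with pointwise confidence half-widths.

    sigma=None : estimate the noise std from the residuals, RSS / (n - 2).
    sigma=x    : treat x as the known noise std (e.g. a measurement error).
    z=1.96 is the normal approximation to the 95% quantile.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(t), t - t.mean()])  # intercept, centred time
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit = X @ beta
    if sigma is None:
        dof = len(y) - X.shape[1]                          # n minus 2 parameters
        sigma = np.sqrt(np.sum((y - fit) ** 2) / dof)
    cov = sigma ** 2 * np.linalg.inv(X.T @ X)              # covariance of beta
    se = np.sqrt(np.einsum("ij,jk,ik->i", X, cov, X))      # std of the fitted line
    return fit, z * se, sigma

# Synthetic annual series: all numbers here are assumptions for illustration.
rng = np.random.default_rng(0)
t = np.arange(1900, 2000)
y = 7000.0 + 1.4 * (t - 1900) + rng.normal(0.0, 40.0, t.size)

fit, half_resid, sigma_hat = linear_trend_ci(t, y)         # residual-based limits
_, half_10mm, _ = linear_trend_ci(t, y, sigma=10.0)        # 10 mm measurement noise

print(f"estimated noise std: {sigma_hat:.1f} mm")
print(f"half-width at the ends, residual-based: {half_resid.max():.2f} mm")
print(f"half-width at the ends, 10 mm noise:    {half_10mm.max():.2f} mm")
```

Whenever the scatter around the line is larger than the assumed measurement error, the residual-based limits come out wider by the ratio of the two standard deviations; the shape of the band is the same in both cases.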
Residual-based limits are in black, limits based on the 10 mm measurement noise in red. The latter are actually narrower than those of the ‘non-linear trend’! That is what I originally expected: if you have an observation
y = s + n (signal s plus noise n),
and you apply a linear filter F,
F(y) = F(s) + F(n),
and you define noise as F(y)-F(s), then the more smoothing the better. In Wiener filtering, for example, the aim is to find the F that minimizes the mean square of F(y)-s. In a linear least squares fit, F(s)=s as long as s follows the assumed model. In these climate papers, the difference between F(s) and s does not seem to be of interest. See also my CA comment. But this is something I’ll talk about later; for now I’d like to solve this Figure 3 issue.
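The filtering argument can be checked numerically. The sketch below assumes an observation y = s + n with iid noise n of standard deviation 10 mm, and writes a linear filter as a matrix F, so that the std of each element of F(y)-F(s) = F(n) is sigma times the root of the sum of squared weights in the corresponding row. The 31-point moving average is only a stand-in for a smoother with roughly 30-year memory; it is not ssatrend, and all function names are made up for this illustration.

```python
import numpy as np

def filtered_noise_std(F, sigma):
    """Std of each element of F @ n for iid noise n with std sigma."""
    return sigma * np.sqrt(np.sum(F ** 2, axis=1))

def moving_average_matrix(n, width):
    """Centred moving average as a matrix; the window is truncated at the edges."""
    F = np.zeros((n, n))
    half = width // 2
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        F[i, lo:hi] = 1.0 / (hi - lo)
    return F

def linear_fit_matrix(n):
    """Hat matrix of a straight-line least squares fit on n equally spaced points."""
    t = np.arange(n, dtype=float)
    X = np.column_stack([np.ones(n), t - t.mean()])
    return X @ np.linalg.inv(X.T @ X) @ X.T

n, sigma = 100, 10.0                       # 100 annual values, 10 mm noise (assumed)
F_ma = moving_average_matrix(n, 31)
F_lin = linear_fit_matrix(n)

std_ma = filtered_noise_std(F_ma, sigma)   # noise left after the moving average
std_lin = filtered_noise_std(F_lin, sigma) # noise left after the straight-line fit

# A straight-line signal passes through the least squares fit unchanged: F(s) = s.
s = 7000.0 + 1.4 * np.arange(n)
line_preserved = np.allclose(F_lin @ s, s)

print(f"moving average: {std_ma[n // 2]:.2f} mm (middle), {std_ma[0]:.2f} mm (end)")
print(f"linear fit:     {std_lin[n // 2]:.2f} mm (middle), {std_lin[0]:.2f} mm (end)")
print("F(s) = s for the linear fit:", line_preserved)
```

With noise defined this way, the straight line is the heavier smoother and leaves less noise at every point than the 31-point average. The comparison says nothing about F(s)-s, which is the part that depends on whether the signal really is a straight line.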