Most time series encountered are not stationary.
Differencing The difference operator $\nabla$ is defined as $\nabla = 1 - B$, or, in other words, $(\nabla X)_t = X_t - X_{t-1}$.
Differencing can remove or reduce deterministic trends. In some cases
the investigator
may wish to difference again.
Differencing forms an integral part of the Box-Jenkins methodology.
If the $d$th difference of $X$, $\nabla^d X$, is an ARMA($p$,$q$) process, then $X$ is said to be an ARIMA($p$,$d$,$q$) process.
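The difference operator can be sketched in a few lines of Python. The function name `difference` and its `lag` and `d` arguments are illustrative choices, not part of any package; `lag=1` corresponds to $\nabla = 1 - B$, and repeating the operation `d` times gives $\nabla^d$.

```python
def difference(x, lag=1, d=1):
    """Apply the operator (1 - B^lag) to the sequence x, d times over."""
    out = list(x)
    for _ in range(d):
        # Each pass shortens the series by `lag` observations.
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out


# A quadratic trend needs two rounds of differencing to become constant.
series = [t * t for t in range(6)]      # 0, 1, 4, 9, 16, 25
first = difference(series)              # first differences
second = difference(series, d=2)        # second differences
```

Here the first differences still increase linearly, while the second differences are constant, which is the sense in which the investigator "may wish to difference again".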
If there is reason to believe that the underlying, deterministic trend takes a particular form, this can be regarded as a time-varying mean and subtracted from the process, to leave a random component which should be analysed separately.
Seasonal means If the seasonal effect is fairly regular from year to year, a separate mean is estimated for each 'season'. The seasonal means are then subtracted from the observations and the residuals analysed as usual.
Seasonal differencing Slightly less stable than the method of seasonal means, but fits into the Box-Jenkins schema. The seasonal difference operator for monthly data is $\nabla_{12} = 1 - B^{12}$.
The Box-Jenkins approach treats the reduction to stationarity as an integral part of the model fitting technique. MINITAB has the capability to carry out the extended model-fitting quite seamlessly.
Method of Moving Averages Replace Xt with a smoothed version of Xt which takes account of seasonal variation. For example, for quarterly data we might use
$$\frac{X_{t-2} + 2X_{t-1} + 2X_t + 2X_{t+1} + X_{t+2}}{8}$$
This method works well at eliminating seasonal variation, but of course it also smoothes out much of the underlying Time Series variation which we are hoping to detect. Different smoothing functions are sometimes used in an attempt to combat this effect.
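Two of the seasonal adjustments described above, seasonal means and the quarterly moving average, can be sketched as follows. The function names and the artificial quarterly series are assumptions made purely for illustration; the series is a linear trend plus a seasonal pattern that sums to zero over a year.

```python
def remove_seasonal_means(x, period):
    """Estimate one mean per season and subtract it from each observation."""
    means = []
    for s in range(period):
        season = x[s::period]               # every observation in season s
        means.append(sum(season) / len(season))
    residuals = [x[t] - means[t % period] for t in range(len(x))]
    return means, residuals


def centred_quarterly_average(x):
    """Smooth with weights (1, 2, 2, 2, 1)/8; two points are lost at each end."""
    weights = [1, 2, 2, 2, 1]
    smoothed = []
    for t in range(2, len(x) - 2):
        window = x[t - 2:t + 3]
        smoothed.append(sum(w * v for w, v in zip(weights, window)) / 8)
    return smoothed


# Hypothetical data: trend t plus a quarterly effect that sums to zero.
season = [3, -1, -4, 2]
series = [t + season[t % 4] for t in range(12)]

means, residuals = remove_seasonal_means(series, period=4)
smoothed = centred_quarterly_average(series)
```

On this series the moving average returns the linear trend exactly, because the weights give each quarter a total weight of one quarter. Subtracting seasonal means, by contrast, leaves the trend in the residuals, so it is only appropriate on its own when no trend is present.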
Data transformations
We always assume that the process of
innovations has constant variance $\sigma^2$. If
this is not the case (for example, if
$\mathrm{Var}(e_t)$ seems to depend on the fitted value) then a
variance-stabilising transformation may be in order.
A common choice for such a transformation is the log function.
Variance-stabilisation is not the only reason for transformation. If the data look very non-normal (skewed), a transformation may bring them closer to normality.
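A small sketch of why the log function stabilises variance when fluctuations are proportional to the level. The numbers are invented for illustration: each level is observed 10% below and 10% above its value.

```python
import math

# Hypothetical levels, each observed 10% low and 10% high.
levels = [10.0, 100.0, 1000.0]
pairs = [(0.9 * level, 1.1 * level) for level in levels]

# On the raw scale the spread grows in proportion to the level ...
raw_spread = [high - low for low, high in pairs]

# ... but on the log scale it is the same at every level, log(1.1 / 0.9).
log_spread = [math.log(high) - math.log(low) for low, high in pairs]
```

The raw spreads differ by factors of ten, whereas the log spreads agree up to rounding error. Real data are rarely this tidy, so a plot of residuals against fitted values after transformation is still worth checking.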
Plot SACF, SPACF to see if a MA(q) or AR(p) model will fit. MINITAB draws dotted red lines to indicate which values appear to be significantly different from 0. If neither a pure MA nor a pure AR fits the bill, use ARMA(1,1).
Look at the series of residuals. Try a Normal probability plot, plot residuals against fitted values, find the SACF, SPACF of the residuals. If there is evidence that they are not all i.i.d. Normal$(0, \sigma^2)$, add a parameter to the model and try again.
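The sample autocorrelation function used in both steps above can be computed directly. This is a minimal sketch: the function names are illustrative, and the bound of $\pm 2/\sqrt{n}$ is the usual approximate 95% limit under white noise, assumed here rather than taken from MINITAB's documentation.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag (divisor n throughout)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def significant_lags(x, max_lag):
    """Lags whose sample autocorrelation lies outside +/- 2/sqrt(n)."""
    bound = 2 / math.sqrt(len(x))
    acf = sample_acf(x, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]


# A strictly alternating series is strongly autocorrelated at every lag.
x = [1.0, -1.0] * 50
r = sample_acf(x, 2)
flagged = significant_lags(x, 2)
```

Applied to residuals from a fitted model, an empty list from `significant_lags` is consistent with (though not proof of) uncorrelated residuals.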
$$-2 \log L_{\max} + 2\,(\text{no. of parameters}) = n \log(\mathrm{SSR}) + 2(p + q) + \text{const.}$$
This criterion only operates well when the data can be assumed Normal, and even then it tends to permit too many parameters. Akaike has also introduced the Bayesian Information Criterion, which is
$$-2 \log L_{\max} + [1 + \log(n)]\,(\text{no. of parameters})$$
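Both criteria are easy to evaluate once a model's residual sum of squares is known. The sketch below uses the identity $-2 \log L_{\max} = n \log(\mathrm{SSR}) + \text{const.}$ from the first formula for both criteria and drops the constant, which is harmless when comparing models fitted to the same data. The function names and the SSR values are hypothetical.

```python
import math


def aic(ssr, n, p, q):
    """n log(SSR) + 2(p + q), with the additive constant dropped."""
    return n * math.log(ssr) + 2 * (p + q)


def bic(ssr, n, p, q):
    """n log(SSR) + [1 + log(n)](p + q), with the additive constant dropped."""
    return n * math.log(ssr) + (1 + math.log(n)) * (p + q)


# Hypothetical fits to n = 50 observations: the extra MA parameter
# barely reduces the SSR, so both criteria prefer the smaller model.
n = 50
ar1 = {"ssr": 100.0, "p": 1, "q": 0}
arma11 = {"ssr": 99.9, "p": 1, "q": 1}

prefer_ar1_by_aic = aic(n=n, **ar1) < aic(n=n, **arma11)
prefer_ar1_by_bic = bic(n=n, **ar1) < bic(n=n, **arma11)
```

The penalty per parameter is 2 for the first criterion and $1 + \log(n)$ for the second, so the second is the stricter of the two whenever $n > e$.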
$$\sum_k c_k x_{t-k}$$
$\alpha = 0.2$ is often used. The smoothing parameter may be estimated, but small variations tend not to make much difference.
Exponential smoothing is optimal if X is ARIMA(0,1,1). More advanced versions exist to cope with trends and seasonal variation.
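A minimal sketch of simple exponential smoothing, assuming the standard geometric weights $c_k = \alpha(1-\alpha)^k$, which allow the weighted sum to be updated recursively. Starting the recursion at the first observation is a common convention adopted here as an assumption; the function name is illustrative.

```python
def exponential_smoothing(x, alpha=0.2):
    """Return the smoothed level after the last observation in x.

    Uses the recursion level_t = alpha * x_t + (1 - alpha) * level_{t-1},
    which is equivalent to weighting x_{t-k} by alpha * (1 - alpha)**k
    (apart from the weight carried by the starting value).
    """
    level = x[0]                       # assumed starting value
    for value in x[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# The smoothed level is used as the forecast of the next observation.
forecast = exponential_smoothing([0.0, 10.0], alpha=0.2)
```

With $\alpha = 0.2$ a jump from 0 to 10 moves the forecast only a fifth of the way, which illustrates how a small $\alpha$ damps the response to individual observations.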
The time series are measuring the same quantity: for example, where aircraft noise meters are set up at a number of locations. In this case we expect high correlation between the series.
Alternatively, they could all depend on some fundamental underlying quantity. Thus different forms of investment strategy will depend on the base lending rate.
The purpose of the investigation may be to uncover a causal relationship between two or more time series: one may be driving the other, possibly with a lag of a few time periods.
In practice most econometric time series are not I(0) (stationary), but
I(1) (integrated of order one: the sequence of first differences is stationary).
Investigating two I(1) processes using standard methods (see later) can give
unreliable results. The exception is when they are quite closely related, in the sense that
there is a stationary process $Z_t$ given by
$$Z_t = uX_t + vY_t$$
for some constants $u$ and $v$.
In this case X and
Y are said to be cointegrated.
To test whether X and
Y are cointegrated use ordinary least squares to find
a and b such that
Y = aX + b, then analyse
the residuals: if they are stationary, then it is reasonable to suppose that
X and Y are cointegrated.
When two series are cointegrated, the values of either one can be used to
forecast the future of both.
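The regression step of the cointegration check described above can be sketched as follows. The function names and the simulated series are assumptions for illustration. The sketch stops at the residuals: deciding whether they are stationary is left to the usual tools (a time plot, the SACF, or a formal unit-root test), none of which is implemented here.

```python
import random


def ols_line(x, y):
    """Least-squares estimates of a and b in y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def regression_residuals(x, y):
    """Residuals y - (a*x + b) from the least-squares line."""
    a, b = ols_line(x, y)
    return [v - (a * u + b) for u, v in zip(x, y)]


# Hypothetical example: X is a random walk (so I(1)), and Y follows
# 2X + 1 with stationary noise, so the pair is cointegrated by design.
random.seed(0)
x = [0.0]
for _ in range(199):
    x.append(x[-1] + random.gauss(0, 1))
y = [2 * u + 1 + random.gauss(0, 0.5) for u in x]

a, b = ols_line(x, y)
res = regression_residuals(x, y)   # these are the values to test for stationarity
```

In terms of the definition above, the residual series corresponds to $Z_t = uX_t + vY_t$ with $u = -a$ and $v = 1$, up to the constant $b$.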
A VAR(1) process has the form $X_t = A X_{t-1} + E_t$, where $A$ is a matrix and the other terms are random vectors. The $E_t$ are assumed to be independent Normal random vectors with an unchanging variance-covariance matrix. For convenience we assume that the expectation of $X_t$ is zero.
Here $E[X_t X_{t-1}^T] = A \Sigma_X$, where $\Sigma_X = E[X_t X_t^T]$, and one may deduce that
$$E[X_t X_{t-k}^T] = A^k \Sigma_X.$$
A may be estimated by means of the lag-1 cross-correlation function.
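Since $E[X_t X_{t-1}^T] = A \Sigma_X$, one natural estimate of $A$ replaces both expectations by sample averages and solves for $A$. The sketch below does this for a two-dimensional series using plain lists; the function names, the matrix `A_true`, and the unit-variance noise are all assumptions made for the illustration.

```python
import random


def simulate_var1(A, n, seed=0):
    """Simulate n steps of a 2-dimensional VAR(1) with standard Normal noise."""
    rng = random.Random(seed)
    x = [[0.0, 0.0]]
    for _ in range(n - 1):
        prev = x[-1]
        x.append([A[i][0] * prev[0] + A[i][1] * prev[1] + rng.gauss(0, 1)
                  for i in range(2)])
    return x


def estimate_var1(x):
    """Estimate A as (sample lag-1 cross-moment) times (sample moment)^(-1)."""
    n = len(x)
    # s estimates E[X_{t-1} X_{t-1}^T]; g estimates E[X_t X_{t-1}^T] (zero mean assumed).
    s = [[sum(x[t][i] * x[t][j] for t in range(n - 1)) / (n - 1)
          for j in range(2)] for i in range(2)]
    g = [[sum(x[t][i] * x[t - 1][j] for t in range(1, n)) / (n - 1)
          for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det, -s[0][1] / det],
             [-s[1][0] / det, s[0][0] / det]]
    return [[sum(g[i][k] * s_inv[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]


A_true = [[0.5, 0.2], [0.0, 0.3]]          # hypothetical, stationary
A_hat = estimate_var1(simulate_var1(A_true, n=5000))
```

With a long simulated series the entries of `A_hat` should lie close to those of `A_true`; for real data the series would first be mean-corrected, in line with the zero-mean assumption.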
General VARMA processes may be treated similarly, although Moving Average components tend to make life more difficult. If VARIMA is sought, note that some of the components of the vector Xt may need to be differenced a different number of times than some of the others.
The frequency domain approach aims to find periodicities hidden in the data and to use them to predict future fluctuations. Frequency domain analysis predates the time domain.
(Wiener-Hinchin Theorem): if X is stationary, then there is an increasing function F such that
$$\gamma_k = \int_0^\pi \cos k\omega \, dF(\omega).$$
If $X$ has no deterministic component then $F$ is continuous, so differentiable: $f = dF/d\omega$ is the spectral density function, or spectrum of $X$.
We have
$$f(\omega) = \frac{1}{\pi}\left(\gamma_0 + \sum_{k \ge 1} 2\gamma_k \cos \omega k\right)$$
Spectral analysis involves using the data to produce an estimate for the spectrum of X, and from this deducing properties of X.
The obvious estimator to use is
$$I(\omega) = \frac{1}{\pi}\left(c_0 + \sum_{1 \le k \le N-1} 2 c_k \cos \omega k\right)$$
Unfortunately this estimator is inconsistent. There are a variety of ways of smoothing the estimator to try to produce something more useful.
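The raw estimator $I(\omega)$ can be computed directly from the sample autocovariances $c_k$. The sketch below assumes the $c_k$ are computed with divisor $N$; the function names and the test series (a cosine with period 8) are illustrative. No smoothing is applied, so this is the inconsistent estimator itself.

```python
import math


def autocovariances(x):
    """Sample autocovariances c_0, ..., c_{N-1} with divisor N."""
    n = len(x)
    mean = sum(x) / n
    return [sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
            for k in range(n)]


def periodogram(x, omega):
    """I(omega) = (c_0 + 2 * sum_{k=1}^{N-1} c_k cos(omega k)) / pi."""
    c = autocovariances(x)
    total = c[0] + 2 * sum(c[k] * math.cos(omega * k) for k in range(1, len(c)))
    return total / math.pi


# A pure cosine with period 8, i.e. angular frequency pi/4.
x = [math.cos(math.pi * t / 4) for t in range(64)]
at_peak = periodogram(x, math.pi / 4)     # large: matches the hidden period
off_peak = periodogram(x, math.pi / 2)    # essentially zero
```

The estimate is concentrated at the frequency of the hidden periodicity, which is the behaviour the frequency domain approach relies on.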
The spectrum of the innovations process is $\sigma^2/\pi$.
A fundamental result is that, if $X = \psi(B)Y$ (where $B$ is the backshift operator) then
$$f_X(\omega) = \psi(e^{i\omega})\,\psi(e^{-i\omega})\,f_Y(\omega).$$
Therefore the spectral density of a MA is
$$f(\omega) = \sigma^2\,\phi(e^{i\omega})\,\phi(e^{-i\omega})/\pi$$
and of an AR is
$$f(\omega) = \frac{\sigma^2}{\pi\,\theta(e^{i\omega})\,\theta(e^{-i\omega})}$$
ARMA processes can be handled similarly.
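As a check on the MA formula, the sketch below evaluates the spectrum of an MA(1) process in two ways: from the polynomial evaluated at $e^{\pm i\omega}$, and from the autocovariances via $f(\omega) = (\gamma_0 + 2\gamma_1 \cos\omega)/\pi$. It assumes the process is written $X_t = \varepsilon_t + \beta\varepsilon_{t-1}$, so that the MA polynomial is $1 + \beta z$, $\gamma_0 = \sigma^2(1 + \beta^2)$ and $\gamma_1 = \sigma^2\beta$; the symbol $\beta$ and the function names are introduced here for illustration.

```python
import cmath
import math


def ma1_spectrum_from_polynomial(beta, sigma2, omega):
    """sigma^2 * poly(e^{i omega}) * poly(e^{-i omega}) / pi, poly(z) = 1 + beta z."""
    forward = 1 + beta * cmath.exp(1j * omega)
    backward = 1 + beta * cmath.exp(-1j * omega)
    return (sigma2 * forward * backward / math.pi).real


def ma1_spectrum_from_acvf(beta, sigma2, omega):
    """(gamma_0 + 2 gamma_1 cos(omega)) / pi for the same MA(1) process."""
    gamma0 = sigma2 * (1 + beta ** 2)
    gamma1 = sigma2 * beta
    return (gamma0 + 2 * gamma1 * math.cos(omega)) / math.pi


# The two expressions agree at any frequency in [0, pi].
omegas = [0.0, 0.5, 1.0, 2.0, math.pi]
pairs = [(ma1_spectrum_from_polynomial(0.6, 2.0, w),
          ma1_spectrum_from_acvf(0.6, 2.0, w)) for w in omegas]
```

Setting $\beta = 0$ recovers the flat spectrum $\sigma^2/\pi$ of the innovations process.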
An algorithm called the Fast Fourier Transform enables the ACF to be calculated rapidly from the spectral density. In some cases this may be the quickest way of finding the ACF.