The STATESPACE Procedure

# Overview

The STATESPACE procedure analyzes and forecasts multivariate time series using the state space model. The STATESPACE procedure is appropriate for jointly forecasting several related time series that have dynamic interactions. By taking into account the autocorrelations among the whole set of variables, the STATESPACE procedure may give better forecasts than methods that model each series separately.

By default, the STATESPACE procedure automatically selects a state space model appropriate for the time series, making the procedure a good tool for automatic forecasting of multivariate time series. Alternatively, you can specify the state space model by giving the form of the state vector and the state transition and innovation matrices.

The methods used by the STATESPACE procedure assume that the time series are jointly stationary. Nonstationary series must be made stationary by some preliminary transformation, usually by differencing. The STATESPACE procedure allows you to specify differencing of the input data. When differencing is specified, the STATESPACE procedure automatically integrates forecasts of the differenced series to produce forecasts of the original series.
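The differencing and integration steps described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the procedure's implementation; the function names and the forecast values for the differenced series are hypothetical.

```python
def difference(series):
    """First difference: d[t] = series[t + 1] - series[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def integrate_forecasts(last_level, diff_forecasts):
    """Convert forecasts of the differenced series into forecasts of the
    original series by cumulative summation from the last observed level."""
    levels = []
    level = last_level
    for d in diff_forecasts:
        level += d
        levels.append(level)
    return levels


y = [10.0, 12.0, 15.0, 19.0]        # original (nonstationary) series
d = difference(y)                   # differenced series: [2.0, 3.0, 4.0]

# Hypothetical forecasts of the differenced series (made-up values).
diff_forecasts = [4.5, 5.0]

# Forecasts on the original scale: 19 + 4.5, then 23.5 + 5.0.
level_forecasts = integrate_forecasts(y[-1], diff_forecasts)
```

The model is fit to `d`, while the user receives `level_forecasts`, which are on the scale of the original data.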

### The State Space Model

The state space model represents a multivariate time series through auxiliary variables, some of which may not be directly observable. These auxiliary variables are called the state vector. The state vector summarizes all the information from the present and past values of the time series relevant to the prediction of future values of the series. The observed time series are expressed as linear combinations of the state variables. The state space model is also called a Markovian representation, or a canonical representation, of a multivariate time series process. The state space approach to modeling a multivariate stationary time series is summarized in Akaike (1976).

The state space form encompasses a very rich class of models. Any Gaussian multivariate stationary time series can be written in a state space form, provided that the dimension of the predictor space is finite. In particular, any autoregressive moving average (ARMA) process has a state space representation and, conversely, any state space process can be expressed in an ARMA form (Akaike 1974). More details on the relation of the state space and ARMA forms are given in "Relation of ARMA and State Space Forms" later in this chapter.

Let $x_t$ be the $r \times 1$ vector of observed variables, after differencing (if differencing is specified) and subtracting the sample mean. Let $z_t$ be the state vector of dimension $s$, $s \ge r$, where the first $r$ components of $z_t$ consist of $x_t$. Let the notation $x_{t+k|t}$ represent the conditional expectation (or prediction) of $x_{t+k}$ based on the information available at time $t$. Then the last $s - r$ elements of $z_t$ consist of elements of $x_{t+k|t}$, where $k > 0$ is specified or determined automatically by the procedure.

There are various forms of the state space model in use. The form of the state space model used by the STATESPACE procedure is based on Akaike (1976). The model is defined by the following state transition equation:

$$
z_{t+1} = F z_t + G e_{t+1}
$$

In the state transition equation, the $s \times s$ coefficient matrix $F$ is called the transition matrix; it determines the dynamic properties of the model.

The $s \times r$ coefficient matrix $G$ is called the input matrix; it determines the variance structure of the transition equation. For model identification, the first $r$ rows and columns of $G$ are set to an $r \times r$ identity matrix.

The input vector $e_t$ is a sequence of independent, normally distributed random vectors of dimension $r$ with mean 0 and covariance matrix $\Sigma_{ee}$. The random error $e_t$ is sometimes called the innovation vector or shock vector.

In addition to the state transition equation, state space models usually include a measurement equation or observation equation that gives the observed values $x_t$ as a function of the state vector $z_t$. However, since PROC STATESPACE always includes the observed values $x_t$ in the state vector $z_t$, the measurement equation in this case merely represents the extraction of the first $r$ components of the state vector.

The measurement equation used by the STATESPACE procedure is

$$
x_t = [\, I_r \;\; 0 \,]\, z_t
$$

where $I_r$ is an $r \times r$ identity matrix. In practice, PROC STATESPACE performs the extraction of $x_t$ from $z_t$ without reference to an explicit measurement equation.
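To make the two equations concrete, the following Python sketch simulates a small model with one observed variable and a two-element state vector. It is a minimal illustration using only the standard library. The helper names and the numeric entries of the transition and input matrices are hypothetical, and the innovation covariance is assumed to be a scaled identity matrix for simplicity.

```python
import random


def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate(F, G, r, n, sigma=1.0, seed=0):
    """Simulate z[t+1] = F z[t] + G e[t+1] and return the observed
    vectors x[t], which are the first r components of z[t]."""
    rng = random.Random(seed)
    z = [0.0] * len(F)                      # start the state at its mean, 0
    xs = []
    for _ in range(n):
        # Innovation with covariance sigma^2 * I (an assumption of this sketch).
        e = [rng.gauss(0.0, sigma) for _ in range(r)]
        z = [a + b for a, b in zip(mat_vec(F, z), mat_vec(G, e))]
        xs.append(z[:r])                    # measurement: x[t] = [I_r 0] z[t]
    return xs


# Hypothetical r = 1, s = 2 model; the first r rows of G form the identity.
F = [[0.0, 1.0],
     [0.0, 0.6]]
G = [[1.0],
     [0.9]]

xs = simulate(F, G, r=1, n=200)
```

Note that the measurement step is just a slice of the state vector, which mirrors the remark that no explicit measurement equation is needed.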

In summary:

$x_t$
is an observation vector of dimension $r$.

$z_t$
is a state vector of dimension $s$, whose first $r$ elements are $x_t$ and whose last $s - r$ elements are conditional predictions of future $x_t$.

$F$
is an $s \times s$ transition matrix.

$G$
is an $s \times r$ input matrix, with the identity matrix $I_r$ forming the first $r$ rows and columns.

$e_t$
is a sequence of independent, normally distributed random vectors of dimension $r$ with mean 0 and covariance matrix $\Sigma_{ee}$.
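As a worked illustration of this form, consider a univariate ARMA(1,1) process ($r = 1$). The sign convention for the moving average term and the symbols $\phi$ and $\theta$ are assumptions made for this example only; they are not taken from the procedure's documentation.

$$
\begin{aligned}
x_t &= \phi\, x_{t-1} + e_t + \theta\, e_{t-1} \\
x_{t+1|t} &= \phi\, x_t + \theta\, e_t \\
x_{t+1} &= x_{t+1|t} + e_{t+1} \\
x_{t+2|t+1} &= \phi\, x_{t+1} + \theta\, e_{t+1} = \phi\, x_{t+1|t} + (\phi + \theta)\, e_{t+1}
\end{aligned}
$$

Taking the state vector $z_t = (x_t,\; x_{t+1|t})'$ with $s = 2$, the last two lines give

$$
\begin{bmatrix} x_{t+1} \\ x_{t+2|t+1} \end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ 0 & \phi \end{bmatrix}
\begin{bmatrix} x_t \\ x_{t+1|t} \end{bmatrix}
+
\begin{bmatrix} 1 \\ \phi + \theta \end{bmatrix}
e_{t+1}
$$

so the first row of the input matrix is the $1 \times 1$ identity, as required for identification, and the observed series is recovered as the first element of the state vector.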

### How PROC STATESPACE Works

The design of the STATESPACE procedure closely follows the modeling strategy proposed by Akaike (1976). This strategy employs canonical correlation analysis for the automatic identification of the state space model.

Following Akaike (1976), the procedure first fits a sequence of unrestricted vector autoregressive (VAR) models and computes Akaike's information criterion (AIC) for each model. The vector autoregressive models are estimated using the sample autocovariance matrices and the Yule-Walker equations. The order of the VAR model producing the smallest Akaike information criterion is chosen as the order (number of lags into the past) to use in the canonical correlation analysis.
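The order selection step can be sketched for the univariate case ($r = 1$), where the Yule-Walker equations can be solved with the Levinson-Durbin recursion. This is a simplified illustration, not the procedure's code. The function names are hypothetical, and the criterion used, $n \ln(\hat\sigma_p^2) + 2p$, is one common form of the AIC for an AR($p$) fit, assumed here for the example.

```python
import math
import random


def autocov(x, max_lag):
    """Sample autocovariances at lags 0..max_lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def yule_walker_aic(x, max_order):
    """Fit AR(0)..AR(max_order) by Yule-Walker (Levinson-Durbin recursion)
    and return (order with the smallest AIC, list of AIC values)."""
    n = len(x)
    gamma = autocov(x, max_order)
    sigma2 = gamma[0]                 # innovation variance of the AR(0) fit
    phi = []                          # AR coefficients of the current order
    aic = [n * math.log(sigma2)]
    for p in range(1, max_order + 1):
        # Reflection coefficient (the last coefficient of the AR(p) fit).
        k = (gamma[p] - sum(phi[j] * gamma[p - 1 - j]
                            for j in range(p - 1))) / sigma2
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        sigma2 *= 1.0 - k * k         # updated innovation variance
        aic.append(n * math.log(sigma2) + 2 * p)
    best = min(range(max_order + 1), key=lambda p: aic[p])
    return best, aic


# Usage on a simulated AR(1) series with coefficient 0.7 (made-up data).
rng = random.Random(1)
x, prev = [], 0.0
for _ in range(500):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

best_order, aic_values = yule_walker_aic(x, max_order=5)
```

In the multivariate case the scalar autocovariances become $r \times r$ matrices and the innovation variance becomes a covariance matrix, but the idea of comparing a penalized fit across orders is the same.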

The elements of the state vector are then determined via a sequence of canonical correlation analyses of the sample autocovariance matrices through the selected order. This analysis computes the sample canonical correlations of the past with an increasing number of steps into the future. Variables that yield significant correlations are added to the state vector; those that yield insignificant correlations are excluded from further consideration. The importance of the correlation is judged on the basis of another information criterion proposed by Akaike. See the section "Canonical Correlation Analysis" for details. If you specify the state vector explicitly, these model identification steps are omitted.

Once the state vector is determined, the state space model is fit to the data. The free parameters in the $F$, $G$, and $\Sigma_{ee}$ matrices are estimated by approximate maximum likelihood. By default, the $F$ and $G$ matrices are unrestricted, except for identifiability requirements. Optionally, conditional least-squares estimates can be computed. You can impose restrictions on elements of the $F$ and $G$ matrices.

After the parameters are estimated, forecasts are produced from the fitted state space model using the Kalman filtering technique. If differencing was specified, the forecasts are integrated to produce forecasts of the original input variables.
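The forecasting step can be illustrated with a simplified recursion. Because the first rows of the input matrix form an identity matrix, the innovation at each time is the difference between the new observation and its one-step prediction, so the state can be updated directly from the data and then propagated forward with the transition matrix. The sketch below assumes known matrices, a zero initial state, and mean-corrected stationary input. It is not the procedure's Kalman filter implementation, and the function names and numeric values are hypothetical.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def filter_and_forecast(F, G, xs, lead):
    """Run the state through the observed vectors xs, then return
    forecasts of x for 1..lead steps past the end of the data."""
    r = len(G[0])
    z = [0.0] * len(F)                       # assumed initial state
    for x in xs:
        pred = mat_vec(F, z)                 # one-step prediction F z[t]
        e = [x[i] - pred[i] for i in range(r)]   # innovation e[t+1]
        z = [p + c for p, c in zip(pred, mat_vec(G, e))]
    forecasts = []
    for _ in range(lead):
        z = mat_vec(F, z)                    # future innovations have mean 0
        forecasts.append(z[:r])              # first r components are x
    return forecasts


# Hypothetical r = 1, s = 2 model and a short mean-corrected series.
F = [[0.0, 1.0],
     [0.0, 0.6]]
G = [[1.0],
     [0.9]]
xs = [[0.4], [-0.2], [1.0]]

forecasts = filter_and_forecast(F, G, xs, lead=3)
```

If differencing was specified, forecasts such as these apply to the differenced series and still have to be integrated, and the sample mean added back, to obtain forecasts of the original variables.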