VARMAX models

This is a brief introduction notebook to VARMAX models in Statsmodels. The VARMAX model is generically specified as: $$ y_t = \nu + A_1 y_{t-1} + \dots + A_p y_{t-p} + B x_t + \epsilon_t + M_1 \epsilon_{t-1} + \dots M_q \epsilon_{t-q} $$

where $y_t$ is a $\text{k_endog} \times 1$ vector.

In [1]:
%matplotlib inline
In [2]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
The history saving thread hit an unexpected error (OperationalError('attempt to write a readonly database')).History will not be written to the database.
/opt/build/statsmodels-0.8.0/.pybuild/cpython3_3.7_statsmodels/build/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools
In [3]:
dta = sm.datasets.webuse('lutkepohl2', 'http://www.stata-press.com/data/r12/')
dta.index = dta.qtr
endog = dta.ix['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]
---------------------------------------------------------------------------
ConnectionRefusedError                    Traceback (most recent call last)
/usr/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1323                 h.request(req.get_method(), req.selector, req.data, headers,
-> 1324                           encode_chunked=req.has_header('Transfer-encoding'))
   1325             except OSError as err: # timeout error

/usr/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1243         """Send a complete request to the server."""
-> 1244         self._send_request(method, url, body, headers, encode_chunked)
   1245 

/usr/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1289             body = _encode(body, 'body')
-> 1290         self.endheaders(body, encode_chunked=encode_chunked)
   1291 

/usr/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1238             raise CannotSendHeader()
-> 1239         self._send_output(message_body, encode_chunked=encode_chunked)
   1240 

/usr/lib/python3.7/http/client.py in _send_output(self, message_body, encode_chunked)
   1025         del self._buffer[:]
-> 1026         self.send(msg)
   1027 

/usr/lib/python3.7/http/client.py in send(self, data)
    965             if self.auto_open:
--> 966                 self.connect()
    967             else:

/usr/lib/python3.7/http/client.py in connect(self)
    937         self.sock = self._create_connection(
--> 938             (self.host,self.port), self.timeout, self.source_address)
    939         self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

/usr/lib/python3.7/socket.py in create_connection(address, timeout, source_address)
    726     if err is not None:
--> 727         raise err
    728     else:

/usr/lib/python3.7/socket.py in create_connection(address, timeout, source_address)
    715                 sock.bind(source_address)
--> 716             sock.connect(sa)
    717             # Break explicitly a reference cycle

ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

URLError                                  Traceback (most recent call last)
<ipython-input-3-3483d3cc46f9> in <module>()
----> 1 dta = sm.datasets.webuse('lutkepohl2', 'http://www.stata-press.com/data/r12/')
      2 dta.index = dta.qtr
      3 endog = dta.ix['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]

/opt/build/statsmodels-0.8.0/.pybuild/cpython3_3.7_statsmodels/build/statsmodels/datasets/utils.py in webuse(data, baseurl, as_df)
     47 
     48     url = urljoin(baseurl, data+'.dta')
---> 49     dta = urlopen(url)
     50     dta = BytesIO(dta.read())  # make it truly file-like
     51     if as_df:  # could make this faster if we don't process dta twice?

/usr/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220     else:
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 
    224 def install_opener(opener):

/usr/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
    523             req = meth(req)
    524 
--> 525         response = self._open(req, data)
    526 
    527         # post-process response

/usr/lib/python3.7/urllib/request.py in _open(self, req, data)
    541         protocol = req.type
    542         result = self._call_chain(self.handle_open, protocol, protocol +
--> 543                                   '_open', req)
    544         if result:
    545             return result

/usr/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    501         for handler in handlers:
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:
    505                 return result

/usr/lib/python3.7/urllib/request.py in http_open(self, req)
   1350 
   1351     def http_open(self, req):
-> 1352         return self.do_open(http.client.HTTPConnection, req)
   1353 
   1354     http_request = AbstractHTTPHandler.do_request_

/usr/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1324                           encode_chunked=req.has_header('Transfer-encoding'))
   1325             except OSError as err: # timeout error
-> 1326                 raise URLError(err)
   1327             r = h.getresponse()
   1328         except:

URLError: <urlopen error [Errno 111] Connection refused>

Model specification

The VARMAX class in Statsmodels allows estimation of VAR, VMA, and VARMA models (through the order argument), optionally with a constant term (via the trend argument). Exogenous regressors may also be included (as usual in Statsmodels, by the exog argument), and in this way a time trend may be added. Finally, the class allows measurement error (via the measurement_error argument) and allows specifying either a diagonal or unstructured innovation covariance matrix (via the error_cov_type argument).

Example 1: VAR

Below is a simple VARX(2) model in two endogenous variables and an exogenous series, but no constant term. Notice that we needed to allow for more iterations than the default (which is maxiter=50) in order for the likelihood estimation to converge. This is not unusual in VAR models which have to estimate a large number of parameters, often on a relatively small number of time series: this model, for example, estimates 27 parameters off of 75 observations of 3 variables.

In [4]:
exog = endog['dln_consump']
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2,0), trend='nc', exog=exog)
res = mod.fit(maxiter=1000)
print(res.summary())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-d27bc18e74bd> in <module>()
----> 1 exog = endog['dln_consump']
      2 mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2,0), trend='nc', exog=exog)
      3 res = mod.fit(maxiter=1000)
      4 print(res.summary())

NameError: name 'endog' is not defined

From the estimated VAR model, we can plot the impulse response functions of the endogenous variables.

In [5]:
ax = res.impulse_responses(10, orthogonalized=True).plot(figsize=(13,3))
ax.set(xlabel='t', title='Responses to a shock to `dln_inv`');
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-5a59fc5576cf> in <module>()
----> 1 ax = res.impulse_responses(10, orthogonalized=True).plot(figsize=(13,3))
      2 ax.set(xlabel='t', title='Responses to a shock to `dln_inv`');

NameError: name 'res' is not defined

Example 2: VMA

A vector moving average model can also be formulated. Below we show a VMA(2) on the same data, but where the innovations to the process are uncorrelated. In this example we leave out the exogenous regressor but now include the constant term.

In [6]:
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(0,2), error_cov_type='diagonal')
res = mod.fit(maxiter=1000)
print(res.summary())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-1e65926ed8c3> in <module>()
----> 1 mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(0,2), error_cov_type='diagonal')
      2 res = mod.fit(maxiter=1000)
      3 print(res.summary())

NameError: name 'endog' is not defined

Caution: VARMA(p,q) specifications

Although the model allows estimating VARMA(p,q) specifications, these models are not identified without additional restrictions on the representation matrices, which are not built-in. For this reason, it is recommended that the user proceed with error (and indeed a warning is issued when these models are specified). Nonetheless, they may in some circumstances provide useful information.

In [7]:
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(1,1))
res = mod.fit(maxiter=1000)
print(res.summary())
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-13eb8b06c1e5> in <module>()
----> 1 mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(1,1))
      2 res = mod.fit(maxiter=1000)
      3 print(res.summary())

NameError: name 'endog' is not defined