Autoregressive integrated moving average

In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both models are fitted to time series data, either to better understand the data or to forecast future points in the series. ARIMA models are applied in some cases where the data show evidence of non-stationarity in the mean (but not in the variance or autocovariance), where an initial differencing step (corresponding to the "integrated" part of the model) can be applied one or more times to eliminate the non-stationarity of the mean function (i.e., the trend).[1] When a time series shows seasonality, seasonal differencing[2] can be applied to eliminate the seasonal component.

According to Wold's decomposition theorem,[3][4][5] the ARMA model is theoretically sufficient to describe a regular (a.k.a. purely nondeterministic[5]) wide-sense stationary time series. This motivates making a non-stationary time series stationary, e.g., by differencing, before an ARMA model is applied.[6] If the time series contains a predictable sub-process (a.k.a. pure sine or complex-valued exponential process[4]), the predictable component is treated as a non-zero-mean but periodic (i.e., seasonal) component in the ARIMA framework, so that it is eliminated by seasonal differencing.
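The effect of ordinary and seasonal differencing can be illustrated with a minimal Python sketch. The helper name `difference` and the synthetic series are assumptions made for illustration only; they do not come from the cited sources.

```python
def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical data: a linear trend plus a repeating pattern of period 4.
season = [0.0, 2.0, -1.0, 1.0]
series = [0.5 * t + season[t % 4] for t in range(16)]

# Seasonal differencing (lag 4) cancels the periodic pattern and turns
# the linear trend into a constant level.
seasonal_diff = difference(series, lag=4)

# One further ordinary difference removes that constant level.
stationary = difference(seasonal_diff, lag=1)

# A quadratic trend needs two ordinary differences (d = 2) to become constant.
quadratic = [float(t * t) for t in range(8)]
twice_diff = difference(difference(quadratic))

print(seasonal_diff)
print(stationary)
print(twice_diff)
```

In this noise-free example the differenced series are exactly constant; with real data the aim is only that the mean no longer depends on time.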

The autoregressive (AR) part of ARIMA indicates that the evolving variable of interest is regressed on its own lagged (i.e., prior) values. The moving average (MA) part indicates that the regression error is a linear combination of error terms whose values occurred contemporaneously and at various times in the past.[7] The I (for "integrated") indicates that each data value has been replaced with the difference between it and the previous value (and this differencing process may have been performed more than once). The purpose of each of these features is to make the model fit the data as well as possible.

Non-seasonal ARIMA models are generally denoted ARIMA(p,d,q) where parameters p, d, and q are non-negative integers, p is the order (number of time lags) of the autoregressive model, d is the degree of differencing (the number of times the data have had past values subtracted), and q is the order of the moving-average model. Seasonal ARIMA models are usually denoted ARIMA(p,d,q)(P,D,Q)m, where m refers to the number of periods in each season, and the uppercase P,D,Q refer to the autoregressive, differencing, and moving average terms for the seasonal part of the ARIMA model.[8][2]
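As a sketch of what the ARIMA(p,d,q) notation stands for, the non-seasonal model is commonly written with a lag operator. The symbols below (L, φ, θ, ε, X) are notational assumptions introduced here and are not defined in the text above.

```latex
% Assumed notation: L is the lag operator (L X_t = X_{t-1}),
% \varphi_i are the AR coefficients, \theta_i are the MA coefficients,
% and \varepsilon_t is a white-noise error term.
\left(1 - \sum_{i=1}^{p} \varphi_i L^{i}\right) (1 - L)^{d} X_t
  = \left(1 + \sum_{i=1}^{q} \theta_i L^{i}\right) \varepsilon_t
```

Under the same assumptions, the seasonal model multiplies the differencing factor by (1 - L^m)^D and adds analogous seasonal polynomials in L^m of orders P and Q.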

When two of the three parameters are zero, the model may be referred to by the non-zero parameter, dropping "AR", "I" or "MA" from the acronym describing the model. For example, ARIMA(1,0,0) is AR(1), ARIMA(0,1,0) is I(1), and ARIMA(0,0,1) is MA(1).
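The difference between AR(1), I(1), and MA(1) can be seen by feeding each a single unit shock and watching how it propagates. The following is a minimal sketch; the function name `simulate_special_case`, the coefficient value 0.5, and the zero initial conditions are illustrative assumptions.

```python
def simulate_special_case(noise, kind, coef=0.5):
    """Generate an AR(1), I(1), or MA(1) series from a given error sequence.

    Initial conditions (previous value and previous error) are assumed zero.
    """
    out = []
    prev_x, prev_e = 0.0, 0.0
    for e in noise:
        if kind == "AR1":    # x[t] = coef * x[t-1] + e[t]
            x = coef * prev_x + e
        elif kind == "I1":   # x[t] = x[t-1] + e[t]  (random walk)
            x = prev_x + e
        elif kind == "MA1":  # x[t] = e[t] + coef * e[t-1]
            x = e + coef * prev_e
        else:
            raise ValueError(f"unknown model kind: {kind}")
        out.append(x)
        prev_x, prev_e = x, e
    return out


# A single unit shock at t = 0, followed by no further errors.
shock = [1.0, 0.0, 0.0, 0.0]

ar1 = simulate_special_case(shock, "AR1")  # shock decays geometrically
i1 = simulate_special_case(shock, "I1")    # shock persists indefinitely
ma1 = simulate_special_case(shock, "MA1")  # shock vanishes after one lag

print(ar1, i1, ma1)
```

With these assumed settings, the AR(1) response shrinks by the coefficient at each step, the I(1) response never decays, and the MA(1) response is exactly zero after one period.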

ARIMA models can be estimated following the Box–Jenkins approach.

  1. ^ For further information on Stationarity and Differencing see https://www.otexts.org/fpp/8/1
  2. ^ a b Hyndman, Rob J; Athanasopoulos, George. 8.9 Seasonal ARIMA models. oTexts. Retrieved 19 May 2015.
  3. ^ Hamilton, James (1994). Time Series Analysis. Princeton University Press. ISBN 9780691042893.
  4. ^ a b Papoulis, Athanasios (2002). Probability, Random Variables, and Stochastic processes. Tata McGraw-Hill Education.
  5. ^ a b Triacca, Umberto (19 Feb 2021). "The Wold Decomposition Theorem" (PDF). Archived (PDF) from the original on 2016-03-27.
  6. ^ Wang, Shixiong; Li, Chongshou; Lim, Andrew (2019-12-18). "Why Are the ARIMA and SARIMA not Sufficient". arXiv:1904.07632 [stat.AP].
  7. ^ Box, George E. P. (2015). Time Series Analysis: Forecasting and Control. WILEY. ISBN 978-1-118-67502-1.
  8. ^ "Notation for ARIMA Models". Time Series Forecasting System. SAS Institute. Retrieved 19 May 2015.