In order to proceed with time series forecasting, the data set has to be stationary. Stationarity can be checked with a number of packages; the best known (as far as I can tell) is statsmodels.

What I have not been able to work out so far is when to use one decomposition method versus the other (additive or multiplicative).

Any simple explanation?

Andrea Moro

2 Answers

One possible way to model a time series is as a three-component process:
trend, seasonality and noise.

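$X_t = M(\mathrm{TREND}_t, \mathrm{SEASON}_t, \mathrm{NOISE}_t)$.

An additive model assumes the components combine linearly, i.e.:
$X_t = \mathrm{TREND}_t + \mathrm{SEASON}_t + \mathrm{NOISE}_t$.

A multiplicative model assumes the components interact by scaling one another:
$X_t = \mathrm{TREND}_t \cdot \mathrm{SEASON}_t \cdot \mathrm{NOISE}_t$.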

If the data or prior knowledge suggests that the magnitude (or direction) of the trend affects the noise or the seasonality, or that there is any other cross relation between the components, it makes sense to use a multiplicative model.
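As a rough illustration of that difference, the sketch below builds two noise-free toy series, one additive and one multiplicative, and compares the peak-to-trough range within each season. The constants (`PERIOD`, `N_YEARS`, the trend slope, the seasonal amplitudes) and the helper names are made up for this example; it only uses the Python standard library.

```python
import math

PERIOD = 12   # assumed season length (e.g. monthly data)
N_YEARS = 6   # assumed number of complete seasons

def make_series(kind):
    """Build a noise-free toy series with a linear trend and a sine-shaped season."""
    series = []
    for t in range(PERIOD * N_YEARS):
        trend = 100.0 + 2.0 * t
        wave = math.sin(2 * math.pi * t / PERIOD)
        if kind == "additive":
            series.append(trend + 10.0 * wave)         # swing has a fixed size
        else:
            series.append(trend * (1.0 + 0.1 * wave))  # swing is a fraction of the trend
    return series

def seasonal_ranges(series):
    """Peak-to-trough range inside each complete season."""
    return [
        max(series[i:i + PERIOD]) - min(series[i:i + PERIOD])
        for i in range(0, len(series), PERIOD)
    ]

additive_ranges = seasonal_ranges(make_series("additive"))
multiplicative_ranges = seasonal_ranges(make_series("multiplicative"))

print("additive:      ", [round(r, 2) for r in additive_ranges])
print("multiplicative:", [round(r, 2) for r in multiplicative_ranges])
```

In the additive series the range stays the same from season to season, while in the multiplicative series it grows with the level of the trend. On real data, a plot showing seasonal swings that widen as the level rises points towards the multiplicative form. In statsmodels, `seasonal_decompose` takes a `model` argument that accepts `"additive"` or `"multiplicative"`.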

See this related question.

yoav_aaa

You say you have managed to make your data stationary, so I would probably say an additive model is your best starting point. You could simply plot your stationary data and check that the variance doesn't increase with the nominal values. High variance will mean higher errors for a linear regression.

When we have a basic regression model, like the following:

$$ y = \beta_1x_1 + \beta_2x_2 + \epsilon $$

the variance of the residual error $\epsilon$ is assumed to be constant (assuming the model itself is accurate). The spread of $\epsilon$ should not get larger when the covariates $x_1$ and $x_2$ take larger values. In other words, you expect homoscedasticity: the error term has the same variance across all values of the model covariates. This is what you will easily spot on the plot of your processed (stationary) data.
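The visual check described above can also be approximated numerically. The following is a minimal sketch on synthetic data, not a formal test: it sorts residuals by fitted value and compares the residual standard deviation in the upper half with that in the lower half. The function name `spread_ratio` and the toy data are assumptions made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the toy data is reproducible

def spread_ratio(fitted, residuals):
    """Residual std dev for the upper half of fitted values divided by the lower half."""
    pairs = sorted(zip(fitted, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)

fitted = [float(v) for v in range(1, 201)]

# Homoscedastic toy residuals: same spread at every level
constant_noise = [random.gauss(0.0, 1.0) for _ in fitted]

# Heteroscedastic toy residuals: spread proportional to the fitted level
growing_noise = [random.gauss(0.0, 0.05 * f) for f in fitted]

ratio_constant = spread_ratio(fitted, constant_noise)
ratio_growing = spread_ratio(fitted, growing_noise)

print("constant spread ratio:", round(ratio_constant, 2))
print("growing spread ratio: ", round(ratio_growing, 2))
```

A ratio close to 1 is consistent with homoscedasticity; a ratio well above 1 suggests that the error spread grows with the level, which is the pattern that favours a multiplicative model or a log transform.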

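If you made your time series stationary by taking logarithms (often combined with differencing, which is a separate step), then an additive model of the logged variables corresponds to a multiplicative model of the original variables, because the logarithm of a product is the sum of the logarithms.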
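The link between the two forms can be checked with a few lines of arithmetic. This is a small sketch with made-up component values; it assumes all components are strictly positive, since the logarithm is otherwise undefined.

```python
import math

# Made-up, strictly positive component values for a single time step
trend, season, noise = 120.0, 1.15, 0.97

x = trend * season * noise  # multiplicative model on the original scale

log_of_product = math.log(x)
sum_of_logs = math.log(trend) + math.log(season) + math.log(noise)

# On the log scale the same components combine additively
print(math.isclose(log_of_product, sum_of_logs))
```

So fitting an additive model to the logged series is a common way to handle a multiplicative structure, provided the series contains no zero or negative values.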

Just to be clear, if you still seem to have heteroscedasticity, with the spread of $\epsilon$ varying greatly, this might imply that your model itself is misspecified, e.g. that an important factor is missing.

n1k31t4