Notes for Forecasting (fpp)
Packages
- forecast
- tseries
- fma: ts data
- expsmooth: more ts data
- lmtest: some regression functions
data
beer <- window(ausbeer, start = 1992)
Simple forecasting methods
Average method:
meanf(x, h = 20)
Naive method: every future value equals the last observation.
\[\hat{y}_{T+h|T} = y_T\]
naive(x, h = 20)
rwf(x, h = 20)
Seasonal naive method:
snaive(x, h = 20)
Drift method:
Equivalent to extrapolating a line drawn between first and last observations.
rwf(x, drift = TRUE, h = 20)
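A minimal sketch comparing these benchmark methods on the beer series defined above (the horizon and colours are arbitrary choices):
beerfit1 <- meanf(beer, h = 20)
beerfit2 <- naive(beer, h = 20)
beerfit3 <- snaive(beer, h = 20)
beerfit4 <- rwf(beer, drift = TRUE, h = 20)
plot(beerfit1, main = "Forecasts for quarterly beer production")  # mean method, with intervals
lines(beerfit2$mean, col = 2)  # naive
lines(beerfit3$mean, col = 3)  # seasonal naive
lines(beerfit4$mean, col = 5)  # drift
legend("topright", lty = 1, col = c(4, 2, 3, 5),
       legend = c("Mean", "Naive", "Seasonal naive", "Drift"))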
Measures of forecast accuracy
\begin{eqnarray}
MAE & = & T^{-1} \sum_{t=1}^{T} |y_t - \hat{y}_{t|t-1}| \\
MSE & = & T^{-1} \sum_{t=1}^{T} (y_t - \hat{y}_{t|t-1})^2 \\
MAPE & = & 100\, T^{-1} \sum_{t=1}^{T} |y_t - \hat{y}_{t|t-1}| / |y_t|
\end{eqnarray}
MAPE is scale-independent, but it is only sensible if all $y_t > 0$ (it is undefined when $y_t = 0$).
Mean absolute scaled error (MASE)
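For a non-seasonal series, MASE scales the errors by the in-sample MAE of one-step naive forecasts (Hyndman & Koehler 2006); for seasonal data the denominator uses the seasonal naive errors $|y_t - y_{t-m}|$ instead:
\[MASE = \frac{T^{-1} \sum_{t=1}^{T} |y_t - \hat{y}_{t|t-1}|}{(T-1)^{-1} \sum_{t=2}^{T} |y_t - y_{t-1}|}\]
A value below 1 means the forecasts beat the in-sample naive method on average.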
Training set vs test set
80% vs 20%
beer3 <- window(ausbeer,start=1992,end=2005.99)
beer4 <- window(ausbeer,start=2006)
fit1 <- meanf(beer3,h=20)
fit2 <- rwf(beer3,h=20)
accuracy(fit1,beer4)
accuracy(fit2,beer4)
Accuracy measures computed for errors in test set only.
Time series graphics
- time plot: plot(), plot.ts()
- seasonal plot: seasonplot(beer, year.labels = TRUE)
- seasonal subseries plot: monthplot()
- lag plot: lag.plot(beer, lags = 9)
- ACF: Acf()
Autocorrelation
\begin{eqnarray}
c_k & = & T^{-1} \sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y}) \\
r_k & = & c_k / c_0
\end{eqnarray}
Check $r_1$ up to $r_k$ to detect a seasonal pattern.
Acf(beer)
A slowly decaying ACF indicates trend.
ACF peaks at lags 12, 24, 36 indicate seasonality of period 12.
White Noise
Example of pigs slaughtered:
- Difficult to detect a pattern in the time plot.
- ACF shows some significant autocorrelation at lags 1, 2, and 3.
- $r_{12}$ is relatively large although not significant; this may indicate some slight seasonality.
These features show the series is not a white noise series.
Dow-Jones naive forecasts revisited
\[\hat{y}_{t|t-1} = y_{t-1}, \qquad e_t = y_t - y_{t-1}\]
Check the time plot of $e_t$ and its ACF.
Portmanteau tests
Box-Pierce test
\[Q = T \sum_{k=1}^{h} r_k^2\]
where $h$ is the maximum lag being considered ($h = 10$ for non-seasonal data, $h = 2m$ for seasonal data) and $T$ is the number of observations.
- If each $r_k$ is close to zero, $Q$ will be small.
- If some $r_k$ values are large, $Q$ will be large.
Ljung-Box test
\[Q^* = T(T+2) \sum_{k=1}^{h} (T-k)^{-1} r_k^2\]
Better performance than Box-Pierce, especially in small samples.
If the series is white noise, $Q^*$ has a chi-square distribution with $h - K$ df, where $K$ is the number of parameters in the model (for raw data, $K = 0$).
Box.test(res, lag = 10, fitdf = 0)
Box.test(res, lag = 10, fitdf = 0, type = "Lj")
Testing residuals
beer <- window(ausbeer,start=1992)
fc <- snaive(beer)
res <- residuals(fc)
Acf(res)
Box.test(res, lag=8, fitdf=0, type="Lj")
Time series decomposition
\[Y_t = f(S_t, T_t, E_t)\]
$S_t$: seasonal component
$T_t$: trend-cycle component
$E_t$: remainder (irregular) component
If the variation due to seasonality is not of primary interest, the seasonally adjusted series can be useful. Consequently, employment data (and many other economic series) are usually seasonally adjusted.
STL
stl()
decompose()
plot(decompose(hsales))
plot(stl(hsales,s.window="periodic"))
plot(stl(hsales,s.window=15))
fit <- stl(data, t.window = 15, s.window = 5)  # or s.window = "periodic"
plot(fit)
t.window: controls wiggliness of trend
s.window: controls variation in seasonal component
Seasonal adjustment
plot(hsales,col="gray")
fit <- stl(hsales,s.window=15)
hsales.sa <- seasadj(fit)
lines(hsales.sa, col="red")
fit <- stl(elecequip, t.window=15,
s.window="periodic", robust=TRUE)
eeadj <- seasadj(fit)
plot(naive(eeadj), xlab="New orders index")
fcast <- forecast(fit, method="naive")
plot(fcast, ylab="New orders index")
Exponential smoothing
ses(x) : simple exponential smoothing, no trend
holt(x): Holt's method, linear trend
holt(x, exponential = TRUE): exponential trend method
holt(x, damped = TRUE): damped trend method
holt(x, damped = TRUE, exponential = TRUE)
Simple exponential smoothing
Random walk forecasts:
\[\hat{y}_{T+1|T} = y_T\]
Average:
\[\hat{y}_{T+1|T} = T^{-1} \sum_{t=1}^{T} y_t\]
Simple exponential smoothing sits between these two extremes:
\[\hat{y}_{T+1|T} = \alpha y_T + \alpha(1-\alpha) y_{T-1} + \alpha(1-\alpha)^2 y_{T-2} + \cdots\]
Weighted average form:
\[\hat{y}_{T+1|T} = \alpha y_T + (1-\alpha) \hat{y}_{T|T-1}\]
Exponentially weighted average:
\[\hat{y}_{T+1|T} = \sum_{j=0}^{T-1} \alpha(1-\alpha)^j y_{T-j} + (1-\alpha)^T \ell_0\]
Common to set $\ell_0 = y_1$.
Optimization
Choose $\alpha$ and $\ell_0$ by minimizing the MSE:
\[MSE = (T-1)^{-1} \sum_{t=2}^{T} (y_t - \hat{y}_{t|t-1})^2\]
No closed-form solution; use numerical optimization.
SES in R
fit1 <- ses(oildata, alpha=0.2,
initial="simple", h=3)
fit2 <- ses(oildata, alpha=0.6,
initial="simple", h=3)
fit3 <- ses(oildata, h=3)
accuracy(fit1)
accuracy(fit2)
accuracy(fit3)
Non-seasonal trend methods
Holt’s local trend method
\begin{eqnarray}
\hat{y}_{t+h|t} & = & \ell_t + h b_t \\
\ell_t & = & \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1}) \\
b_t & = & \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*) b_{t-1}
\end{eqnarray}
find $\alpha$ and $\beta^*$ by minimizing the MSE
fit1 <- holt(strikes)
plot(fit1$model)
plot(fit1, plot.conf=FALSE)
lines(fitted(fit1), col="red")
fit1$model
fit2 <- ses(strikes)
plot(fit2$model)
plot(fit2, plot.conf=FALSE)
lines(fit1$mean, col="red")
accuracy(fit1)
accuracy(fit2)
Exponential trend method
\begin{eqnarray}
\hat{y}_{t+h|t} & = & \ell_t b_t^h \\
\ell_t & = & \alpha y_t + (1 - \alpha) \ell_{t-1} b_{t-1} \\
b_t & = & \beta^* (\ell_t / \ell_{t-1}) + (1 - \beta^*) b_{t-1}
\end{eqnarray}
$b_t$: relative growth rate
damped trend
\begin{eqnarray}
\hat{y}_{t+h|t} & = & \ell_t + (\phi + \phi^2 + \cdots + \phi^h) b_t \\
\ell_t & = & \alpha y_t + (1 - \alpha)(\ell_{t-1} + \phi b_{t-1}) \\
b_t & = & \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*) \phi b_{t-1}
\end{eqnarray}
$\phi$ damps the trend so that forecasts approach a constant as $h \to \infty$.
Damped trend method often gives better forecasts than linear trend.
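To see the difference, fit both variants on the same trended series and compare forecast paths (a sketch reusing the strikes data from above):
fit_lin <- holt(strikes, h = 15)
fit_damp <- holt(strikes, damped = TRUE, h = 15)
plot(fit_lin, plot.conf = FALSE, main = "Linear vs damped trend")
lines(fit_damp$mean, col = "red")  # damped forecasts flatten out toward a constant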
Holt-Winters
- capture seasonality
additive method
\begin{eqnarray}
\hat{y}_{t+h|t} & = & \ell_t + h b_t + s_{t-m+h_m^+} \\
\ell_t & = & \alpha (y_t - s_{t-m}) + (1 - \alpha)(\ell_{t-1} + b_{t-1}) \\
b_t & = & \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*) b_{t-1} \\
s_t & = & \gamma (y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma) s_{t-m} \\
h_m^+ & = & \lfloor (h-1) \bmod m \rfloor + 1
\end{eqnarray}
Multiplicative model
\begin{eqnarray}
\hat{y}_{t+h|t} & = & (\ell_t + h b_t) s_{t-m+h_m^+} \\
\ell_t & = & \alpha (y_t / s_{t-m}) + (1 - \alpha)(\ell_{t-1} + b_{t-1}) \\
b_t & = & \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*) b_{t-1} \\
s_t & = & \gamma (y_t / (\ell_{t-1} + b_{t-1})) + (1 - \gamma) s_{t-m} \\
h_m^+ & = & \lfloor (h-1) \bmod m \rfloor + 1
\end{eqnarray}
Optimization: estimate $\alpha$, $\beta^*$, $\gamma$, $\ell_0$, $b_0$, $s_0, s_{-1}, \dots, s_{1-m}$ by minimizing MSE.
Damped Holt-Winters multiplicative method
\begin{eqnarray}
\hat{y}_{t+h|t} & = & (\ell_t + (\phi + \phi^2 + \cdots + \phi^h) b_t) s_{t-m+h_m^+} \\
\ell_t & = & \alpha (y_t / s_{t-m}) + (1 - \alpha)(\ell_{t-1} + \phi b_{t-1}) \\
b_t & = & \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*) \phi b_{t-1} \\
s_t & = & \gamma (y_t / (\ell_{t-1} + \phi b_{t-1})) + (1 - \gamma) s_{t-m} \\
h_m^+ & = & \lfloor (h-1) \bmod m \rfloor + 1
\end{eqnarray}
ETS(Error, Trend, Seasonal)
Trend \ Seasonal | N (none) | A (additive) | M (multiplicative)
---|---|---|---
N (none) | N,N | N,A | N,M
A (additive) | A,N | A,A | A,M
Ad (additive damped) | Ad,N | Ad,A | Ad,M
M (multiplicative) | M,N | M,A | M,M
Md (multiplicative damped) | Md,N | Md,A | Md,M
R functions
ses(): (N,N)
holt(): (A, N), (Ad, N), (M,N), (Md,N)
hw(): (A,A), (Ad, A), (A,M), (Ad,M), (M,M), (Md,M)
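For example, a damped multiplicative Holt-Winters fit (a sketch; austourists is quarterly data from the fpp package):
fit <- hw(austourists, seasonal = "multiplicative", damped = TRUE, h = 8)
plot(fit)     # forecasts with prediction intervals
summary(fit)  # smoothing parameters and accuracy measures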
Exponential smoothing state space models (forecast package)
The previous methods do not use a likelihood.
State space models adopt distributional assumptions and can be estimated by MLE. Model comparison can then be done by AIC.
All exponential smoothing methods (including non-linear ones) are optimal forecasts from innovations state space models; see Hyndman et al. (2008) for a comprehensive treatment.
ETS(A,N,N): simple exponential smoothing with additive errors
ETS(A,A,N): Holt's linear method with additive errors
ETS(A,A,A): additive Holt-Winters method with additive errors
ETS(M,A,M): multiplicative Holt-Winters method with multiplicative errors
ETS(A,Ad,N): damped trend method with additive errors
Innovations state space models
- All the methods can be written in this state space form.
- Prediction intervals can be obtained by simulating many future sample paths.
- For many models, the prediction intervals can be obtained analytically as well.
- Additive and multiplicative error versions give the same point forecasts but different prediction intervals.
- Estimation is handled via maximizing the likelihood of the data given the model.
AIC
AIC = -2 log(L) + 2p
AICc (corrected for small-sample bias)
AICc = AIC + 2(p+1)(p+2)/(n-p)
- A difference in AIC values of 2 or less is not regarded as substantial and you may choose the simpler but non-optimal model.
BIC
BIC = -2 log(L) + p*log(n)
Procedures
From Hyndman et al (2008):
- Apply each of 30 methods that are appropriate to the data. Estimate parameters and initial values using MLE.
- Select best method using AIC. Produce forecasts using best method.
- Obtain prediction intervals using underlying state space model.
Sample R code:
fit <- ets(ausbeer)
fit2 <- ets(ausbeer,model="AAA",damped=FALSE)
fcast1 <- forecast(fit, h=20)
fcast2 <- forecast(fit2, h=20)
ets(y, model="ZZZ", damped=NULL, alpha=NULL,
beta=NULL, gamma=NULL, phi=NULL,
additive.only=FALSE,
lower=c(rep(0.0001,3),0.80),
upper=c(rep(0.9999,3),0.98),
opt.crit=c("lik","amse","mse","sigma"), nmse=3,
bounds=c("both","usual","admissible"),
ic=c("aic","aicc","bic"), restrict=TRUE)
The ets() function:
- Automatically chooses a model by default using the AIC, AICc or BIC.
- Can handle any combination of trend, seasonality and damping
- Produces prediction intervals for every model
- Ensures the parameters are admissible (equivalent to invertible)
- Produces an object of class ets.
Related functions
plot(fit)
accuracy(fit)
forecast() returns forecasts when applied to an ets object (or the output from many other time series models).
If you use forecast() directly on data, it will select an ETS model automatically and then return forecasts.
Basic Steps in a forecasting task
https://www.otexts.org/fpp/1/6
- preliminary analysis: always start by graphing the data (see chapters 2 and 6)
  - consistent pattern?
  - trend?
  - seasonality?
  - business cycles?
  - outliers?
- choose and fit models
  - regression model (chapters 4 and 5)
  - exponential smoothing model (chapter 7)
  - Box-Jenkins ARIMA model (chapter 8)
  - others, e.g. dynamic regression models, neural networks (chapter 9)
- use and evaluate a model
Plot
- time plots
  - trend
  - seasonality: fixed and known length
  - cycle (chapter 6): unknown length, usually longer than the seasonal period
- first identify the time series patterns in the data, then choose a method that can capture them properly
- seasonal plots: seasonplot(), monthplot()
- scatterplots
Statistics
- ACF: check $r_4$ for a seasonal pattern in quarterly data
- We expect 95% of the spikes in the ACF to lie within $\pm 2/\sqrt{T}$, where $T$ is the length of the time series. It is common to plot these bounds on a graph of the ACF. If there are one or more large spikes outside these bounds, or if more than 5% of spikes are outside them, then the series is probably not white noise. A sketch of this check follows.
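A minimal sketch of the check on simulated white noise:
set.seed(1)
T <- 100
wn <- ts(rnorm(T))
Acf(wn)  # dashed lines mark +/- 2/sqrt(T)
sum(abs(acf(wn, plot = FALSE)$acf[-1]) > 2/sqrt(T))  # spikes outside the bounds; expect about 5%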
forecast package functions
meanf()
naive(), snaive()
rwf()
croston()
stlf()
ses()
holt(), hw()
splinef()
thetaf()
forecast()
Regression
fit.ex4 <- tslm(austa ~ trend)
f <- forecast(fit.ex4, h=5,level=c(80,95))
plot(f, ylab="International tourist arrivals to Australia (millions)", xlab="t")
lines(fitted(fit.ex4),col="blue")
summary(fit.ex4)
- outliers: use a dummy variable to remove the effect rather than omitting the observation
- public holidays: dummy variable
- Easter: dummy variable for that month
- ex post and ex ante forecasting
- intervention variables: spike variable or change of slope
- selecting predictors:
  - not recommended to drop variables that show no apparent relationship in a scatterplot
  - not valid to disregard all variables whose p-value is greater than 0.05
- CV(fit): compares candidate regression models using cross-validation and information criteria
- scatterplots of residuals against predictors: if they show a pattern, a non-linear functional form may be needed
- scatterplot of residuals against fitted values: if it shows a pattern, there may be heteroscedasticity, and a transformation of the forecast variable may be needed (see the sketch after this list)
- test for autocorrelation: the Durbin-Watson statistic should be around 2
dwtest(fit, alt = "two.sided")
bgtest(fit, 5)  # test for autocorrelations up to lag 5
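A minimal sketch of the residual plots described above, assuming fit is an lm/tslm object:
res <- residuals(fit)
# residuals vs fitted values: a funnel shape suggests heteroscedasticity
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
# residuals vs each predictor: curvature suggests a non-linear form is needed
X <- model.matrix(fit)
for (v in colnames(X)[-1]) plot(X[, v], res, xlab = v, ylab = "Residuals")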
forecast() function
- Takes a time series or time series model as its main argument.
- If the first argument is of class ts, returns forecasts from the automatic ETS algorithm.
- Methods exist for objects of class Arima, ar, HoltWinters, StructTS, etc.
- Output is of class forecast.
forecast(y)
Confounding
Confounding is not really a problem for forecasting, as we can still compute forecasts without needing to separate out the effects of the predictors. However, it becomes a problem with scenario forecasting, since the scenarios should take account of the relationships between predictors. It is also a problem if some historical analysis of the contributions of various predictors is required.
Multicollinearity
- perfect correlation: not possible to estimate the model
- high correlation: difficult in estimation
- if uncertainty is large, statistical tests (e.g., t-tests) are unreliable, and it is not possible to make accurate statements about the contribution of each separate predictor to the forecast
Transformation
If the data show different variation at different levels of the series, then a transformation can be useful.
Logarithms are useful because changes in a log value correspond to relative (percentage) changes on the original scale.
Box-Cox transformation
- if some $y_t < 0$, no power transformation is possible unless all values are adjusted by adding a constant
- if some $y_t = 0$, then we must have $\lambda > 0$
- a Box-Cox transformation followed by an additive ETS model is often better than an ETS model without transformation
- a Box-Cox transformation followed by STL + ETS is often better than an ETS model without transformation
- it makes no sense to use a Box-Cox transformation and a non-additive ETS model
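For reference, the Box-Cox family used here:
\[w_t = \begin{cases} \log(y_t) & \text{if } \lambda = 0 \\ (y_t^\lambda - 1)/\lambda & \text{otherwise} \end{cases}\]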
lambda <- BoxCox.lambda(y)  # automatically chooses a value for lambda
plot(BoxCox(y, lambda))
Stationarity
- for all $s$, the distribution of $(y_t, \dots, y_{t+s})$ does not depend on $t$
- roughly horizontal
- constant variance
- no patterns predictable in the long term
To identify a non-stationary series:
- time plot
- the ACF of stationary data drops to zero quite quickly
- the ACF of non-stationary data decreases slowly
- $r_1$ is often large and positive for non-stationary data
Random walk model
\[y_t = y_{t-1} + e_t\]
Random walk with drift model:
\[y_t = c + y_{t-1} + e_t\]
$c$ is the average change between consecutive observations.
- In practice, it is almost never necessary to go beyond second-order differences.
Seasonal differencing
\[y'_t = y_t - y_{t-m}\]
For monthly data, $m = 12$; for quarterly data, $m = 4$.
plot(diff(data, 12))
automated differencing
ndiffs(x)
nsdiffs(x)
Non-seasonal ARIMA models
Autoregressive model (AR)
\[y_t = c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t\]
A multiple regression with lagged values of $y_t$ as predictors.
AR(1)
\[y_t = c + \phi_1 y_{t-1} + \epsilon_t\]
- when $\phi_1 = 0$, equivalent to white noise
- when $\phi_1 = 1$ and $c = 0$, equivalent to a random walk
- when $\phi_1 = 1$ and $c \neq 0$, equivalent to a random walk with drift
- when $\phi_1 < 0$, $y_t$ oscillates between positive and negative values
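These cases are easy to visualize by simulation (a sketch; arima.sim requires a stationary model, so the random walk case uses cumsum instead):
set.seed(42)
par(mfrow = c(1, 3))
plot(arima.sim(list(ar = 0.8), n = 100), main = "phi = 0.8", ylab = "y")
plot(arima.sim(list(ar = -0.8), n = 100), main = "phi = -0.8 (oscillates)", ylab = "y")
plot(ts(cumsum(rnorm(100))), main = "phi = 1 (random walk)", ylab = "y")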
Moving Average (MA) models
\[y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}\]
ARIMA models
\begin{eqnarray}
y_t & = & c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} \\
& & {}+ \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q} + \epsilon_t
\end{eqnarray}
- include lagged values and lagged errors
- used for a huge range of stationary time series
- short-term dynamics
ARIMA(p, d, q)
AR: p = order of the autoregressive part
I: d = degree of first differencing involved
MA: q = order of the moving average part
- White noise: ARIMA(0,0,0)
- RW: ARIMA(0,1,0) with no constant
- RW with drift : ARIMA(0,1,0) with constant
- AR(p): ARIMA(p,0,0)
- MA(q): ARIMA(0,0,q)
Code
> fit <- auto.arima(usconsumption[,1], seasonal=FALSE)
ARIMA(0,0,3) with non-zero mean
Coefficients:
ma1 ma2 ma3 intercept
0.2542 0.2260 0.2695 0.7562
s.e. 0.0767 0.0779 0.0692 0.0844
sigma^2 estimated as 0.3856: log likelihood=-154.73
AIC=319.46 AICc=319.84 BIC=334.96
\[y_t = 0.756 + \epsilon_t + 0.254 \epsilon_{t-1} + 0.226 \epsilon_{t-2} + 0.269 \epsilon_{t-3}\]
MLE
- Estimate $c, \phi_1, …, \phi_p, \theta_1, …, \theta_q$
- MLE is similar to least squares
- non-linear optimization
- different software will give different estimates
Step
Hyndman and Khandakar (JSS, 2008):
- choose d via unit root tests
- choose p, q by minimising the AICc
- stepwise search to traverse the model space
Step 1: Select the current model (with smallest AICc) from:
ARIMA(2, d, 2)
ARIMA(0, d, 0)
ARIMA(1, d, 0)
ARIMA(0, d, 1)
Step 2: Consider variations of the current model:
- vary one of p, q from the current model by ±1
- vary both p and q from the current model by ±1
- include/exclude c from the current model
The model with the lowest AICc becomes the current model. Repeat Step 2 until no lower AICc can be found.
Prediction intervals
- Prediction intervals increase in size with forecast horizon.
- Prediction intervals can be difficult to calculate by hand
- Calculations assume residuals are uncorrelated and normally distributed. Prediction intervals tend to be too narrow.
- the uncertainty in the parameter estimates has not been accounted for.
- the ARIMA model assumes historical patterns will not change during the forecast period.
- the ARIMA model assumes uncorrelated future errors
Backshift notation
- First difference: $(1 - B) y_t$
- Double difference: $(1 - B)^2 y_t$
- $d$th-order difference: $(1 - B)^d y_t$
- Seasonal difference: $(1 - B^m) y_t$
ARIMA model
ARIMA(1,1,1):
\[(1 - \phi_1 B)(1 - B) y_t = c + (1 + \theta_1 B) \epsilon_t\]
The factors are, in order: AR(1), first difference, MA(1).
Seasonal ARIMA models:
ARIMA(p,d,q)(P,D,Q)[m]
(p,d,q): non-seasonal part; (P,D,Q): seasonal part; m: number of periods per season.
> auto.arima(euretail)
ARIMA(1,1,1)(0,1,1)[4]
Coefficients:
ar1 ma1 sma1
0.8828 -0.5208 -0.9704
s.e. 0.1424 0.1755 0.6792
sigma^2 estimated as 0.1411: log likelihood=-30.19
AIC=68.37 AICc=69.11 BIC=76.68
auto.arima(euretail, stepwise=FALSE,
approximation=FALSE)
fit <- auto.arima(h02, lambda=0)
> fit
ARIMA(2,1,3)(0,1,1)[12]
Box Cox transformation: lambda= 0
Coefficients:
ar1 ar2 ma1 ma2 ma3 sma1
-1.0194 -0.8351 0.1717 0.2578 -0.4206 -0.6528
s.e. 0.1648 0.1203 0.2079 0.1177 0.1060 0.0657
sigma^2 estimated as 0.004071: log likelihood=250.8
AIC=-487.6 AICc=-486.99 BIC=-464.83
- It is a myth that ARIMA models are more general than exponential smoothing.
- Linear exponential smoothing models are all special cases of ARIMA models.
- Non-linear exponential smoothing models have no equivalent ARIMA counterparts.
- Many ARIMA models have no exponential smoothing counterparts.
- ETS models are all non-stationary. Models with seasonality or non-damped trend (or both) have two unit roots; all other models have one unit root.
Equivalences
- Simple exponential smoothing
  - Forecasts equivalent to ARIMA(0,1,1).
  - Parameters: $\theta_1 = \alpha - 1$.
- Holt's method
  - Forecasts equivalent to ARIMA(0,2,2).
  - Parameters: $\theta_1 = \alpha + \beta - 2$ and $\theta_2 = 1 - \alpha$.
- Damped Holt's method
  - Forecasts equivalent to ARIMA(1,1,2).
  - Parameters: $\phi_1 = \phi$, $\theta_1 = \alpha + \phi\beta - 2$, $\theta_2 = (1 - \alpha)\phi$.
- Holt-Winters' additive method
  - Forecasts equivalent to ARIMA(0,1,m+1)(0,1,0)[m].
  - Parameter restrictions apply because the ARIMA model has m + 1 parameters whereas HW uses only three.
- Holt-Winters' multiplicative method
  - No ARIMA equivalent.
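The first equivalence is easy to check numerically (a sketch; the estimates differ slightly because ets maximizes a different likelihood than Arima):
fit_ets <- ets(oildata, model = "ANN")        # simple exponential smoothing
fit_arima <- Arima(oildata, order = c(0, 1, 1))
coef(fit_ets)["alpha"] - 1                    # should be close to ...
coef(fit_arima)["ma1"]                        # ... theta_1 = alpha - 1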
Dynamic Regression
Fourier
fit <- auto.arima(y, seasonal=FALSE,
xreg=fourier(y, K=3))
Forecast a regression model with ARIMA errors
- forecast the regression part
- forecast the ARIMA part
- combine the two (see the sketch after this list)
- when future predictors are unknown, they need to be forecast first
- forecasts of macroeconomic variables may be obtained from national statistical offices
- separate forecasting models may be needed for other explanatory variables
- some explanatory variables are known into the future (e.g., time, dummies)
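A sketch of this workflow for the Fourier model above: future regressor values are generated and passed to forecast via xreg (recent versions of the forecast package accept an h argument in fourier; older versions use fourierf):
fit <- auto.arima(y, seasonal = FALSE, xreg = fourier(y, K = 3))
fc <- forecast(fit, xreg = fourier(y, K = 3, h = 24))  # 24 future harmonic values
plot(fc)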
Cross-validation
- standard cross-validation does not work directly for time series, because we cannot use future observations to build the model
Example
- Linear model with trend and seasonal dummies applied to log data
- ARIMA model applied to log data
- ETS model applied to original data
Which model is best?
- set k = 48 as the minimum training set size
- forecast 12 steps ahead
- compare the MAE for each forecast horizon
Code:
k <- 48
n <- length(a10)
mae1 <- mae2 <- mae3 <- matrix(NA,n-k-12,12)
for(i in 1:(n-k-12))
{
xshort <- window(a10,end=1995+(5+i)/12)
xnext <- window(a10,start=1995+(6+i)/12,end=1996+(5+i)/12)
fit1 <- tslm(xshort ~ trend + season, lambda=0)
fcast1 <- forecast(fit1,h=12)
fit2 <- auto.arima(xshort,D=1, lambda=0)
fcast2 <- forecast(fit2,h=12)
fit3 <- ets(xshort)
fcast3 <- forecast(fit3,h=12)
mae1[i,] <- abs(fcast1[["mean"]]-xnext)
mae2[i,] <- abs(fcast2[["mean"]]-xnext)
mae3[i,] <- abs(fcast3[["mean"]]-xnext)
}
plot(1:12,colMeans(mae1),type="l",col=2,xlab="horizon",ylab="MAE",
ylim=c(0.58,1.0))
lines(1:12,colMeans(mae2),type="l",col=3)
lines(1:12,colMeans(mae3),type="l",col=4)
legend("topleft",legend=c("LM","ARIMA","ETS"),col=2:4,lty=1)
Keep training window of fixed length:
xshort <- window(a10,start=i+1/12,end=1995+(5+i)/12)
Compute one-step forecasts in out-of-sample period:
for(i in 1:(n-k))
{
xshort <- window(a10,end=1995+(5+i)/12)
xlong <- window(a10,start=1995+(6+i)/12)
fit2 <- auto.arima(xshort,D=1, lambda=0)
fit2a <- Arima(xlong,model=fit2)
fit3 <- ets(xshort)
fit3a <- ets(xlong,model=fit3)
mae2a[i,] <- abs(residuals(fit2a))  # one-step ARIMA errors on xlong
mae3a[i,] <- abs(residuals(fit3a))  # one-step ETS errors on xlong
}
TBATS model
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- Trend (possibly damped)
- Seasonal (including multiple and non-integer periods)
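Basic usage (a sketch; taylor is half-hourly electricity demand with multiple seasonal periods, shipped with the forecast package):
fit <- tbats(taylor)          # selects Box-Cox, trend, ARMA and seasonal components automatically
fc <- forecast(fit, h = 336)  # one week ahead at half-hourly resolution
plot(fc)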
See Dropbox/Documents/forecast/12-Cross-validation.pdf for more detail.