ARIMA analysis using R


Hey guys, this is a small article in which I talk about ARIMA modelling in R. The method is simple and straightforward, so let's start off. First, we need some data for the modelling. Head over to the link to get some time series data.

The data that we are using here is time series data, meaning it is recorded at equal intervals of time.

Now that you have the data, let's get to work. First, let's start off by loading some libraries. The required libraries are

  • forecast
  • tseries

Below is the R code for loading the libraries.

require(forecast)
require(tseries)

Now we need to assign the data to a variable. We do this as follows.

y<-read.csv(file.choose(), header=T)

The data is now stored in a variable called y. Next, let's turn it into a time series object. This can be done using the ts command in R.

yobs<-ts(y$Obs, start=c(2003,1), frequency=12)

Here I am setting the frequency to 12 since the data is monthly, starting in January 2003.
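As a quick illustration of what ts does, here is a minimal sketch on made-up numbers (y_sim is simulated data standing in for the real series):

```r
set.seed(1)
# 36 made-up monthly observations starting January 2003
y_sim <- ts(rnorm(36, mean = 100, sd = 5), start = c(2003, 1), frequency = 12)

start(y_sim)       # first observation: year 2003, period (month) 1
frequency(y_sim)   # 12 observations per year
```

The start and frequency attributes are what let later functions like stl and forecast understand the seasonal structure.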

Let’s plot the data now to see if there are any abnormalities in it.

plot(yobs)

We need to remove the trend, and the best way to do this is by differencing the data. Differencing is done with the diff command.

plot(diff(yobs))


Looking at the plot above, the variance is not constant over time. Taking the logarithm of the data is a good way to stabilise the variance.

plot(log10(yobs))
plot(diff(log10(yobs)))

To check whether the series is stationary, we conduct the Augmented Dickey-Fuller test.

adf.test(log10(yobs))

Augmented Dickey-Fuller Test

data:  log10(yobs)
Dickey-Fuller = -4.7137, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
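If the test had not rejected the null (i.e. a large p-value), the series would need differencing before modelling. The forecast package's ndiffs function estimates how many differences are required; here is a quick sketch on a simulated random walk (x is made-up data, not the series above):

```r
require(forecast)

set.seed(1)
x <- cumsum(rnorm(100))   # a random walk, which is non-stationary

ndiffs(x)                 # estimated number of differences needed (typically 1 here)
```

You would then model diff(x, differences = ndiffs(x)), or let auto.arima pick the d order itself.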

The series is stationary (p-value = 0.01, so we reject the null hypothesis of non-stationarity), and we can proceed with the ARIMA modelling. The auto.arima command selects the AR and MA orders automatically.

# Conducting the ARIMA modelling
a1 <- auto.arima(log10(yobs), approximation = FALSE, trace = FALSE)
# ARIMA modelling done!

summary(a1)
Series: log10(yobs)
ARIMA(1,0,1)(0,1,1)[12] with drift

Coefficients:
         ar1      ma1     sma1   drift
      0.8213  -0.2572  -0.6454  0.0048
s.e.  0.0902   0.1508   0.1124  0.0003

sigma^2 estimated as 0.0002722:  log likelihood=224.21
AIC=-438.43   AICc=-437.66   BIC=-426.27

Training set error measures:
                       ME       RMSE        MAE        MPE      MAPE      MASE        ACF1
Training set 0.0004781441 0.01506067 0.01091572 0.01796991 0.4745754 0.1828288 -0.02601957
# Checking for independence of the residuals
Box.test(a1$residuals, lag = 12, type = "Ljung-Box")
# A large p-value means we fail to reject the null hypothesis of independence,
# so the residuals look like white noise and the model is a success!

Box-Ljung test

data:  a1$residuals
X-squared = 4.8448, df = 12, p-value = 0.963

Forecasting for 12 months

f <- forecast(a1, h = 12)
plot(f)
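One thing to keep in mind: because the model was fitted to log10(yobs), the forecasts in f$mean and the prediction intervals are on the log10 scale, so you raise 10 to them to get back to the original units. A minimal self-contained sketch with simulated data (y_sim is made up, standing in for the real series):

```r
require(forecast)

set.seed(1)
# Simulated positive-valued monthly series (stand-in for the real data)
y_sim <- ts(100 * exp(cumsum(rnorm(60, 0.01, 0.05))),
            start = c(2003, 1), frequency = 12)

fit   <- auto.arima(log10(y_sim))
f_sim <- forecast(fit, h = 12)

back  <- 10^f_sim$mean            # point forecasts in original units
upper <- 10^f_sim$upper[, "95%"]  # upper 95% interval, original units
lower <- 10^f_sim$lower[, "95%"]  # lower 95% interval, original units
```

The same 10^ transformation applies to the f object in this article.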

Decomposing the data

h1 <- stl(yobs, s.window = "periodic")
plot(h1)
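Beyond the plot, the fitted components can be pulled out of the stl object through its time.series slot, which holds three columns: seasonal, trend and remainder. A small sketch on simulated data (y_sim is made up):

```r
set.seed(1)
# Four years of monthly data with an obvious seasonal cycle
y_sim <- ts(10 + rep(sin(2 * pi * (1:12) / 12), 4) + rnorm(48, sd = 0.2),
            start = c(2003, 1), frequency = 12)

h_sim <- stl(y_sim, s.window = "periodic")

head(h_sim$time.series)                          # seasonal, trend, remainder
seasonal_part <- h_sim$time.series[, "seasonal"] # just the seasonal component
```

This is handy if, for example, you want to seasonally adjust the series by subtracting the seasonal column.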
