ARIMA analysis using R

Hey guys, this is a small article wherein I talk about ARIMA modelling in R. The method is simple and straightforward, so let's start off. First, we need some data for the modelling. Head over to the link to get some time series data.

The data we are using here is time series data, meaning it is recorded at equal intervals of time.

Now, since you have the data, let's get to work. First, let's start off by loading some libraries. The required libraries are:

• forecast
• tseries

```
require(forecast)
require(tseries)
```
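If the packages are not installed yet, a minimal sketch for pulling them in first (assumes an internet connection; the package names are the same ones loaded above):

```r
# Install the packages first if they are missing, then load them
if (!require(forecast)) install.packages("forecast")
if (!require(tseries)) install.packages("tseries")
```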

Now we need to assign the data that we have to a variable. We do this by:

`y <- read.csv(file.choose(), header = TRUE)`

The data is stored in a variable called y. Now let's turn this into a time series object. This can be done by using the ts() command in R.

`yobs <- ts(y$Obs, start=c(2003,1), frequency=12)`

Here I am setting the frequency to 12 since the data is monthly and starts in January 2003.
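To make the start/frequency idea concrete, here is a tiny sketch with made-up numbers (the values are hypothetical, not from the dataset):

```r
# 24 made-up monthly observations starting January 2003
x <- ts(rnorm(24), start = c(2003, 1), frequency = 12)
start(x)      # c(2003, 1)
frequency(x)  # 12
```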

Let’s plot the data now to see if there are any abnormalities in it.

`plot(yobs)`

We need to remove the trend, and the best way to do this is by differencing the data. To difference the data, the diff() command is used.

`plot(diff(yobs))`

Looking at the plot above, the variance is not constant over time. To stabilise the variance, taking the logarithm of the data is a good method.

```
plot(log10(yobs))
plot(diff(log10(yobs)))
```

To check whether the series is stationary, we conduct the Augmented Dickey-Fuller test:

`adf.test(log10(yobs))`

Augmented Dickey-Fuller Test

```
data:  log10(yobs)
Dickey-Fuller = -4.7137, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
```
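As a cross-check on how much differencing is needed, the forecast package provides ndiffs(); a quick sketch (the exact answer depends on the unit-root test it applies, so I am not asserting a value here):

```r
# ndiffs() estimates the number of differences required for stationarity
ndiffs(log10(yobs))
```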

With a p-value of 0.01 we reject the null hypothesis of a unit root, so the series is stationary and we can now do the ARIMA modelling. Using the auto.arima() command, the AR and MA orders are obtained automatically.

```
# Fitting the ARIMA model
a1 <- auto.arima(log10(yobs), approximation=FALSE, trace=FALSE)
# ARIMA modelling done!

summary(a1)
```
```
Series: log10(yobs)
ARIMA(1,0,1)(0,1,1)[12] with drift

Coefficients:
         ar1      ma1     sma1   drift
      0.8213  -0.2572  -0.6454  0.0048
s.e.  0.0902   0.1508   0.1124  0.0003

sigma^2 estimated as 0.0002722:  log likelihood=224.21
AIC=-438.43   AICc=-437.66   BIC=-426.27

Training set error measures:
                       ME       RMSE        MAE        MPE      MAPE      MASE        ACF1
Training set 0.0004781441 0.01506067 0.01091572 0.01796991 0.4745754 0.1828288 -0.02601957
```

```
# Checking for independence in the residuals of the time series
Box.test(a1$residuals, lag=12, type = "Ljung-Box")
# The p-value is large, so we fail to reject the null hypothesis and the model is a success!
```

Box-Ljung test

```
data:  a1$residuals
X-squared = 4.8448, df = 12, p-value = 0.963
```
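The forecast package also bundles these residual diagnostics into a single call, checkresiduals(); a sketch using the a1 model fitted above:

```r
# Plots the residuals, their ACF and a histogram, and runs a Ljung-Box test
checkresiduals(a1)
```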

Forecasting for 12 months

```
f <- forecast(a1, h=12)
plot(f)
```
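One thing to remember: the model was fitted on log10(yobs), so the forecasts come out on the log scale. To read them in the original units, back-transform by raising 10 to the forecast values; a sketch using the f object from above:

```r
# Point forecasts and prediction intervals are on the log10 scale,
# so undo the transform with 10^x
10^f$mean   # point forecasts in original units
10^f$lower  # lower prediction bounds
10^f$upper  # upper prediction bounds
```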

Decomposing the data

```
h1 <- stl(yobs, s.window="periodic")
plot(h1)
```
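If you want the individual components rather than just the plot, stl() stores them in the time.series matrix of its result; a sketch using the h1 object from above:

```r
# The seasonal, trend and remainder components are columns of h1$time.series
seasonal_part <- h1$time.series[, "seasonal"]
trend_part    <- h1$time.series[, "trend"]
head(seasonal_part)
```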