Electronic Journal of Polish Agricultural Universities
Volume 17, Issue 4
Economics
Available Online: http://www.ejpau.media.pl/volume17/issue4/art-03.html
MODELING AND FORECASTING RYE PRICE USING ARIMA MODEL
Małgorzata Szczepanik, Dorota Domagała
Department of Applied Mathematics and Computer Science, University of Life Sciences, Lublin, Poland
The study presents a model of the price of rye for human consumption in Poland in the period from August 2010 to October 2013. ARIMA techniques are used to analyze the time series. An ARIMA(1,1,1) model with intervention was fitted and tested using the Box-Jenkins methodology (identification, estimation and diagnostics). The data were collected from the website of the Polish Ministry of Agriculture and Rural Development. Diagnostics, calculations and forecasting were conducted using the STATISTICA package.
Key words: time series analysis, ARIMA model, intervention, modeling, rye price.
INTRODUCTION
In 2011 Poland was the second biggest producer of rye in the world, with a market share of about 19.7%, after Russia (22.6%) (according to the data from the website http://stat.gov.pl/statystyka-miedzynarodowa/porownania-miedzynarodowe/tablice-o-krajach-wedlug-tematow/rolnictwo-i-rybolowstwo/). Over the past years rye has been a very important feed component for livestock. Rye flour is also the basis for many bakery products. Rye bread and rolls are popular because of their good taste. In addition, rye contains a high amount of dietary fibre and is a source of mineral nutrients and B vitamins.
In recent years the prices of rye and other cereals have changed very dynamically. In the analysed period, Aug 2010 to Oct 2013, the minimum and maximum prices of rye for human consumption in Poland were 373 PLN per ton (Aug 2010) and 921 PLN per ton (May 2012), respectively (Fig. 1). The aim of the study is to model and forecast human-consumption rye prices using ARIMA techniques.
Finding an appropriate ARIMA model for forecasting is a research objective in many applications of time series, for example in tourism [3], electricity prices [2], vehicular traffic flow [9], electricity demand [8] or finances [4].
DATA
Weekly data on the consumer rye price in Polish companies in the period from August 2010 to October 2013 are examined. The data were obtained from the website of the Polish Ministry of Agriculture and Rural Development, from the News Flashes of Cereals Market at http://www.minrol.gov.pl/pol/Rynki-rolne/Zintegrowany-System-Rolniczej-Informacji-Rynkowej/Biuletyny-Informacyjne/Rynek-zboz. The analyzed weekly rye prices form the time series. One of the methods used in analyzing time series is the ARIMA technique.
MATERIALS
The ARMA(p, q) (autoregressive moving average) model of a time series refers to the model with p autoregressive terms and q moving average terms:
X_{t} = φ_{1} X_{t–1} + … + φ_{p} X_{t–p} + ε_{t} – θ_{1} ε_{t–1} – … – θ_{q} ε_{t–q}
where X_{t} is the value of the series at time t, φ_{1}, …, φ_{p} are the autoregressive parameters, θ_{1}, …, θ_{q} are the moving average parameters, and the error terms ε_{t} are assumed to be normally distributed with mean zero and variance σ^{2}. Moreover, the error terms should be white noise, i.e. independent and identically distributed, so that the covariance Cov(ε_{t}, ε_{t–k}) is zero for all k > 0.
When p or q equals 0, the corresponding autoregressive or moving average part is not needed in the model.
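To make the definition concrete, the following minimal sketch simulates an ARMA(1,1) series. It assumes the Box-Jenkins sign convention in which the moving average term enters with a minus sign, as in model (1) later in the paper. The function name, the start-up rule X_0 = ε_0 and the parameter values are illustrative assumptions, not part of the study.

```python
import random


def simulate_arma11(phi, theta, n, sigma=1.0, seed=0):
    """Simulate X_t = phi*X_{t-1} + e_t - theta*e_{t-1} with Gaussian white noise."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n)]
    # Assumed start-up: pre-sample values are zero, so X_0 = e_0.
    x = [eps[0]]
    for t in range(1, n):
        x.append(phi * x[t - 1] + eps[t] - theta * eps[t - 1])
    return x, eps


# Illustrative parameter values only (not estimates from the rye data).
series, noise = simulate_arma11(phi=0.9, theta=0.8, n=200)
```

Setting phi or theta to 0 in this sketch removes the autoregressive or the moving average part, respectively.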
ARMA techniques can be used to model stationary time series. A stationary process has the property that the mean, variance and autocorrelation structure do not change over time. Frequently, in practice, a time series has a trend and is not stationary. If the mean is not constant, in some cases the trend may be removed by differencing once, Δ X_{t} = X_{t} – X_{t–1}, or twice, Δ^{2} X_{t} = ΔΔ X_{t} = Δ (X_{t} – X_{t–1}). A process X_{t} is said to be difference stationary if Δ X_{t} is stationary. A difference stationary process is also called integrated of order 1 and denoted by I(1).
In the autoregressive integrated moving average model, ARIMA(p,d,q), d indicates the order of differencing. The value of d should be determined before p and q. If the series is stationary, d = 0 and the ARIMA(p,0,q) model reduces to the ARMA(p,q) model. ARIMA (ARMA) models are also called Box-Jenkins models [1].
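The second difference can be written out explicitly by applying the definition of Δ twice. The short series 3, 5, 8 used below is a made-up example, not data from the study.

```latex
\Delta^{2} X_{t} = \Delta X_{t} - \Delta X_{t-1}
                 = (X_{t} - X_{t-1}) - (X_{t-1} - X_{t-2})
                 = X_{t} - 2X_{t-1} + X_{t-2}

% Example: for the values 3, 5, 8 the first differences are 2 and 3,
% so the second difference is 3 - 2 = 1, which agrees with 8 - 2 \cdot 5 + 3 = 1.
```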
Differencing can also be applied at a seasonal lag and the relevant model is called SARIMA (seasonal ARIMA) model [9, 7].
There are three primary stages in building a Box-Jenkins time series model:
1. Model identification
The series is first examined for stationarity. If the series is non-stationary, the next step is to transform the data to stabilize the variance or to difference the data. Finally, potential values of p and q should be identified. The graph of the ACF (the sample autocorrelation function) and the graph of the PACF (the sample partial autocorrelation function) can be used for this purpose. Usually, the required orders p and q are less than or equal to 3.
2. Model estimation
The model parameters are estimated.
3. Diagnostics
The normality and randomness of the residuals are checked. If the model is not valid, steps 1, 2 and 3 have to be repeated.
STATISTICAL ANALYSIS AND RESULTS
Plots and calculations were produced using the STATISTICA package, ver. 10.
According to the Box-Jenkins methodology, the first step is to examine the time series.
Model identification
Figure 1 shows weekly consumer rye prices in Poland from August 2010 to October 2013. There are shifts in both the mean (trend) and the variance over time. First, a natural-log transformation of the data is needed to stabilize the variability. The mean first increases and then decreases over time, as seen in Figure 1. Thus the series should be differenced. Figure 2 shows that the series after the logarithmic transformation and one differencing (d = 1) is stationary (the amplitude of changes is quite stable and there is no trend).
Fig. 1. Rye prices (PLN per ton) by week in Polish companies from August 2010 to October 2013
Fig. 2. Differenced log price
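The transformation behind Figure 2 (natural logarithm followed by one differencing) can be sketched as follows. The function name and the first two prices are hypothetical. Only the last two prices, 561.69 and 433.83 PLN per ton, are taken from the text.

```python
import math


def log_diff(prices):
    """Return the first differences of the natural-log prices."""
    logs = [math.log(p) for p in prices]
    return [logs[t] - logs[t - 1] for t in range(1, len(logs))]


# Hypothetical weekly prices (PLN per ton); only the last two appear in the paper.
prices = [600.0, 590.0, 561.69, 433.83]
changes = log_diff(prices)
print([round(c, 4) for c in changes])
```

Each value is approximately the relative week-to-week change, which is why the amplitude of the differenced log series is more stable than that of the raw prices.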
Moreover, Figure 1 shows a very considerable reduction of the rye price in the last week of July 2013. The price declined from 561.69 to 433.83 PLN per ton within a single week (observation number 157). It is therefore reasonable to include an intervention effect in the model (abrupt and permanent, since the price did not return to the previous level by October 2013). Intervention analysis begins with identification, estimation and diagnosis of the observations before the moment of intervention [3, 7, 5].
Now the pattern of the ACF (the sample autocorrelation function) and PACF (the sample partial autocorrelation function) plots should be examined. Both autocorrelations and partial autocorrelations are computed for sequential lags in the series. The first lag has the autocorrelation (partial autocorrelation) between Δ log X_{t–1} and Δ log X_{t}, the second lag has the autocorrelation (partial autocorrelation) between Δ log X_{t–2} and Δ log X_{t}, etc. The ACF and PACF plots in Figure 3 show an ARIMA(p,d,q) pattern with p > 0 and q > 0, because both the ACF and the PACF decline and there are bigger peaks at lags 1, 2 and 3 [7]. Therefore, it is reasonable to start with p = 1 and q = 1 (with one autoregressive and one moving average parameter to estimate).
Fig. 3. ACF and PACF plots of the first difference of log(rye price) before intervention
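The quantities plotted in Figure 3 can be computed as in the sketch below. It assumes the usual sample autocorrelation estimator and obtains the partial autocorrelations with the Durbin-Levinson recursion. The function names are ours, and STATISTICA may differ in implementation details.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r_1..r_K (Durbin-Levinson)."""
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} of the previous step
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            cur = [phi_kk]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf
```

For a series of length n, values outside roughly ±1.96/√n are commonly read as significant peaks when inspecting such plots.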
Model Estimation
The parameters of the ARIMA(1,1,1) model without constant are estimated with Melard’s (exact) method [6]. We obtain φ = 0.949 and θ = 0.813. Both parameters are statistically significant.
Normally several ARIMA(p,d,q) models are taken into consideration. In the study other models were also checked, including ARIMA(1,1,0), ARIMA(0,1,1) and ARIMA(2,1,0). However, the estimated parameters were insignificant or the models were not valid.
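The idea of the estimation step can be illustrated with a much simpler criterion than the one used in the study. The sketch below computes one-step residuals of an ARMA(1,1) model for the differenced log series and searches a coarse grid for the pair (φ, θ) with the smallest residual sum of squares. This conditional sum of squares approach, the zero start-up values, the grid step and the sample data are all assumptions made for illustration. It is not Melard's exact likelihood algorithm and would not be expected to reproduce the STATISTICA estimates.

```python
def css_residuals(w, phi, theta):
    """One-step residuals for w_t = phi*w_{t-1} + e_t - theta*e_{t-1}.

    Pre-sample values are assumed to be zero, so e_0 = w_0.
    """
    e = [w[0]]
    for t in range(1, len(w)):
        e.append(w[t] - phi * w[t - 1] + theta * e[t - 1])
    return e


def css_grid_search(w, step=0.05):
    """Coarse grid search over (phi, theta) in (-1, 1) minimising the residual sum of squares."""
    n_steps = int(round(2.0 / step))
    grid = [round(-1.0 + step * i, 10) for i in range(1, n_steps)]
    best = None
    for phi in grid:
        for theta in grid:
            sse = sum(r * r for r in css_residuals(w, phi, theta))
            if best is None or sse < best[0]:
                best = (sse, phi, theta)
    return best[1], best[2], best[0]


# Hypothetical differenced log prices, for illustration only.
w = [0.010, -0.004, 0.007, 0.012, -0.009, 0.003, 0.006, -0.002]
phi_hat, theta_hat, sse = css_grid_search(w)
```

Comparing the minimised sum of squares across candidate orders gives a rough counterpart to checking several ARIMA(p,d,q) models, although significance tests of the parameters require standard errors that this sketch does not provide.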
Model Validation
To examine the normality and randomness of the residuals, the histogram and the normal density plot are shown in Figure 4. The appearance of the residuals indicates that the ARIMA(1,1,1) model fits the data reasonably well. Moreover, the ACF and PACF of the residuals in Figure 5 indicate that the residuals are independent.
Fig. 4. Histogram of residuals after fitting an ARIMA(1,1,1) model
Fig. 5. ACF and PACF of the residuals after fitting the ARIMA(1,1,1) model
Now the ARIMA(1,1,1) model can be applied to the entire series, including the observations after the intervention. The parameters, asymptotic standard errors and variances obtained after re-estimating (in the STATISTICA package with Melard’s algorithm) the ARIMA(1,1,1) model with an abrupt and permanent intervention at the 157^{th} observation are presented in Table 1. The additional parameter is connected with the intervention. All parameters are statistically significant.
Table 1. The estimation output of ARIMA(1,1,1) model with intervention
We should again examine the normality and randomness of the residuals. Figures 6 and 7 show the histogram of the residuals and the ACF and PACF of the residuals, respectively.
Fig. 6. Histogram of residuals after fitting an ARIMA(1,1,1) model with intervention into entire dataset
Fig. 7. ACF and PACF of the residuals after fitting the ARIMA(1,1,1) model with intervention into entire dataset
The estimated model is
log X_{t} = log X_{t–1} + 0.884 (log X_{t–1} – log X_{t–2}) – 0.727 ε_{t–1} – 0.247 I    (1)
where X_{t} denotes the rye price in the t-th week and I is a dummy variable representing the abrupt and permanent intervention: I = 0 for t < 157 and I = 1 for t ≥ 157.
The intervention parameter in the model (at the 157^{th} week) reflects a considerable abrupt fall of the rye price, which occurred in the Polish (and also the global) cereal market after the 2013 crop season.
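The size of the intervention coefficient can be compared with the observed price drop. The sketch below assumes that the coefficient –0.247 acts as a one-off shift in the log price, so that exp(–0.247) – 1 gives the implied relative change. This reading is our assumption, and the variable names are illustrative.

```python
import math

omega = -0.247                  # intervention coefficient from model (1)
before, after = 561.69, 433.83  # prices around observation 157 (PLN per ton)

# Relative change implied by a shift of omega in the log price (assumed interpretation).
implied_change = math.exp(omega) - 1.0
# Relative and logarithmic change actually observed in the data quoted in the text.
observed_change = after / before - 1.0
observed_log_change = math.log(after / before)

print(f"implied:  {implied_change:.1%}")
print(f"observed: {observed_change:.1%} (log change {observed_log_change:.3f})")
```

Under this reading, the implied fall of about 22% is close to the observed one-week fall, which is consistent with treating the drop as an abrupt and permanent intervention.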
We can use model (1) to make a short-term forecast. The forecast and the 90% prediction intervals of rye prices for the next seven weeks are presented in Table 2. As can be seen, the prediction intervals are rather wide. The original series and the forecast are plotted in Figure 8.
Table 2. The forecast (PLN per ton) and 90% confidence prediction interval of consumer rye price for next seven weeks
Fig. 8. The forecast (PLN per ton) of consumer rye price
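Point forecasts from a model of the form (1) can be obtained recursively, as in the following sketch. It assumes that future error terms are replaced by zero and that the intervention term no longer changes the week-to-week differences once the step has occurred, so it is left out of the recursion. The function name, the last two prices and the last residual are hypothetical. Only φ = 0.884 and θ = 0.727 are taken from model (1).

```python
import math


def forecast_prices(log_last, log_prev, last_resid, phi, theta, steps):
    """Point forecasts of the price from an ARIMA(1,1,1) fitted to log prices.

    Uses log X_t = log X_{t-1} + phi*(log X_{t-1} - log X_{t-2}) - theta*e_{t-1},
    with future shocks replaced by zero.
    """
    # Forecast of the next change in the log price.
    diff = phi * (log_last - log_prev) - theta * last_resid
    level = log_last
    forecasts = []
    for _ in range(steps):
        level += diff
        forecasts.append(math.exp(level))  # back-transform to PLN per ton
        diff *= phi                        # later changes decay geometrically
    return forecasts


# Hypothetical inputs: last two prices 440 and 435 PLN per ton, last residual 0.01.
path = forecast_prices(math.log(440.0), math.log(435.0), 0.01, 0.884, 0.727, steps=7)
```

Because the model is fitted to log prices, prediction limits computed on the log scale become asymmetric after back-transformation with the exponential function.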
The quality of the forecast can be assessed by calculating the mean absolute percentage error (MAPE). We obtain MAPE = 3.35%.
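For reference, MAPE is the average of the absolute forecast errors expressed as a percentage of the actual values. The sketch below uses hypothetical prices, not the values behind the reported 3.35%.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical actual and forecast prices (PLN per ton).
actual = [400.0, 500.0]
forecast = [380.0, 525.0]
print(round(mape(actual, forecast), 2))
```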
CONCLUSIONS
We estimated an ARIMA(1,1,1) model with intervention, in the form (1), for the consumer rye price in Polish companies in the period from August 2010 to October 2013. This required analysing the time series before the intervention according to the Box-Jenkins methodology and then verifying the model again after including all the observations. The low value of MAPE shows the usefulness of model (1) in forecasting.
REFERENCES
1. Box G.E.P., Jenkins G.M., 1970. Time series analysis: Forecasting and control. San Francisco, Holden-Day.
2. Contreras J., Espínola R., Nogales F.J., Conejo A.J., 2003. ARIMA Models to Predict Next-Day Electricity Prices. IEEE Trans. Power Syst., 18 (3), 1014–1020.
3. Goh C., Law R., 2002. Modeling and forecasting tourism demand for arrivals with stochastic nonstationary seasonality and intervention. Tourism Management, 23 (5), 499–510.
4. Hussein Z., 2011. Modelling and Forecasting Volatility Using ARIMA Model. European Journal of Economics, Finance and Administrative Sciences, 3, 109–125.
5. Lam C.Y., Ip W.H., Lau C.W., 2009. A business process activity model and performance measurement using a time series ARIMA intervention analysis. Expert Systems with Applications, 36 (3), 6986–6994.
6. Melard G., 1984. A fast algorithm for the exact likelihood of autoregressive-moving average models. Applied Statistics, 33 (1), 104–119.
7. Tabachnick B.G., Fidell L.S., 2006. Using Multivariate Statistics. 5th ed. Pearson Education.
8. Trojanowska M., Małopolski J., 2005. Forecasting of electricity demand in rural areas. Part II. Comparison of applicability of ARIMA and Takagi-Sugeno models. Electronic Journal of Polish Agricultural Universities, 8 (4), #26.
9. Williams B.M., Hoel L.A., 2003. Modeling and Forecasting Vehicular Traffic Flow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results. Journal of Transportation Engineering, 129 (6), 664–672.
Accepted for print: 6.11.2014
Małgorzata Szczepanik
Department of Applied Mathematics and Computer Science, University of Life Sciences, Lublin, Poland
Głęboka 28
20-950 Lublin
Poland
email: malgorzata.szczepanik@up.lublin.pl
Dorota Domagała
Department of Applied Mathematics and Computer Science, University of Life Sciences, Lublin, Poland
Głęboka 28
20-950 Lublin
Poland