
Machine Learning Approach for Crude Oil Price Prediction

by Siti Norbaiti Abdullah




Institution: University of Manchester
Department:
Year: 2014
Keywords: crude oil price prediction; oil prediction; oil price prediction; machine learning; sentiment-mining; AI hybrid models; ANN; hierarchical conceptual model; linguistic prediction model; quantitative prediction model
Record ID: 1406700
Full text PDF: http://www.manchester.ac.uk/escholar/uk-ac-man-scw:222502


Abstract

Crude oil prices impact the world economy and are thus of interest to economic experts and politicians. The volatile behaviour of the oil price, which has moulded today’s world economy, society and politics, has motivated and continues to motivate researchers to study it further. This volatility is expected to raise further new and interesting research challenges. In the present research, machine learning and computational intelligence techniques, drawing on historical quantitative data together with linguistic content from online news services, are used to predict crude oil prices via five different models: (1) the Hierarchical Conceptual (HC) model; (2) the Artificial Neural Network-Quantitative (ANN-Q) model; (3) the Linguistic model; (4) the Rule-based Expert model; and, finally, (5) the Hybridisation of Linguistic and Quantitative (LQ) model. First, to understand the behaviour of the crude oil price market, the HC model functions as a platform to retrieve information that explains the behaviour of the market. This information is retrieved from Google News articles using the keyword “Crude oil price”. Through a systematic approach, price data are classified into categories that explain the crude oil price’s level of impact on the market. The price data classification distinguishes crucial behaviour information contained in the articles. These distinguished features are ranked hierarchically according to their level of impact and used as a reference for identifying the numeric data implemented in model (2). Model (2) is developed to validate the features retrieved in model (1). It introduces the Back Propagation Neural Network (BPNN) technique as an alternative to conventional techniques used for forecasting the crude oil market. In model (2), the BPNN technique is shown to produce more accurate and competitive results. Likewise, the features retrieved from model (1) are validated and shown to cause market volatility.
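As a rough illustration of the back-propagation idea behind model (2), the following is a minimal sketch and not the thesis's implementation. Everything in it is an assumption made for illustration: the class name TinyBPNN, the helper make_windows, the single tanh hidden layer, the lag of three, the learning rate, and the synthetic sine series that stands in for real price data.

```python
import math
import random


def make_windows(series, lag):
    """Turn a series into (previous `lag` values, next value) training pairs."""
    return [(series[i:i + lag], series[i + lag]) for i in range(len(series) - lag)]


class TinyBPNN:
    """Hypothetical network: one tanh hidden layer and a linear output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def train_step(self, x, target, lr):
        """One stochastic gradient step on the loss 0.5 * (y - target) ** 2."""
        h = self._hidden(x)
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        err = y - target  # derivative of the loss with respect to the output
        for j, hj in enumerate(h):
            # Error propagated back to hidden unit j (uses w2[j] before its update).
            delta_j = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * delta_j
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_j * xi
        self.b2 -= lr * err


def mse(net, pairs):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((net.predict(x) - t) ** 2 for x, t in pairs) / len(pairs)


# Synthetic stand-in for a normalised price series (an assumption, not real data).
series = [0.5 + 0.4 * math.sin(0.3 * t) for t in range(120)]
pairs = make_windows(series, lag=3)

net = TinyBPNN(n_in=3, n_hidden=4)
loss_before = mse(net, pairs)
for _ in range(200):  # training epochs
    for x, t in pairs:
        net.train_step(x, t, lr=0.05)
loss_after = mse(net, pairs)

print(f"MSE before training: {loss_before:.5f}")
print(f"MSE after training:  {loss_after:.5f}")
```

The sketch evaluates on the same windows it trains on, so the drop in error only shows that the weight updates work; a real forecasting study would hold out later observations for testing and would use the input features the thesis selects rather than raw lagged values.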
In model (3), a more systematic approach is introduced to extract the features from the news corpus. This approach applies a content utilisation technique to news articles and mines news sentiments by applying fuzzy grammar fragment extraction. To extract the features from the news articles systematically, a domain-customised ‘dictionary’ containing grammar definitions is built beforehand. The retrieved features are used as the linguistic data to predict the market’s behaviour with respect to the crude oil price. This model also produces a decision tree that hierarchically delineates the events (i.e., the market’s rules) that made the market volatile; this tree later led to model (4). Model (5) is then built to complement the linguistic prediction of model (3) with the numeric prediction of model (2). To conclude, the hybridisation of these two models and the integration of models (1) to (5) in this research imitate the way crude oil market regulators calculate the risk of their actions before executing a price hedge in the market, wherein risk calculation is based on the ‘facts’…