Iterative root cause analysis using data mining in software testing processes

by J (Juho) Roberts

Institution: University of Oulu
Year: 2016
Keywords: Information Processing Science
Posted: 02/05/2017
Record ID: 2070818
Full text PDF: http://urn.fi/URN:NBN:fi:oulu-201604271548


To remain competitive, companies need to stay constantly aware of current trends in the industry in which they operate. The terms big data and data mining have exploded in popularity in recent years, and are likely to keep growing with the launch of the Internet of Things (IoT) and the 5th generation of mobile networks (5G) in the next decade. Companies need to recognize the value of the big data they generate in their day-to-day operations, and learn how and why to exploit data mining techniques to extract the most knowledge from the data that their customers and the company itself are generating. The root cause analysis of faults uncovered during base station system testing is a difficult process, both because of the profound complexity caused by the multi-disciplinary nature of a base station system and because of the sheer volume of log data output by the numerous system components. The goal of this research is to investigate whether data mining can be exploited to conduct root cause analysis. The study takes the form of action research and is conducted in industry, at an organisational unit responsible for the research and development of mobile base station equipment. In this thesis, we survey existing literature on how data mining has been used to address root cause analysis. We then propose a novel approach that iterates on the root cause analysis process using data mining. We use the data mining tool Splunk as an example; however, the practices presented in this research can be applied to other similar tools. We conduct root cause analysis by mining system logs generated by mobile base stations, in order to investigate which system component is causing the base station to fall short of its performance specifications. We then evaluate and validate our hypotheses by conducting a training session for the test engineers and collecting feedback on the suitability of data mining in their work.
The results from the evaluation show that, among other benefits, data mining not only makes root cause analysis more efficient but also makes bug reporting in the target organisation more complete. We conclude that data mining techniques can be a significant asset in root cause analysis. The efficiency gains are significant in comparison to the manual root cause analysis that is currently conducted at the target organisation.

To maintain their competitive advantage, companies must keep up to date with the latest trends in the market. Big data and its further refinement, that is, data mining, are currently buzzwords in, among others, the IT and marketing sectors. As the Internet of Things and the fifth generation of mobile networks (5G) become widespread, the importance of data mining will grow even further. Companies must be able to recognize the significance of the big data they create in their own operations, and consider how to apply data mining methods to create competitive advantage. Fault analysis of mobile network base stations is challenging because of the complex nature of base stations and the enormous volume of data they output. This…