Return to search

Self-learning algorithms applied in Continuous Integration system

Context: Continuous Integration (CI) is a software development practice where a developer integrates a code into the shared repository. And, then an automated system verifies the code and runs automated test cases to find integration error. For this research, Ericsson’s CI system is used. The tests that are performed in CI are regression tests. Based on the time scopes, the regression test suites are categorized into hourly and daily test suits. The hourly test is performed on all the commits made in a day, whereas the daily test is performed at night on the latest build that passed the hourly test. Here, the hourly and daily test suites are static, and the hourly test suite is a subset of the daily test suite. Since the daily test is performed at the end of the day, the results are obtained on the next day, which is delaying the feedback to the developers regarding the integration errors. To mitigate this problem, research is performed to find the possibility of creating a learning model and integrating into the CI system, which can then create a dynamic hourly test suite for faster feedback. Objectives: This research aims to find the suitable machine learning algorithm for CI system and investigate the feasibility of creating self-learning test machinery. This goal is achieved by examining the CI system and, finding out what type data is required for creating the learning model for prioritizing the test cases. Once the necessary data is obtained, then the selected algorithms are evaluated to find the suitable learning algorithm for creating self-learning test machinery. And then, the investigation is done whether the created learning model can be integrated into the CI workflow to create the self-learning test machinery. Methods: In this research, an experiment is conducted for evaluating the learning algorithms. For this experimentation, the data is provided by Ericsson AB, Gothenburg. The dataset consists of the daily test information and the test case results. The algorithms that are evaluated in this experiment are Naïve Bayes, Support vector machines, and Decision trees. This evaluation is done by performing leave-one-out cross-validation. And, the learning algorithm performance is calculated by using the prediction accuracy. After obtaining the accuracies, the algorithms are compared to find the suitable machine learning algorithm for CI system. Results: Based on the Experiment results it is found that support vector machines have outperformed Naïve Bayes and Decision tree algorithms in performance. But, due to the challenges present in the current CI system, the created learning model is not feasible to integrate into the CI. The primary challenge faced by the CI system is, mapping of test case failure to its respective commit is no possible (cannot find which commit made the test case to fail). This is because the daily test is performed on the latest build which is the combination of commits made in that day. Another challenge present is low data storage. Due to this low data storage, problems like the curse of dimensionality and class imbalance has occurred. Conclusions: By conducting this research, a suitable learning algorithm is identified for creating a self-learning machinery. And, also identified the challenges facing to integrate the model in CI. Based on the results obtained from the experiment, it is recognized that support vector machines have high prediction accuracy in test case result classification compared to Naïve Bayes and Decision trees.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:bth-16675
Date January 2018
CreatorsTummala, Akhil
PublisherBlekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0025 seconds