This thesis aims to uncover anomalies in data describing the performance behavior of a "robot controller", as measured by software metrics. The main purpose of the analysis is to identify the changes that resulted in different performance behaviors, which we refer to as performance anomalies. To this end, two separate pre-processing approaches were developed: one that adds the principal component to the data after the cleaning steps, and one that omits the principal component. Isolation Forest is then applied; it uses an ensemble of isolation trees to isolate data points and generates scores from which anomalies can be identified. In addition, a clustering procedure is used in which the data points with the largest distances to their cluster centroids are treated as anomalies. Together, the two data preparation methods and the two anomaly detection algorithms identified software builds that are very likely to be anomalous. According to an industrial evaluation based on engineers’ domain knowledge, around 70% of the software builds flagged as anomalous were confirmed as anomalies, indicating system variable deviations or software bugs.
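The centroid-distance idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the clustering algorithm, so plain k-means (Lloyd's algorithm) is assumed here, and the function names, the explicit initial centroids, and the `top_n` cutoff are all hypothetical choices made for illustration.

```python
import math


def kmeans(points, init_centroids, iterations=50):
    """Plain Lloyd's k-means (an assumption; the abstract names no algorithm).

    `points` and `init_centroids` are lists of equal-length tuples.
    Returns the final list of centroids.
    """
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids


def centroid_distance_scores(points, centroids):
    """Anomaly score = distance from each point to its nearest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


def flag_anomalies(points, init_centroids, top_n=1):
    """Return the indices of the `top_n` highest-scoring points, plus all scores."""
    centroids = kmeans(points, init_centroids)
    scores = centroid_distance_scores(points, centroids)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n], scores
```

In this sketch each row would stand for one software build described by a few numeric metrics, and the builds ranked highest are candidates for manual review rather than confirmed anomalies. How many builds to flag (`top_n`) is a free parameter here; the abstract does not state how that threshold was chosen.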
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-480514 |
Date | January 2022 |
Creators | Salahshour Torshizi, Sara |
Publisher | Uppsala universitet, Statistiska institutionen, RISE-Research Institute of Sweden |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |