• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

NLP-based Failure log Clustering to Enable Batch Log Processing in Industrial DevOps Setting

Homayouni, Ali January 2022 (has links)
The rapid development, updating, and maintenance of industrial software systems have increased the necessity for software artifact testing. Some medium and large industries are forced to automate the test analysis process due to the proliferation of test data. The examination of test results can be automated by grouping them into subsets comprised of comparable test outcomes and their batch analysis. In this instance, the first step is to identify a precise and reliable categorization mechanism based on structural similarities and error categories. In addition, since errors and the number of subgroups are not specified, a method that does not require prior knowledge of the target subsets should be implemented. Clustering is one of the appropriate methods for separating test results, given this description. This work presents an appropriate approach for grouping test results and accelerating the test analysis process by implementing multiple clustering algorithms (K-means, Agglomerative, DBSCAN, Fuzzy-c-means, and Spectral) on test results from industrial contexts and comparing their time and efficiency in outputs. The lack of organization and textual character of the test findings is one of the primary obstacles in this study, necessitating the implementation of feature selection methods. Consequently, this study employs three distinct approaches to feature selection (TF-IDF, FastText, and Bert). This research was conducted by implementing a series of trials in a controlled and isolated environment, with the assistance of Westermo Technologies AB's test process results, as part of the AIDOaRT Project, in order to establish an acceptable way for clustering industrial test results. The conclusion of this thesis shows that K-means and Agglomerative yield the highest performance and evaluation scores; however, the K-means is superior in terms of execution time and speed. In addition, by organizing a Focus Group meeting to qualitatively examine the results from the perspective of engineers and experts, it can be determined that, from their perspective, clustering results increases the speed of test analysis and decreases the review workload.

Page generated in 0.0946 seconds