Global ETD Search

11	Grouping Similar Bug Reports from Crash Dumps with Unsupervised Learning / Gruppering av liknande felrapporter med oövervakat lärande av kraschdumpar Vestergren, Sara January 2021 (has links) Quality software usually means high reliability, which in turn has two main components; the software should provide correctness, which means it should perform the specified task, and robustness in the sense that it should be able to manage unexpected situations. In other words, reliable systems are systems without bugs. Because of this, testing and debugging are recurrent and resource expensive tasks in software development, notably in large software systems. This thesis investigate the potential of using unsupervised machine learning on Ericsson bug reports to avoid unnecessary debugging by identifying duplicate bug reports. The bug report data that is considered are crash dumps from software crashes. The data is clustered using the clustering algorithms k-modes, k-prototypes and expectation maximization where-after the generated clusters are used to assign new incoming bug reports to the previously generated clusters, thus indicating whether an old bug report is similar to the newly submitted one. Due to the dataset only being partially labeled both internal and external validity indices are applied to evaluate the clustering. The results indicate that many, small clusters can be identified using the applied method. However, for the results to have high validity the methods could be applied on a larger data set. / Mjukvara av hög kvalitet innebär ofta hög tillförlitlighet, vilket i sin tur har två huvudkomponenter; mjukvaran bör vara korrekt, den ska alltså uppfylla dom specifierade kraven, och dessutom robust vilket innebär att den ska kunna hantera oväntade situationer. Med andra ord, tillförlitliga system är system utan buggar. På grund av detta är testning och felsökning återkommande och resurskrävande uppgifter inom mjukvaruutveckling, i synnerhet för stora mjukvarusystem. Detta arbete utforskar vilken potential oövervakad maskininlärning på Ericssons felrapporter har för att undvika onödig felsökning genom att identifiera felrapporter som är dubletter. Felrapporterna som används i detta arbete innehåller data som sparats i minnet vid en mjukvarukrasch. Data klustras sedan med klustringsalgoritmerna k-modes, k-prototypes och expectation maximization varpå dom genererade klustren används för att tilldela nya inkommande felrapporter till de tidigare generade klustren, för att på så sätt kunna identifiera om en gammal felrapport är lik en ny felrapport. Då de felrapporter som behandlas endast till viss del redan är märkta som dubletter används både externa och interna valideringsmått för att utvärdera klustringen. Resultaten tyder på att många, små kluster kunde identifieras med de använda metoderna. Dock skulle metoderna behöva appliceras på ett dataset med större antal felrapporter för att resultaten ska få hög validitet. Unsupervised Learning Bug Report Duplicate Detection Clustering Software Crash Oövervakad Inlärning Felrapport Dublett-detektering Klustring Mjukvarukrasch Elektroteknik och elektronik
12	Tuning of machine learning algorithms for automatic bug assignment Artchounin, Daniel January 2017 (has links) In software development projects, bug triage consists mainly of assigning bug reports to software developers or teams (depending on the project). The partial or total automation of this task would have a positive economic impact on many software projects. This thesis introduces a systematic four-step method to find some of the best configurations of several machine learning algorithms intending to solve the automatic bug assignment problem. These four steps are respectively used to select a combination of pre-processing techniques, a bug report representation, a potential feature selection technique and to tune several classifiers. The aforementioned method has been applied on three software projects: 66 066 bug reports of a proprietary project, 24 450 bug reports of Eclipse JDT and 30 358 bug reports of Mozilla Firefox. 619 configurations have been applied and compared on each of these three projects. In production, using the approach introduced in this work on the bug reports of the proprietary project would have increased the accuracy by up to 16.64 percentage points. bug triage bug assignment bug mining bug report activity-based approach issue tracking bug repository bug tracker pre-processing feature extraction feature selection tuning model selection hyper-parameter optimization text mining text classification classifier supervised learning machine learning information retrieval bugzilla eclipse jdt mozilla firefox open source software proprietary project accuracy mean reciprocal rank software development software maintenance software engineering Computer and Information Sciences Data- och informationsvetenskap

Search results

Grouping Similar Bug Reports from Crash Dumps with Unsupervised Learning / Gruppering av liknande felrapporter med oövervakat lärande av kraschdumpar

Tuning of machine learning algorithms for automatic bug assignment