Global ETD Search

Investigating topic modeling techniques for historical feature location.

Software maintenance and understanding where in the source code features are implemented are two strongly coupled tasks that account for a large portion of the effort spent on developing applications. Feature location, the concept investigated in this thesis, supports both tasks by facilitating the automation of otherwise manual searches for source code artifacts. Challenges in this area include aggregating and composing a training corpus from historical codebase data, as well as integrating and optimizing suitable topic modeling techniques. Building on previous research, this thesis compares two such techniques and introduces a toolkit that can be used to reproduce and extend the results discussed. Specifically, a changeset-based approach to feature location is pursued and applied to a large open-source Java project. The project is used to optimize and evaluate Latent Dirichlet Allocation and Pachinko Allocation models and to compare the accuracy of the two. As discussed at the end of the thesis, the results do not indicate a clear favorite between the models; instead, the outcome of the comparison depends on the metric and viewpoint from which it is assessed.

http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-85379

feature location

topic modeling

changesets

latent dirichlet distribution

pachinko allocation

mining software repositories

source code comprehension

Software Engineering

Programvaruteknik

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kau-85379
Date	January 2021
Creators	Schulte, Lukas
Publisher	Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013)
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
