Context - In the fast-paced world of software development, understanding and tracking contributions within project teams is crucial for efficient project management and collaboration. Git, a popular version control system, facilitates collaboration but lacks comprehensive tools for analyzing individual contributions in detail.
Objective - This thesis proposes an approach that uses Natural Language Processing (NLP) techniques to classify and analyze Git commit messages and the file paths of the files changed in each commit, with the aim of improving project transparency and contributor recognition.
Method - The study uses Bidirectional Encoder Representations from Transformers (BERT) models, an NLP technique, to categorize data collected from multiple Git repositories. A tool named DevAnalyzer is developed to automate the classification and analysis process, enhancing the understanding of contribution patterns.
Results - The Git commit message model achieved an average accuracy of 98.9%, and the file path model achieved an average accuracy of 99.8%. Together, the two models provided detailed insights into the types and locations of contributions within projects.
Conclusions - The findings validate the effectiveness of using BERT models with DevAnalyzer to classify both Git commit messages and file paths. This approach provides a more comprehensive understanding of contributions, benefiting project management and team collaboration.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:miun-51824 |
Date | January 2024 |
Creators | Nimér, Ebba; Pesjak, Emma |
Publisher | Mittuniversitetet, Institutionen för kommunikation, kvalitetsteknik och informationssystem (2023-) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |