Documentation for programming projects can vary both in quality and availability. The availability of documentation can vary more for a closed working environment, since fewer developers will read the documentation. Documenting programming projects can be demanding on worker hours and unappreciated among developers. It is a common conception that developers rather invest time on developing a project than documenting a project, and making the documentation process more effective would benefit developers. To move towards a more automated process of writing documentation, this work generated documentation for repositories which attempts to summarize the repositories in their use cases and functionalities. Two different implementations are created to generate documentation using an on-premise hosted large language model (LLM) as a tool. First, the embedded solution processes all available code in a project and creates the documentation based on multiple summarizations of files and folders. Second, the RAG solution attempts to use only the most important parts of the code and lets the LLM create the documentation on a smaller set of the codebase. The results show that generating documentation is possible, but unreliable and must be controlled by a person with knowledge about the codebase. The embedded solution seems to be more reliable and produce better results, but is more costly compared to the RAG solution.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ltu-107590 |
Date | January 2024 |
Creators | Hedlund, Ludvig |
Publisher | Luleå tekniska universitet, Institutionen för system- och rymdteknik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0019 seconds