This thesis investigates the capabilities of two advanced Large Language Models (LLMs), OpenAI’s ChatGPT-4 and Google’s Gemini Advanced, in the domain of software engineering. While LLMs are widely used across various applications, including text summarization and synthesis, their potential for detecting and correcting programming errors has not been thoroughly explored. This study aims to fill this gap by conducting a comprehensive literature search and an experimental comparison of ChatGPT-4 and Gemini Advanced on the QuixBugs and LeetCode benchmark datasets, with a specific focus on the Python and Java programming languages. The research evaluates the models’ abilities to detect and correct bugs using metrics such as Accuracy, Recall, Precision, and F1-score. Experimental results show that ChatGPT-4 consistently outperforms Gemini Advanced in both the detection and correction of bugs. These findings provide valuable insights that could guide further research in the field of LLMs.
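As a reader aid, the four metrics named in the abstract can be sketched from confusion-matrix counts. This is a minimal illustration using the standard textbook definitions, not the thesis's own evaluation code; the function name `detection_metrics` and the example counts are hypothetical, and it assumes bug detection is scored as a binary decision per program (buggy or not buggy).

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from raw counts.

    Assumption (not from the thesis): each program is labelled buggy or
    not buggy, and the model's verdict is compared with the ground truth.
      tp: buggy programs correctly flagged as buggy
      fp: correct programs wrongly flagged as buggy
      tn: correct programs correctly left unflagged
      fn: buggy programs the model missed
    """
    total = tp + fp + tn + fn
    # Guard every division so empty inputs yield 0.0 instead of an error.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(detection_metrics(tp=8, fp=2, tn=6, fn=4))
```

Precision penalizes false alarms and recall penalizes missed bugs, so the F1-score (their harmonic mean) is the more informative single figure when buggy and correct programs are unevenly represented in a benchmark.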
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-130490 |
Date | January 2024 |
Creators | Sun, Erik Wen Han; Grace, Yasine |
Publisher | Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |