
Usage of Generative AI Based Plugin in Unit Testing : Evaluating the Trustworthiness of Generated Test Cases by Codiumate, an IDE Plugin Powered by GPT-3.5 & 4

Background: Unit testing is essential in software development, as it verifies the functionality of individual components such as functions and classes. However, creating unit test cases manually is time-consuming and tedious, which impairs testing efficiency and reliability. Problem: Automated unit test generation tools such as EvoSuite and Randoop have addressed some of these challenges, but they are limited by language specificity and predefined algorithms. Generative AI tools such as ChatGPT and GitHub Copilot, powered by OpenAI’s GPT-3.5/4, offer alternatives but have limitations of their own, such as reliance on user input and operational inconveniences. Solution: CodiumAI’s Codiumate IDE plugin aims to mitigate these limitations, making code quality assurance easier for developers. This study evaluates Codiumate’s trustworthiness in generating unit tests for Python functions. Method: We randomly selected thirty functions from OpenAI’s HumanEval dataset and wrote selection criteria for relevant test cases based on each function’s docstring. We then evaluated Codiumate’s trustworthiness using three metrics: Relevance Score, false positive rate, and result consistency rate. Result: Of all the test cases suggested by Codiumate, 208 unit tests (48%) were relevant. 70% of the assertions in these test cases strictly meet the selection criteria, while the other 30%, although relevant, were selected on the basis of our experience in software testing. The average false positive rate is 15%. Function groups with higher Relevance Scores are those of a non-mathematical nature and those with simple dependencies. High false positive rates arise in functions with string and float parameters. All generated unit tests are free of syntax errors; 20% failed and 80% passed in all five test executions. Conclusion: Codiumate demonstrates potential in automating unit test generation, offering a convenient means to support developers. However, it is not yet fully reliable for critical applications without developer oversight. Continued refinement and exploration of its capabilities are essential for Codiumate to become an indispensable asset in unit test generation, enhancing its trustworthiness and effectiveness in the software development process.
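The abstract names three metrics without defining them. As a rough illustration only, the sketch below shows one way such metrics could be computed from per-test records. The record layout (`GeneratedTest`), the function names, and all three definitions are assumptions made for this example and are not taken from the thesis: relevance is taken as the share of suggested tests judged relevant, a false positive as a test that fails at least once against an implementation presumed correct, and consistency as the share of tests with the same outcome in every repeated run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneratedTest:
    """One suggested test case (hypothetical record layout, not from the thesis)."""
    relevant: bool           # judged relevant against the docstring-based criteria
    run_results: List[bool]  # pass (True) / fail (False) for each repeated execution


def _share(flags: List[bool]) -> float:
    """Fraction of True values; 0.0 for an empty list to avoid dividing by zero."""
    return sum(flags) / len(flags) if flags else 0.0


def relevance_score(tests: List[GeneratedTest]) -> float:
    # Assumed definition: relevant tests divided by all suggested tests.
    return _share([t.relevant for t in tests])


def false_positive_rate(tests: List[GeneratedTest]) -> float:
    # Assumed definition: tests that fail at least once although the
    # function under test is presumed correct.
    return _share([not all(t.run_results) for t in tests])


def consistency_rate(tests: List[GeneratedTest]) -> float:
    # Assumed definition: tests whose outcome is identical across all runs.
    return _share([len(set(t.run_results)) <= 1 for t in tests])


# Made-up sample data: four tests, each executed five times.
sample = [
    GeneratedTest(relevant=True, run_results=[True] * 5),
    GeneratedTest(relevant=True, run_results=[False] * 5),
    GeneratedTest(relevant=False, run_results=[True] * 5),
    GeneratedTest(relevant=False, run_results=[True, False, True, True, True]),
]

print(relevance_score(sample), false_positive_rate(sample), consistency_rate(sample))
```

The sample data is invented purely to exercise the functions and does not reproduce any figure reported in the study.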

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:bth-26473
Date January 2024
Creators Nazari, Ali Reza, Nannicha Thunell, Bow
Publisher Blekinge Tekniska Högskola, Institutionen för programvaruteknik
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
