1 |
An Empirical Study of API Breaking Changes in Bioconductor
Chowdhury, Hemayet Ahmed, 10 January 2023
Bioconductor is the second largest R software package repository and is primarily used for the analysis of genomic and biological data. With downloads exceeding millions in recent years, the widespread growth of the repository's adoption can be attributed to its diverse selection of community-created packages, written in the programming language R, that provide statistical methodologies for the analysis and modelling of data. However, as these packages evolve, their APIs go through changes that can break existing user code. Fixing these API breaking changes whenever a package is updated can be frustrating and time-consuming, especially since a large fraction of the user community are researchers who do not necessarily have a software engineering background. In that context, we first present a tool that can detect syntactic API breaking changes between two released versions of a library written in R through static analysis of the package source code. This tool can be useful to R package developers, so that they can more comprehensively report or handle the breaking changes in their releases, and to R package users, who want to be aware of the API differences that may exist between two releases before upgrading the libraries in their code. Through the use of this tool and manual inspection, we also conducted an empirical study of the breaking changes and backward incompatibility in Bioconductor packages. We studied the 100 most downloaded packages in the repository and found that 28% of all package releases are backward incompatible. We also found that 55% of these breaking changes go undocumented and that developers do not follow semantic versioning for 22% of the releases. Finally, we manually inspected 10 library releases that contained breaking changes and found that 2% of the APIs affected 31 client projects. / Master of Science / Bioconductor is a software repository that consists of over 2000 software libraries.
These libraries provide users with reusable functions, or APIs, to perform statistical and graphical data analysis. The developers of these libraries generally make regular updates to the library source code and its functions for maintainability purposes. However, when clients install these library updates alongside their existing code, that code might no longer compile, run or behave the way it used to, because of the changes made to the libraries' APIs. A library release containing changes that can potentially break older code is considered to be backward incompatible. Without proper documentation from the library developer, fixing these issues can be time-consuming, as the client might have to manually look at the changes made in the library's source code. To tackle this issue, we first present a tool that can analyse two versions of a library and identify a subset of the breaking changes in the API. This can help both the users and the developers of the libraries to be aware of any breaking changes that exist in a new release. Afterwards, we conduct a study on the Bioconductor ecosystem to see how serious the problem of backward incompatibility really is by studying the 100 most downloaded packages from the repository. We see that 28% of the releases across these 100 packages are backward incompatible.
Since clients are likely to be using multiple libraries at once, a rate this high can cause frequent issues in client code. We then check how often developers follow the correct release protocols when updating their libraries. These include versioning the releases correctly, so that users know which releases may be backward incompatible, and documenting any breaking changes in a NEWS file that users have access to. In that respect, we find that 22% of the releases are not versioned correctly and roughly 55% of the breaking changes in the API are not documented. Finally, we investigate how frequently these breaking changes can actually affect client code. Here, we manually inspect 10 releases that contain a high number of breaking changes from the subset we studied and find 31 projects that use these APIs and would break upon a library update.
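
To make the idea of syntactic API breaking changes between two releases concrete, the following is a minimal sketch in Python. It is not the tool described in the abstract: the representation (a dictionary from exported function names to ordered parameter names), the function name `find_breaking_changes`, and the three change kinds are assumptions chosen for illustration. Other kinds of breaking change, such as a newly added parameter without a default value, are deliberately not modelled.

```python
# Hypothetical sketch: compare two releases of a package, each described as
# a mapping from exported function name to its ordered parameter names.
# This is an illustration only, not the thesis tool's implementation.

def find_breaking_changes(old_api, new_api):
    """Return a list of (kind, function, parameter) tuples."""
    changes = []
    for name, old_params in sorted(old_api.items()):
        if name not in new_api:
            # Callers of a removed function fail outright.
            changes.append(("function_removed", name, None))
            continue
        new_params = new_api[name]
        for param in old_params:
            if param not in new_params:
                # Calls that pass this argument by name no longer work.
                changes.append(("parameter_removed", name, param))
        # Parameters present in both releases should keep their relative
        # order, otherwise positional calls bind to different arguments.
        kept_old = [p for p in old_params if p in new_params]
        kept_new = [p for p in new_params if p in old_params]
        if kept_old != kept_new:
            changes.append(("parameters_reordered", name, None))
    return changes


# Made-up example data, not taken from any real Bioconductor package.
old_release = {
    "normalize": ["x", "method"],
    "plotCounts": ["x"],
    "filterGenes": ["x", "min"],
}
new_release = {
    "normalize": ["x", "scale"],
    "filterGenes": ["min", "x"],
}

for change in find_breaking_changes(old_release, new_release):
    print(change)
```

Under this sketch, a release with a non-empty result would count as backward incompatible, which is also the point at which a semantic versioning scheme would call for a major version bump and a NEWS entry.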
|
2 |
UCov : A Static Analysis Tool for API Usage Coverage Validation / UCov : Ett statiskt analysverktyg för validering av API-användningstäckning
Couturou, Thomas, January 2023
Nowadays, all software projects are based on a large number of libraries, so they do not have to start from scratch. These libraries evolve over time, whether to add functionality or to simplify their use. These updates are necessary to improve the libraries, but can lead to errors in their clients' code. Developers are thus faced with the problem of breaking changes and need to be able to inform their clients as early as possible that these changes are coming. To limit the impact of these breaking changes, this Master's thesis presents UCov. UCov is a static analysis tool that gives library developers a quick overview of the usage coverage of their tests compared with the coverage of their clients. This lets them compare which elements of their library are being tested with those being used by their clients, and also how they are being used. This will enable developers to improve their test suite according to how their clients use their library, to get a better overview of how their library is used, and also to give their clients the best possible warning of changes that may impact their code, thanks to release notes. In this study, we explain the implementation of UCov and test it on various libraries. The results obtained on these libraries are satisfactory. They enable us to highlight potential breaking changes. They also show that there are elements of the libraries' APIs that are used by clients but never tested. Finally, these results show that UCov offers developers a new tool enabling them to limit the impact of their breaking changes by gaining a better understanding of how clients use their libraries. / Nowadays, all software projects are based on a large number of libraries, so that one does not have to start from scratch. Library developers are therefore increasingly faced with the problem of breaking changes. These changes are necessary to improve their libraries, but can lead to errors in their clients' code.
Developers must therefore be able to give their clients as much advance warning as possible that breaking changes are coming. To limit the effects of these changes, this Master's thesis presents UCov. UCov is a static analysis tool that gives library developers a quick overview of the usage coverage of their tests compared with the coverage of their clients. This makes it possible for them to compare which elements of their library are tested with those that are used by their clients, and also how they are used. This makes it possible for developers to improve their test suite according to how their clients use their library, to get a better overview of how their library is used, and also to give their clients the best possible warning of changes that may affect their code, thanks to release notes. In this study, we explain the implementation of UCov and test it on various libraries. The results from these libraries are satisfactory. They make it possible for us to highlight potential breaking changes. They also show that there are parts of the libraries' APIs that are used by clients but never tested. Finally, these results show that UCov offers developers a new tool that makes it possible for them to limit the effects of their breaking changes by gaining a better understanding of how clients use their libraries.
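
The comparison between what a library's tests exercise and what its clients use can be illustrated with a small Python sketch. This is not UCov's implementation: the function names `untested_client_usage` and `usage_coverage`, the idea of representing usages as plain strings, and the example identifiers are all assumptions made for illustration.

```python
# Hypothetical sketch: compare the API elements exercised by a library's
# own tests with the elements its clients use. Not UCov's actual design.

def untested_client_usage(client_usages, test_usages):
    """API elements that clients use but that no test exercises."""
    return sorted(set(client_usages) - set(test_usages))


def usage_coverage(client_usages, test_usages):
    """Fraction of client-used API elements also exercised by tests."""
    client = set(client_usages)
    if not client:
        return 1.0  # nothing is used by clients, so nothing is uncovered
    return len(client & set(test_usages)) / len(client)


# Made-up identifiers for a hypothetical library.
client_usages = ["Parser.parse", "Parser.reset", "Lexer.next"]
test_usages = ["Parser.parse", "Lexer.next", "Lexer.peek"]

print(untested_client_usage(client_usages, test_usages))
print(usage_coverage(client_usages, test_usages))
```

Elements reported by `untested_client_usage` are the ones where a change could break clients without any test failing, so they are natural candidates for new tests or for a note in the release notes.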
|