
Increasing Reproducibility Through Provenance, Transparency and Reusability in a Cloud-Native Application for Collaborative Machine Learning

The purpose of this thesis was to develop new features in the cloud-native, open-source machine learning platform STACKn, aiming to strengthen the platform's support for reproducible machine learning experiments through provenance, transparency and reusability. Adhering to the definition of reproducibility as the ability of independent researchers to exactly duplicate scientific results with the same material as in the original experiment, two concepts were explored as alternatives for this specific goal: 1) increased support for standardized textual documentation of machine learning models and their corresponding datasets; and 2) increased support for provenance to track the lineage of machine learning models by making code, data and metadata readily available and stored for future reference. We set out to investigate to what degree these features could increase reproducibility in STACKn, both when used in isolation and when combined. Once these features had been implemented through an exhaustive software engineering process, they were evaluated to quantify the degree of reproducibility that STACKn supports. The evaluation showed that the implemented features, especially the provenance features, substantially increase the possibilities for conducting reproducible experiments in STACKn compared with when none of the developed features are used. While the evaluation method employed was not entirely objective, these features are clearly a good first step toward meeting current recommendations and guidelines on how computational science can be made reproducible.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-435349
Date: January 2021
Creators: Ekström Hagevall, Adam; Wikström, Carl
Publisher: Uppsala universitet, Avdelningen för datorteknik
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: UPTEC STS, 1650-8319 ; 21008
