
Increasing Reproducibility Through Provenance, Transparency and Reusability in a Cloud-Native Application for Collaborative Machine Learning

Ekström Hagevall, Adam; Wikström, Carl (January 2021)
The purpose of this thesis was to develop new features in the cloud-native, open-source machine learning platform STACKn, aiming to strengthen the platform's support for reproducible machine learning experiments through provenance, transparency and reusability. Adhering to the definition of reproducibility as the ability of independent researchers to exactly duplicate scientific results using the same material as in the original experiment, two concepts were explored as means to this goal: 1) increased support for standardized textual documentation of machine learning models and their corresponding datasets; and 2) increased support for provenance, tracking the lineage of machine learning models by making code, data and metadata readily available and stored for future reference. We set out to investigate to what degree these features could increase reproducibility in STACKn, both when used in isolation and when combined. Once the features had been implemented through an exhaustive software engineering process, they were evaluated to quantify the degree of reproducibility that STACKn supports. The evaluation showed that the implemented features, especially the provenance features, substantially improve the ability to conduct reproducible experiments in STACKn compared with using none of the developed features. While the evaluation method employed was not entirely objective, these features are a good first step toward meeting current recommendations and guidelines on how computational science can be made reproducible.
