Return to search

Automation of a Data Analysis Pipeline for High-content Screening Data

High-content screening is a part of the drug discovery pipeline dealing with the identification of substances that affect cells in a desired manner. Biological assays with a large set of compounds are developed and screened and the output is generated with a multidimensional structure. Data analysis is performed manually by an expert with a set of tools and this is considered to be too time consuming and unmanageable when the amount of data grows large. This thesis therefore investigates and proposes a way of automating the data analysis phase through a set of machine learning algorithms. The resulting implementation is a cloud based application that can support the user with the selection of which features that are relevant for further analysis. It also provides techniques for automated processing of the dataset and training of classification models which can be utilised for predicting sample labels. An investigation of the workflow for analysing data was conducted before this thesis. It resulted in a pipeline that maps the different tools and software to what goal they fulfil and which purpose they have for the user. This pipeline was then compared with a similar pipeline but with the implemented application included. This comparison demonstrates clear advantages in contrast to previous methodologies in that the application will provide support to work in a more automated way of performing data analysis.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-122913
Date January 2015
CreatorsBergström, Simon, Ivarsson, Oscar
PublisherLinköpings universitet, Medie- och Informationsteknik, Linköpings universitet, Tekniska högskolan, Linköpings universitet, Medie- och Informationsteknik, Linköpings universitet, Tekniska högskolan
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0023 seconds