Neural networks for imputation of missing genotype data : An alternative to the classical statistical methods in bioinformatics

In this project, two machine learning models were tested for imputing missing genotype data from patients on two different panels. Because patient privacy had to be protected, initial training was done on data simulated from the 1000 Genomes Project. The first model consisted of two convolutional variational autoencoders whose latent representations were shuffled to force the networks to find the same patterns in the two datasets. This model was unsuccessful at imputing the missing data. The second model was based on a U-Net structure and was more successful at imputation. It had one encoder for each dataset, so that each encoder specialized in finding patterns in its own data. Further improvements are required before the model is fully capable of imputing the missing data.

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-413635

Bioinformatics (Computational Biology)

Bioinformatik (beräkningsbiologi)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-413635
Date	January 2020
Creators	Andersson, Alfred
Publisher	Uppsala universitet, Institutionen för biologisk grundutbildning
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	UPTEC X ; 20018
