STAIRS: Data reduction strategy on genomics

Background. An enormous amount of genomic data has accumulated over the last ten years. This makes visualization and manual inspection key steps in trying to understand large datasets containing DNA sequences with millions of letters. The situation has created a gap between data complexity and the availability of qualified personnel, because analysis requires a trade-off between visualization, reduction capacity and exploratory functions, a combination rarely achieved by existing tools such as the SRA Toolkit (https://www.ncbi.nlm.nih.gov/sra/docs/toolkitsoft/). This project pursued a novel approach to genomic analysis and visualization by means of STrAtified Interspersed Reduction Structures (STAIRS). Results. Ten weeks of intense work resulted in novel algorithms to compress data, transform it into STAIRS vectors and align them. The Smith–Waterman and Needleman–Wunsch algorithms were specially modified for this purpose, and the application produced statistical performance and behavioural charts.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-383465
Date: January 2019
Creators: Ferrer, Samuel
Publisher: Uppsala universitet, Institutionen för biologisk grundutbildning
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess