
Protecting Privacy: Automatic Compression and Encryption of Next-Generation Sequencing Alignment Data

As next-generation sequencing (NGS) matures and the technology advances, it is becoming an increasingly powerful tool for solving biological problems. Sequencing an individual's full genome and comparing it to a reference genome can reveal detrimental mutations, in particular those that give rise to diseases such as cancer. At the Rudbeck Laboratory, Uppsala University, a fully automatic software pipeline for somatic mutational analysis of cancer patient sequence data is in development. It will increase the efficiency and accuracy of a process that today consists of several discrete computation steps. In turn, this will reduce the time to result and make it easier to reach a diagnosis and select the optimal treatment for the patient. However, an individual's genomic data are highly sensitive and private, which demands rigorous security precautions. Moreover, as more and more data are produced, storage space is becoming increasingly valuable, so data must be handled and stored as efficiently as possible. In this project, I developed a Python pipeline for automatic compression and encryption of NGS alignment data, which aims to ensure full privacy protection of patient data while maintaining high computational and storage efficiency. The pipeline combines a state-of-the-art real-time compression algorithm with an Advanced Encryption Standard (AES) cipher. It offers security that meets rigorous modern standards and performance that at least matches that of existing solutions. The system is designed to be easily integrated into the somatic mutation analysis pipeline. This way, the data generated during the analysis, which are too large to be kept in main memory, can be stored safely on disk.
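The compress-then-encrypt order described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract names neither the compression algorithm nor the AES mode, so the standard-library `zlib` stands in for the real-time compressor, AES-GCM from the third-party `cryptography` package is assumed as the cipher mode, and the function names `compress_then_encrypt` and `decrypt_then_decompress` are hypothetical. Compression has to come first because well-encrypted output is statistically close to random and no longer compresses.

```python
import os
import zlib

try:
    # Third-party dependency (assumption): pip install cryptography
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
except ImportError:  # keep this sketch importable without the package
    AESGCM = None

NONCE_SIZE = 12  # bytes; the nonce length commonly used with AES-GCM


def compress_then_encrypt(data: bytes, key: bytes) -> bytes:
    """Compress, then encrypt. Returns nonce || ciphertext (with auth tag)."""
    if AESGCM is None:
        raise RuntimeError("the 'cryptography' package is required")
    compressed = zlib.compress(data, 6)  # zlib is a stand-in compressor
    nonce = os.urandom(NONCE_SIZE)       # must never repeat for the same key
    return nonce + AESGCM(key).encrypt(nonce, compressed, None)


def decrypt_then_decompress(blob: bytes, key: bytes) -> bytes:
    """Reverse of compress_then_encrypt; raises if the blob was tampered with."""
    if AESGCM is None:
        raise RuntimeError("the 'cryptography' package is required")
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    compressed = AESGCM(key).decrypt(nonce, ciphertext, None)
    return zlib.decompress(compressed)


if __name__ == "__main__" and AESGCM is not None:
    key = AESGCM.generate_key(bit_length=256)  # 256-bit AES key
    record = b"ACGT" * 1000                    # toy stand-in for alignment data
    blob = compress_then_encrypt(record, key)
    assert decrypt_then_decompress(blob, key) == record
    print(len(record), "bytes in,", len(blob), "bytes stored")
```

In a real pipeline the data would be processed in chunks rather than held in memory as a single `bytes` object, and key storage would need its own design; both are outside the scope of this sketch.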

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-386413
Date: January 2019
Creators: Gustafsson, Wiktor
Publisher: Uppsala universitet, Experimentell och klinisk onkologi
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: UPTEC X ; 19010
