This thesis presents the specification and implementation of a system for detecting enzymes in metagenomic data. Detection starts from a provided enzyme sequence, and the goal is to search a metagenomic sample for novel variants of that enzyme. To ensure that the enzymes found truly have the desired catalytic function, the system employs a number of catalytic function verification methods. Their specification, implementation and evaluation are among the main contributions of this thesis. Experiments have shown that the proposed methods reach sensitivity as high as 89%, specificity of 95%, AUC values above 0.9 and an average throughput of 1,203 verifications per second on a regular personal computer. Evaluation of the system also led to the discovery of a partial sequence of a novel haloalkane dehalogenase enzyme in a metagenomic sample from soil. The implementation can run on a personal computer as well as in a grid computing environment.
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:363906 |
Date | January 2017 |
Creators | Smatana, Stanislav |
Contributors | Martínek, Tomáš, Hon, Jiří |
Publisher | Vysoké učení technické v Brně. Fakulta informačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |