This thesis describes a new approach to predicting changes in protein stability caused by amino acid mutations. The main goal is to create a meta-tool that combines the outputs of eight well-established prediction tools and, through a suitable consensus method, improves the overall prediction accuracy. The optimal strategy for combining the outputs of these tools is sought using a number of machine learning methods. Of all the methods tested, KStar showed the highest prediction accuracy on the training dataset, which was compiled from experimentally validated mutations in the ProTherm database, and was therefore chosen as the prediction technique. The generalization ability is validated on a testing dataset composed of multi-point amino acid mutations, also extracted from the ProTherm database. Since multi-point mutations were not used to train any of the integrated tools, we assume that this comparison is objective. The resulting KStar-based meta-tool improves the correlation coefficient by about 0.130 on the training dataset and by about 0.239 on the testing dataset, compared with the most successful integrated tool. These results suggest that machine learning methods are a suitable technique for problems in the area of protein prediction.
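The consensus idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: KStar is an instance-based learner with an entropy-based distance, and here a plain distance-weighted nearest-neighbour regressor stands in for it. The function names (`make_toy_data`, `knn_predict`, `pearson`), the value of `k`, and the synthetic data are all assumptions made for illustration; no ProTherm data or real tool outputs are used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def knn_predict(train_X, train_y, query, k=5):
    """Distance-weighted average of the k nearest training targets.

    Stand-in for KStar (assumption): both are instance-based, but KStar
    uses an entropy-based distance rather than the Euclidean one used here.
    """
    nearest = sorted(
        (math.dist(row, query), y) for row, y in zip(train_X, train_y)
    )[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def make_toy_data(n, n_tools=8, seed=0):
    """Synthetic data only: each hypothetical tool reports the true
    stability change plus Gaussian noise of a tool-specific size."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        ddg = rng.uniform(-3.0, 3.0)
        X.append([ddg + rng.gauss(0.0, 0.5 + 0.25 * t) for t in range(n_tools)])
        y.append(ddg)
    return X, y


if __name__ == "__main__":
    train_X, train_y = make_toy_data(200, seed=1)
    test_X, test_y = make_toy_data(100, seed=2)

    # Correlation of each individual (hypothetical) tool with the target.
    for t in range(8):
        r = pearson([row[t] for row in test_X], test_y)
        print(f"tool {t}: r = {r:.3f}")

    # Correlation of the meta-predictor that combines all eight outputs.
    meta = [knn_predict(train_X, train_y, row) for row in test_X]
    print(f"meta-predictor: r = {pearson(meta, test_y):.3f}")
```

Each mutation is represented by the vector of the eight tool outputs, and the meta-predictor is scored by the same correlation coefficient that is reported for the individual tools, which mirrors the comparison described in the abstract.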
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:236035 |
Date | January 2014 |
Creators | Malinka, František |
Contributors | Martínek, Tomáš, Bendl, Jaroslav |
Publisher | Vysoké učení technické v Brně. Fakulta informačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |