This thesis deals with the clustering of tunnels in data obtained from molecular dynamics simulations of proteins. The process is computationally very intensive and remains a challenge for the scientific community. The goal is to find an algorithm with the best trade-off between time and space complexity. The thesis surveys clustering algorithms, techniques for working with huge high-dimensional datasets, visualisation, and methods for comparing clusterings. It proposes a solution based on the Twister Tries algorithm, analyses the implementation details, and reports test results on solution quality and space complexity. The thesis aimed to show that a stochastic algorithm (Twister Tries) can achieve the same results as an exact algorithm (average-linkage). This assumption could not be confirmed with confidence. An analysis of the hashing functions further shows that a low-dimensional hashing function yields the same hashing results in much less computation time.
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:255375 |
Date | January 2016 |
Creators | Jaroš, Marta |
Contributors | Vašíček, Zdeněk; Martínek, Tomáš |
Publisher | Vysoké učení technické v Brně. Fakulta informačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |