Uncertainty is prevalent in diverse datasets. It is modelled by a generalization of strings called an indeterminate string: a string containing one or more subsets of the alphabet as letters (known as indeterminate letters). In this thesis we revisit the pattern matching problem on indeterminate strings. We introduce algorithms that build on established techniques such as Knuth-Morris-Pratt (KMP) and Boyer-Moore (BM), together with an extensive experimental evaluation covering both time complexity and runtime performance. The thesis also explores a novel encoding for indeterminate strings and assesses its impact on runtime efficiency. Through analysis and experimentation, this study extends the theoretical framework of indeterminate pattern matching and provides practical insights for data processing in real-world applications. / Thesis / Master of Science (MSc) / In my thesis, I proposed novel algorithms for pattern matching on indeterminate strings, special strings that allow character uncertainties at specific positions. By addressing uncertainties in character positions, my work has implications in computational biology, data mining, and various other applications, enabling more precise and efficient pattern recognition in real-world scenarios.
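To make the definition above concrete, here is a minimal sketch of brute-force pattern matching on indeterminate strings. It is not the thesis's KMP- or BM-based algorithms or its encoding. It assumes the matching rule common in the literature on indeterminate strings, that two letters match when their sets share at least one symbol; the record itself does not state this rule. The brace notation (`{bc}` for the letter {b, c}) and the names `parse`, `letters_match`, and `naive_search` are hypothetical, chosen only for illustration.

```python
from typing import FrozenSet, List

# A letter is a non-empty subset of the alphabet; a regular letter is a singleton.
Letter = FrozenSet[str]


def parse(text: str) -> List[Letter]:
    """Parse a hypothetical notation such as 'a{bc}d' into [{a}, {b, c}, {d}]."""
    letters: List[Letter] = []
    i = 0
    while i < len(text):
        if text[i] == "{":
            j = text.index("}", i)  # closing brace of this indeterminate letter
            letters.append(frozenset(text[i + 1:j]))
            i = j + 1
        else:
            letters.append(frozenset(text[i]))
            i += 1
    return letters


def letters_match(x: Letter, y: Letter) -> bool:
    """Assumed rule: two letters match if their sets intersect."""
    return not x.isdisjoint(y)


def naive_search(text: List[Letter], pattern: List[Letter]) -> List[int]:
    """Return every start index where pattern matches text, checked position by position."""
    n, m = len(text), len(pattern)
    return [
        i
        for i in range(n - m + 1)
        if all(letters_match(text[i + k], pattern[k]) for k in range(m))
    ]


if __name__ == "__main__":
    print(naive_search(parse("a{bc}ab{ac}"), parse("ab")))  # [0, 2]
```

Under this rule matching is not transitive: a matches {a, b} and {a, b} matches b, yet a does not match b. That is one reason classical techniques such as KMP and BM cannot be applied to indeterminate strings without adaptation.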
Identifier | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/29309 |
Date | January 2024 |
Creators | Dehghani, Hossein |
Contributors | Mhaskar, Neerja, Computing and Software |
Source Sets | McMaster University |
Language | en_US |
Detected Language | English |
Type | Thesis |