Pattern matching, the process of finding a given pattern in a given text, is widely used in areas such as search-and-replace functions in text-processing programs and DNA sequence analysis, where the pattern may be a search term or a specific sequence of characters. Finding and analysing nucleic acid sequences in DNA data sometimes requires locating sequences that are themselves made up of several specific subsequences, where the nucleotides between those subsequences, and how many of them there are, are irrelevant. Such a pattern, also called a word chain, can be found more efficiently by pre-processing the pattern and the text. This thesis investigates and presents a data structure for matching a word chain pattern that can incrementally update this pre-computed information, so that text alterations such as split and concatenation operations are handled more time-efficiently.
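To make the notion of a word chain concrete, the following is a minimal sketch of a naive matcher in Python. It is not the data structure presented in the thesis: it does no pre-processing and rescans the text on every call. The function name `find_word_chain` and the example sequences are hypothetical, and the sketch assumes that the words must occur in order without overlapping, with gaps of any length (including zero) between them.

```python
from typing import List, Optional


def find_word_chain(text: str, words: List[str]) -> Optional[List[int]]:
    """Return the start index of each word if the chain occurs in `text`.

    Assumption (not taken from the thesis): words must appear in order and
    must not overlap; any number of characters may separate them.
    Returns None if the chain does not occur.
    """
    positions = []
    start = 0  # earliest index at which the next word may begin
    for word in words:
        idx = text.find(word, start)  # leftmost occurrence at or after `start`
        if idx == -1:
            return None
        positions.append(idx)
        start = idx + len(word)  # the next word must begin after this one ends
    return positions


# Hypothetical DNA-like example: three words separated by irrelevant nucleotides.
print(find_word_chain("ACGTTTGACCA", ["ACG", "GA", "CA"]))
# The same words in an order that the text does not contain.
print(find_word_chain("ACGT", ["GT", "AC"]))
```

Taking the leftmost occurrence of each word is enough to decide whether the chain exists, because matching a word earlier never reduces the text left for the remaining words. Every query still scans the text again, and that repeated work is what pre-processing and incremental updates after split and concatenation operations aim to avoid.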
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-226701 |
Date | January 2024 |
Creators | Nilsson, Wilmer |
Publisher | Umeå universitet, Institutionen för datavetenskap |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | UMNAD ; 1481 |