Group comparison per se is a fundamental task in many scientific endeavours but is also the basis of any classifier. Comparing groups of sequence data is a relevant task. To contrast sequence groups, we define Emerging Sequences (ESs) as subsequences that are frequent in sequences of one group and less frequent in another, and thus distinguishing sequences of different classes.
There are two challenges to distinguish sequence classes by ESs: the extraction of ESs is not trivially efficient and only exact matches of sequences are considered. In our work we address those problems by a suffix tree-based framework and a sliding window matching mechanism. A classification model based on ESs is also proposed.
Evaluating against several other learning algorithms, the experiments on two datasets show that our similar ESs-based classification model outperforms the baseline approaches. With the ESs' high discriminative power, our proposed model achieves satisfactory F-measures on classifying sequences.
Identifer | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:AEU.10048/628 |
Date | 11 1900 |
Creators | Deng, Kang |
Contributors | Osmar R. Zaiane, Computing Science, Scott Dick, Electrical and Computer Engineering, Paul Lu, Computing Science |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 365794 bytes, application/pdf |
Relation | Kang Deng, Osmar R. Zaiane, Contrasting Sequence Groups by Emerging Sequences, International Conference on Discovery Science, 2009 |
Page generated in 0.0017 seconds