Return to search

Efficient Mixed-Order Hidden Markov Model Inference

Thesis (PhD (Electrical and Electronic Engineering))--University of Stellenbosch, 2007. / Higher-order Markov models are more powerful than first-order models, but
suffer from an exponential increase in model parameters with order, which leads
to data scarcity problems during training. A more efficient approach is to use
mixed-order Markov models, which model data sequences with contexts of different
lengths.
This study proposes two algorithms for inferring mixed-order Markov chains
and hidden Markov models (HMMs), respectively. The basis of these algorithms
is the prediction suffix tree (PST), an efficient representation of a mixed-order
Markov chain.
The smallest encoded context tree (SECT) algorithm constructs PSTs from
data, based on the minimum description length principle. It has no user-specifiable
parameters to tune, and will expand the depth of the resulting PST as far as
the data set allows it, making it a self-bounded algorithm. It is also faster than
the original PST inference algorithm.
The hidden SECT algorithm replaces the underlying Markov chain of an
HMM with a prediction suffix tree, which is inferred using SECT. The algorithm
is efficient and integrates well with standard techniques.
The properties of the SECT and hidden SECT algorithms are verified on synthetic
data. The hidden SECT algorithm is also compared with a fixed-order
HMM training algorithm on an automatic language recognition task, where the
resulting mixed-order HMMs are shown to be smaller and train faster than the
fixed-order models, for similar classification accuracies.

Identiferoai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/1340
Date12 1900
CreatorsSchwardt, Ludwig
ContributorsDu Preez, J. A., University of Stellenbosch. Faculty of Engineering. Dept. of Electrical and Electronic Engineering.
PublisherStellenbosch : University of Stellenbosch
Source SetsSouth African National ETD Portal
LanguageEnglish
Detected LanguageEnglish
TypeThesis
Format2736542 bytes, application/pdf
RightsUniversity of Stellenbosch

Page generated in 0.0021 seconds