Global ETD Search

1	An acoustic model for speech recognition with an articulatory layer and non-linear articulatory-to-acoustic mapping Lo, Boon Hooi January 2004 (has links) This thesis presents an extended hidden Markov model (HMM), namely the linear/non-linear multi-level segmental hidden Markov model (linear/non-linear MSHMM). In the MSHMM framework, the relationship between symbolic and acoustic representations of a speech signal is regulated by an intermediate, articulatory-based layer. Such an approach has many potential advantages for speech pattern processing. By modelling speech dynamics directly in an articulatory domain, it may be possible to characterise the articulatory phenomena which give rise to variability in speech. The intermediate representations are based on the first three formant frequencies. The speech dynamics in the formant representation of each segment are modelled as fixed linear trajectories which characterise the distribution of formant frequencies. These trajectories are mapped into the acoustic feature space by a set of one or more non-linear mappings, hence the name linear/non-linear MSHMM. This thesis describes work developing a non-linear transformation approach using a Radial Basis Function (RBF) network for the articulatory-to-acoustic mapping. An RBF network consists of a number of hidden units and mapping weights for the linear transform component of the network. Each hidden unit is associated with a 'Gaussian-like' distribution. The thesis presents the training and optimisation processes for the parameters of the RBF network. The linear/non-linear MSHMMs, which form the basis for the thesis, are incorporated into an automatic speech recognition system. A gradient descent process is used to find the optimal parameters of the linear trajectory models during the Viterbi training process. Phone classification experiments are presented for monophone MSHMMs using the TIMIT database. 
The linear/non-linear MSHMM is compared with the linear/linear MSHMM, where both the model of dynamics and the articulatory-to-acoustic mappings are linear. The comparison results show no statistically significant difference in performance between these two models. 006.4
2	Incorporating duration information in activity recognition Chaurasia, Priyanka January 2013 (has links) Activity recognition is a key component of patient management in smart homes, where high-level activities can be learned from low-level sensor data. Different activities have different durations. In addition, different people may take different amounts of time to complete the same activity. Activity duration information can therefore be considered a potentially useful feature in assessing user health and cognitive status, and in distinguishing between different activities. The objective of this thesis is to develop methods that incorporate duration-based information in activity recognition and thus improve activity prediction performance. Activity duration information has been integrated into an existing probabilistic model and the resulting improvements in activity recognition analysed. For the purpose of computational modelling, duration data were discretised. A probabilistic learning model was built using the joint probability distribution over different activities, representing behavioural patterns of the users in performing a range of activities. Each activity was predicted based on the conditional probability of the activity given the sequence of sensor activations, the time of activation and the duration of the activity. The model demonstrated an improvement of nearly 2% in the prediction of activities when duration information was included. The derived model with enhanced recognition capability motivated the development of a duration-based decision-making framework for a potential online support tool. The aim was to combine, within such a framework, two incomplete aspects of online sensor data: incomplete activity duration and partially observed sensor activations. The two aspects, when integrated, can improve the online prediction of user activity. 
As an activity progresses, these two aspects change over time; hence the prediction of the current activity will also change accordingly. Further work related to activity durations involved exploring different clustering approaches for the purpose of discretisation of duration data related to a set of activities and automation of the discretisation process. The work also addressed issues associated with the discretisation problem when working with a dataset of limited size, where prediction of the statistical model parameters is difficult. In summary, the research presented in this thesis contributes to methodologies to enhance activity recognition systems for smart homes based on the incorporation of activity duration information. The advantage of employing activity duration data in activity recognition was also demonstrated on datasets from external smart home environments, where different activities were distinguished based on durations along with other sensor attributes. 006.4
3	A chronometric study of the scanning of visual representations Beech, John R. January 1979 (has links) No description available. 006.4
4	An integrated approach to speech recognition using phrase-based units Watkins, Christopher James January 2010 (has links) In human-to-human dialogue, formulaic sequences are used to minimise the effort of both speech production and perception in the conversation. In production, the speaker apparently retrieves such sequences whole from memory, without the cognitive effort required for generation from a lexicon and grammar. In perception, context determines a set of similar phrases that the listener expects to hear, and this also reduces cognitive load. This thesis describes techniques used to automatically acquire formulaic phrases from transcriptions of speech, which are then used to define variable-length units of speech and language. These are well suited for use in a template-based speech recogniser, which can easily adjust its modelling units for the examples that are found, with the aim of improving Automatic Speech Recognition (ASR) accuracy. Language modelling techniques are described, such as the Word Phrase Link Bigram (WPLB) language model, which combines words and phrases together, and the Hybrid Syntactic Formulaic (HSF), which clusters semantically similar phrases using syntax. The language models are then combined with speech, in both Hidden Markov Model and template-based speech recognisers. Techniques to reduce the complexity of the search space for the template-based recogniser are introduced, such as the hierarchical LDA filter. As expected, the techniques gave significant gains when the language used was highly formulaic, and were less successful on a “standard” speech database which consisted of highly artificial utterances. 006.4
5	Inductive confidence machine for pattern recognition Surkov, David January 2004 (has links) No description available. 006.4
6	On the flexibility of theoretical models for pattern recognition Riabko, Daniil January 2005 (has links) No description available. 006.4
7	Pattern classification using spread spectrum El-Helw, Amr M. January 2008 (has links) No description available. 006.4
8	Towards efficient texture classification and abnormality detection Monadjemi, Amirhassan January 2005 (has links) No description available. 006.4
9	Three dimensional visualisation and quantitative characterisation of combustion flames Bheemul, Harrish Chandr January 2005 (has links) No description available. 006.4
10	Improved shape from shading using non-Lambertian reflectance models Ragheb, Hossein January 2004 (has links) No description available. 006.4
