  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
691

Automatic speech segmentation with limited data / by D.R. van Niekerk

Van Niekerk, Daniel Rudolph January 2009 (has links)
The rapid development of corpus-based speech systems such as concatenative synthesis systems for under-resourced languages requires an efficient, consistent and accurate solution with regard to phonetic speech segmentation. Manual development of phonetically annotated corpora is a time-consuming and expensive process which suffers from challenges regarding consistency and reproducibility, while automation of this process has only been satisfactorily demonstrated on large corpora of a select few languages by employing techniques requiring extensive and specialised resources. In this work we considered the problem of phonetic segmentation in the context of developing small prototypical speech synthesis corpora for new under-resourced languages. This was done through an empirical evaluation of existing segmentation techniques on typical speech corpora in three South African languages. In this process, the performance of these techniques was characterised under different data conditions, and their efficient application was investigated in order to improve the accuracy of the resulting phonetic alignments. We found that the application of baseline speaker-specific Hidden Markov Models results in relatively robust and accurate alignments even under extremely limited data conditions, and demonstrated how such models can be developed and applied efficiently in this context. The result is segmentation of sufficient quality for synthesis applications, with alignments comparable to manual segmentation efforts. Finally, possibilities for further automated refinement of phonetic alignments were investigated and an efficient corpus development strategy was proposed, with suggestions for further work in this direction. / Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
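The core operation behind this kind of HMM-based segmentation is forced alignment: given the known phone sequence of an utterance, find the frame at which each phone starts. The sketch below illustrates the idea with a plain monotone dynamic program over per-frame, per-phone log-likelihoods; the `frame_loglik` input is a hypothetical stand-in for scores produced by speaker-specific acoustic models, not the thesis's actual toolchain.

```python
import numpy as np

def force_align(frame_loglik):
    """Monotonically align a known phone sequence to audio frames.

    frame_loglik: (T, P) array of log-likelihoods of each of T frames
    under each of the P phones of the utterance, in order (assumed
    input; in practice produced by per-phone HMM/GMM acoustic models).
    Returns the frame index at which each phone starts.
    """
    T, P = frame_loglik.shape
    cost = np.full((T, P), -np.inf)
    entered = np.zeros((T, P), dtype=bool)   # True if phone p starts at frame t
    cost[0, 0] = frame_loglik[0, 0]
    for t in range(1, T):
        for p in range(P):
            stay = cost[t - 1, p]
            advance = cost[t - 1, p - 1] if p > 0 else -np.inf
            entered[t, p] = advance > stay
            cost[t, p] = max(stay, advance) + frame_loglik[t, p]
    starts = [0] * P                          # trace phone boundaries backwards
    p = P - 1
    for t in range(T - 1, 0, -1):
        if entered[t, p]:
            starts[p] = t
            p -= 1
    return starts
```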
692

Optimal Control and Estimation of Stochastic Systems with Costly Partial Information

Kim, Michael J. 31 August 2012 (has links)
Stochastic control problems that arise in sequential decision-making applications typically assume that the information used for decision-making is obtained according to a predetermined sampling schedule. In many real applications, however, there is a high sampling cost associated with collecting such data. Determining when information should be collected is therefore as important as deciding how that information should be utilized for optimal decision-making. This type of joint optimization has been a long-standing problem in the operations research literature, and very few results regarding the structure of the optimal sampling and control policy have been published. In this thesis, the joint optimization of sampling and control is studied in the context of maintenance optimization. New theoretical results characterizing the structure of the optimal policy are established, which have practical interpretation and give new insight into the value of condition-based maintenance programs in life-cycle asset management. Applications in other areas such as healthcare decision-making and statistical process control are discussed. Statistical parameter estimation results are also developed with illustrative real-world numerical examples.
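To make the joint sampling-and-control idea concrete, here is a toy belief-state MDP in the maintenance setting: a machine degrades stochastically, its condition is hidden, and each period the controller chooses to run, pay for an inspection, or replace. The model and all its parameters are illustrative assumptions for this sketch, not the thesis's formulation.

```python
import numpy as np

# Hidden machine state: good (0) or bad (1); degrades w.p. q each period.
q, c_bad, c_sample, c_replace, beta = 0.10, 4.0, 0.5, 10.0, 0.95

belief = np.linspace(0.0, 1.0, 201)        # discretized P(machine is bad)
V = np.zeros_like(belief)

def drift(b):                               # belief after one running period
    return b + (1.0 - b) * q

for _ in range(500):                        # value iteration to convergence
    EV = lambda b: np.interp(b, belief, V)  # interpolate V on the belief grid
    run = belief * c_bad + beta * EV(drift(belief))
    # Paying the sampling cost reveals the state: belief jumps to 1 or 0.
    sample = c_sample + belief * c_bad + beta * (
        belief * EV(drift(1.0)) + (1.0 - belief) * EV(drift(0.0)))
    replace = np.full_like(belief, c_replace + beta * EV(drift(0.0)))
    V = np.minimum(run, np.minimum(sample, replace))

policy = np.argmin(np.stack([run, sample, replace]), axis=0)
# policy[i]: 0 = run, 1 = sample, 2 = replace, at belief grid point i.
```

Even in this toy model the optimal policy partitions the belief axis into run / sample / replace regions, which is the kind of structural result the thesis establishes rigorously.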
694

Vision-Based Observation Models for Lower Limb 3D Tracking with a Moving Platform

Hu, Richard Zhi Ling January 2011 (has links)
Tracking and understanding human gait is an important step towards improving elderly mobility and safety. This thesis presents a vision-based tracking system that estimates the 3D pose of a wheeled walker user's lower limbs with cameras mounted on the moving walker. The tracker estimates 3D poses from images of the lower limbs in the coronal plane in a dynamic, uncontrolled environment. It employs a probabilistic approach based on particle filtering with three different camera setups: a monocular RGB camera, binocular RGB cameras, and a depth camera. For the RGB cameras, observation likelihoods are designed to compare the colors and gradients of each frame with manually extracted initial templates. Two strategies are also investigated for handling appearance changes of the tracking target: increasing the number of templates and using different color representations. For the depth camera, two observation likelihoods are developed: the first works directly in 3D space, while the second works in the projected image space. Experiments are conducted to evaluate the performance of the tracking system with different users for all three camera setups. The trackers with the RGB cameras produce results with higher error than the depth camera, and the strategies for handling appearance change generally improve tracking accuracy. The tracker with the depth sensor, on the other hand, successfully tracks the 3D poses of users over entire video sequences and is robust against unfavorable conditions such as partial occlusion, missing observations, and deformable tracking targets.
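A generic predict-weight-resample particle filter step of the kind used for such 3D pose tracking might look like the following sketch; the observation model `loglik` is an assumed callback standing in for the color/gradient (RGB) or 3D/projected-space (depth) likelihoods described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, observation, loglik, motion_std=0.05):
    """One predict-weight-resample step of a generic particle filter.

    particles: (N, D) array of pose hypotheses; loglik(pose, observation)
    is an assumed observation model standing in for template matching
    (RGB cameras) or point-to-model distances (depth camera).
    """
    N = len(particles)
    # Predict: diffuse hypotheses with a simple random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Weight: score each hypothesis against the current frame.
    logw = np.array([loglik(p, observation) for p in particles])
    logw -= logw.max()                       # numerical stability
    weights = np.exp(logw)
    weights /= weights.sum()
    estimate = np.average(particles, axis=0, weights=weights)
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < N / 2:
        particles = particles[rng.choice(N, size=N, p=weights)]
    return particles, estimate
```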
695

Bayesian models and algorithms for protein secondary structure and beta-sheet prediction

Aydin, Zafer 17 September 2008 (has links)
In this thesis, we developed Bayesian models and machine learning algorithms for the protein secondary structure and beta-sheet prediction problems. In protein secondary structure prediction, we developed hidden semi-Markov models, N-best algorithms and training set reduction procedures for proteins in the single-sequence category. We introduced three residue dependency models (both probabilistic and heuristic) incorporating the statistically significant amino acid correlation patterns at structural segment borders. We allowed dependencies on positions outside the segments to relax the condition of segment independence. Another novelty of the models is the dependency on downstream positions, which is important due to asymmetric correlation patterns observed uniformly in structural segments. Among the dataset reduction methods, we showed that composition-based reduction generated the most accurate results. To incorporate the non-local interactions characteristic of beta-sheets, we developed two N-best algorithms and a Bayesian beta-sheet model. In beta-sheet prediction, we developed a Bayesian model to characterize the conformational organization of beta-sheets and efficient algorithms to compute the optimum architecture, which includes beta-strand pairings, interaction types (parallel or anti-parallel) and residue-residue interactions (contact maps). We introduced a Bayesian model for proteins with six or fewer beta-strands, in which we model the conformational features in a probabilistic framework by combining the amino acid pairing potentials with a priori knowledge of beta-strand organizations. To select the optimum beta-sheet architecture, we analyzed the space of possible conformations with efficient heuristics, in which we significantly reduce the search space by enforcing the amino acid pairs that have strong interaction potentials. For proteins with more than six beta-strands, we first computed beta-strand pairings using the BetaPro method. Then, we computed gapped alignments of the paired beta-strands in parallel and anti-parallel directions and chose the interaction types and beta-residue pairings with maximum alignment scores. Accurate prediction of secondary structure, beta-sheets and non-local contacts should improve the accuracy and quality of three-dimensional structure prediction.
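The segment-based decoding underlying hidden semi-Markov models can be illustrated with a small explicit-duration Viterbi recursion. In the sketch below, `emis`, `dur_logp` and `trans_logp` are assumed inputs, and this single-sequence toy omits the thesis's border-dependency models.

```python
import numpy as np

def semi_markov_viterbi(emis, dur_logp, trans_logp, max_dur):
    """Best segmentation under an explicit-duration (semi-Markov) model.

    emis: (T, S) per-position log-likelihoods per structural state;
    dur_logp: (S, max_dur) segment-duration log-probabilities;
    trans_logp: (S, S) segment-to-segment transition log-probabilities.
    Returns a list of (state, start, end) segments.
    """
    T, S = emis.shape
    cum = np.vstack([np.zeros(S), np.cumsum(emis, axis=0)])  # prefix sums
    best = np.full((T + 1, S), -np.inf)
    back = {}
    best[0, :] = 0.0
    for t in range(1, T + 1):
        for s in range(S):
            for d in range(1, min(max_dur, t) + 1):
                seg = cum[t, s] - cum[t - d, s] + dur_logp[s, d - 1]
                if t - d == 0:
                    cand, prev = seg, None
                else:
                    j = int(np.argmax(best[t - d, :] + trans_logp[:, s]))
                    cand, prev = best[t - d, j] + trans_logp[j, s] + seg, j
                if cand > best[t, s]:
                    best[t, s] = cand
                    back[(t, s)] = (t - d, prev)
    t, s = T, int(np.argmax(best[T, :]))      # trace back the segments
    segments = []
    while t > 0:
        t0, prev = back[(t, s)]
        segments.append((s, t0, t))
        if prev is None:
            break
        t, s = t0, prev
    return segments[::-1]
```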
696

An ensemble speaker and speaking environment modeling approach to robust speech recognition

Tsao, Yu 18 November 2008 (has links)
In this study, an ensemble speaker and speaking environment modeling (ESSEM) approach is proposed to characterize environments in order to enhance the performance robustness of automatic speech recognition (ASR) systems under adverse conditions. The ESSEM process comprises two stages, the offline and online phases. In the offline phase, we prepare an ensemble speaker and speaking environment space formed by a collection of super-vectors. Each super-vector consists of the entire set of means from all the Gaussian mixture components of a set of hidden Markov models that characterizes a particular environment. In the online phase, with the ensemble environment space prepared in the offline phase, we estimate the super-vector for a new testing environment based on a stochastic matching criterion. A series of techniques is proposed to further improve the original ESSEM approach in both the offline and online phases. For the offline phase, we focus on methods to enhance the construction and coverage of the environment space. We first present environment clustering and environment partitioning algorithms to structure the environment space well; then, we propose a discriminative training algorithm to enhance discrimination across environment super-vectors and thereby broaden the coverage of the ensemble environment space. For the online phase, we study methods to increase the efficiency and precision of estimating the target super-vector for the testing condition. To enhance efficiency, we incorporate dimensionality reduction techniques to reduce the complexity of the original environment space. To improve precision, we first study different forms of mapping function and propose a weighted N-best information technique; then, we propose cohort selection, environment space adaptation and multiple cluster matching algorithms to facilitate the environment characterization. We evaluate the proposed ESSEM framework on the Aurora-2 connected digit recognition task. Experimental results verify that the original ESSEM approach already provides a clear improvement over a baseline system without environment compensation. Moreover, the performance of ESSEM can be further enhanced by the proposed offline and online algorithms. A significant 16.08% word error rate reduction is achieved by ESSEM with the optimal offline and online configuration over our best baseline system on the Aurora-2 task.
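A heavily simplified sketch of the online idea, with a single diagonal Gaussian standing in for each environment's full model: score every ensemble super-vector on the test utterance and blend them by likelihood weights. The blending rule here is an illustrative assumption; the thesis estimates the combination under a stochastic matching criterion with richer mapping functions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def blend_supervectors(supervectors, frames, var):
    """Blend ensemble environment super-vectors by test-data likelihood.

    supervectors: (E, D) array, one row per offline environment (here a
    single diagonal-Gaussian mean per environment, for brevity);
    frames: (T, D) test-utterance features; var: (D,) diagonal variance.
    """
    loglik = np.array([
        multivariate_normal.logpdf(frames, mean=sv, cov=np.diag(var)).sum()
        for sv in supervectors])
    w = np.exp(loglik - loglik.max())        # normalized likelihood weights
    w /= w.sum()
    return w @ supervectors                  # estimated test-environment super-vector
```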
697

Unsupervised and semi-supervised training methods for eukaryotic gene prediction

Ter-Hovhannisyan, Vardges 17 November 2008 (has links)
This thesis describes new gene finding methods for eukaryotic gene prediction. The current methods for deriving model parameters for gene prediction algorithms are based on curated or experimentally validated sets of genes or gene elements. These training sets often require time and additional expert effort, especially for species that are in the initial stages of genome sequencing. Unsupervised training allows determination of model parameters from anonymous genomic sequence. The practical applicability of unsupervised training is critical given the ever-growing rate of eukaryotic genome sequencing. Three distinct training procedures are developed for diverse groups of eukaryotic species. GeneMark-ES is developed for species with strong donor and acceptor site signals such as Arabidopsis thaliana, Caenorhabditis elegans and Drosophila melanogaster. The second version of the algorithm, GeneMark-ES-2, introduces an enhanced intron model to better describe the gene structure of fungal species, which possess relatively weak donor and acceptor splice sites and a well-conserved branch point signal. GeneMark-LE, a semi-supervised training approach, is designed for eukaryotic species with a small number of introns. The results indicate that the developed unsupervised training methods perform well compared to other training methods, as estimated on the set of genes supported by EST-to-genome alignments. Analysis of novel genomes reveals interesting biological findings and shows that several candidates for under-annotation and over-annotation are present in the current set of annotated fungal genomes.
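A minimal illustration of unsupervised parameter estimation from unlabelled sequence is Viterbi training on a small discrete HMM: decode with the current parameters, re-estimate from the decoded path, and iterate. This toy stands in for, and greatly simplifies, the iterative self-training used by gene finders such as GeneMark-ES.

```python
import numpy as np

def viterbi(logA, logB, seq):
    """Most likely state path for symbol sequence `seq` (uniform start)."""
    T, S = len(seq), logA.shape[0]
    d = np.full((T, S), -np.inf)
    ptr = np.zeros((T, S), dtype=int)
    d[0] = logB[:, seq[0]] - np.log(S)
    for t in range(1, T):
        step = d[t - 1][:, None] + logA       # (from_state, to_state) scores
        ptr[t] = step.argmax(axis=0)
        d[t] = step.max(axis=0) + logB[:, seq[t]]
    path = [int(d[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(ptr[t, path[-1]]))
    return path[::-1]

def viterbi_train(seq, n_states=2, n_sym=4, iters=20, seed=0):
    """Unsupervised estimation of HMM parameters from an unlabelled symbol
    sequence -- a toy stand-in for training on an anonymous genome."""
    rng = np.random.default_rng(seed)
    A = rng.dirichlet(np.ones(n_states), n_states)   # transitions
    B = rng.dirichlet(np.ones(n_sym), n_states)      # emissions
    for _ in range(iters):
        path = viterbi(np.log(A), np.log(B), seq)
        A = np.full((n_states, n_states), 1e-3)      # pseudo-counts
        B = np.full((n_states, n_sym), 1e-3)
        for t in range(1, len(seq)):
            A[path[t - 1], path[t]] += 1
        for t, sym in enumerate(seq):
            B[path[t], sym] += 1
        A /= A.sum(axis=1, keepdims=True)
        B /= B.sum(axis=1, keepdims=True)
    return A, B
```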
698

Two Optimization Problems in Genetics : Multi-dimensional QTL Analysis and Haplotype Inference

Nettelblad, Carl January 2012 (has links)
The advent of new technologies, implemented in efficient platforms and workflows, has made massive genotyping available to all fields of biology and medicine. Genetic analyses are no longer dominated by experimental work in laboratories, but rather by the interpretation of the resulting data. When billions of data points representing thousands of individuals are available, efficient computational tools are required. The focus of this thesis is on developing models, methods and implementations for such tools. The first theme of the thesis is multi-dimensional scans for quantitative trait loci (QTL) in experimental crosses. By mating individuals from different lines, it is possible to gather data that can be used to pinpoint the genetic variation that influences specific traits to specific genome loci. However, it is natural to expect multiple genes influencing a single trait to interact. The thesis discusses model structure and model selection, giving new insight into the conditions under which orthogonal models can be devised. The thesis also presents a new optimization method for efficiently and accurately locating QTL, and for performing the permuted-data searches needed for significance testing. This method has been implemented in a software package that can seamlessly perform the searches on grid computing infrastructures. The other theme in the thesis is the development of adapted optimization schemes for using hidden Markov models in tracing allele inheritance pathways, and specifically inferring haplotypes. The advances presented form the basis for more accurate and unbiased line origin probabilities in experimental crosses, especially multi-generational ones. We show that the new tools are able to reconstruct haplotypes and even genotypes in founder individuals and offspring alike, based on only unordered offspring genotypes. The tools can also handle larger populations than competing methods, resolving inheritance pathways and phase in much larger and more complex populations. Finally, the methods presented are also applicable to datasets where individual relationships are not known, which is frequently the case in human genetics studies. One immediate application for this would be improved accuracy for imputation of SNP markers within genome-wide association studies (GWAS). / eSSENCE
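The simplest form of a multi-dimensional QTL scan is an exhaustive search over locus pairs with permutation-based significance testing, as in the generic sketch below; ordinary least squares stands in for the thesis's model family and optimizer, and all names are illustrative.

```python
import numpy as np

def two_locus_scan(genotypes, trait, n_perm=200, seed=0):
    """Exhaustive two-dimensional QTL scan with permutation testing.

    genotypes: (n, L) array of marker genotypes coded 0/1/2;
    trait: (n,) phenotype vector. For every locus pair we fit
    trait ~ locus_i + locus_j + interaction by least squares and keep the
    smallest residual sum of squares; significance of the best pair is
    assessed by rescanning permuted traits (the costly step the thesis's
    optimization method accelerates).
    """
    rng = np.random.default_rng(seed)
    n, L = genotypes.shape

    def best_rss(y):
        best = np.inf
        for i in range(L):
            for j in range(i + 1, L):
                X = np.column_stack([np.ones(n),
                                     genotypes[:, i], genotypes[:, j],
                                     genotypes[:, i] * genotypes[:, j]])
                coef, *_ = np.linalg.lstsq(X, y, rcond=None)
                best = min(best, float(np.sum((y - X @ coef) ** 2)))
        return best

    observed = best_rss(trait)
    null = np.array([best_rss(rng.permutation(trait)) for _ in range(n_perm)])
    p_value = float(np.mean(null <= observed))   # lower RSS = stronger fit
    return observed, p_value
```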
699

Human Action Recognition In Video Data For Surveillance Applications

Gurrapu, Chaitanya January 2004 (has links)
Detecting human actions using a camera has many possible applications in the security industry. When a human performs an action, his/her body goes through a signature sequence of poses. To detect these pose changes and hence the activities performed, a pattern recogniser needs to be built into the video system. Due to the temporal nature of the patterns, Hidden Markov Models (HMM), used extensively in speech recognition, were investigated. Initially a gesture recognition system was built using novel features. These features were obtained by approximating the contour of the foreground object with a polygon and extracting the polygon's vertices. A Gaussian Mixture Model (GMM) was fit to the vertices obtained from a few frames and the parameters of the GMM itself were used as features for the HMM. A more practical activity detection system using a more sophisticated foreground segmentation algorithm immune to varying lighting conditions and permanent changes to the foreground was then built. The foreground segmentation algorithm models each of the pixel values using clusters and continually uses incoming pixels to update the cluster parameters. Cast shadows were identified and removed by assuming that shadow regions were less likely to produce strong edges in the image than real objects and that this likelihood further decreases after colour segmentation. Colour segmentation itself was performed by clustering together pixel values in the feature space using a gradient ascent algorithm called mean shift. More robust features in the form of mesh features were also obtained by dividing the bounding box of the binarised object into grid elements and calculating the ratio of foreground to background pixels in each of the grid elements. These features were vector quantized to reduce their dimensionality and the resulting symbols presented as features to the HMM to achieve a recognition rate of 62% for an event involving a person writing on a white board. The recognition rate increased to 80% for the "seen" person sequences, i.e. the sequences of the person used to train the models. With a fixed lighting position, the lack of a shadow removal subsystem improved the detection rate. This is because of the consistent profile of the shadows in both the training and testing sequences due to the fixed lighting positions. Even with a lower recognition rate, the shadow removal subsystem was considered an indispensable part of a practical, generic surveillance system.
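The mesh features described above are straightforward to compute from a binarised silhouette; a minimal sketch, with grid size and details assumed:

```python
import numpy as np

def mesh_features(mask, grid=(4, 4)):
    """Mesh features from a binarised foreground mask: crop the bounding
    box, split it into grid cells, and return each cell's ratio of
    foreground pixels as a feature vector (to be vector quantized into
    HMM observation symbols, as the abstract describes).
    """
    ys, xs = np.nonzero(mask)
    box = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1].astype(float)
    gy, gx = grid
    h, w = box.shape
    feats = np.empty(gy * gx)
    for r in range(gy):
        for c in range(gx):
            cell = box[r * h // gy:(r + 1) * h // gy,
                       c * w // gx:(c + 1) * w // gx]
            feats[r * gx + c] = cell.mean() if cell.size else 0.0
    return feats
```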
700

Statistical signal processing in sensor networks with applications to fault detection in helicopter transmissions

Galati, F. Antonio Unknown Date (has links) (PDF)
In this thesis, two different problems in distributed sensor networks are considered. Part I involves optimal quantiser design for decentralised estimation of a two-state hidden Markov model with dual sensors. The notion of optimality for quantiser design is based on minimising the probability of error in estimating the hidden Markov state. Equations for the filter error are derived for the continuous (unquantised) sensor outputs (signals), which are used to benchmark the performance of the quantisers. Minimising the probability of filter error to obtain the quantiser breakpoints is a difficult problem; therefore, an alternative method is employed. The quantiser breakpoints are obtained by maximising the mutual information between the quantised signals and the hidden Markov state. This method is known to work well for the single-sensor case. Cases with independent and correlated noise across the signals are considered. The method is then applied to Markov processes with Gaussian signal noise, and further investigated through simulation studies. Simulations involving both independent and correlated noise across the sensors are performed, and a number of interesting new theoretical results are obtained, particularly in the case of correlated noise. In Part II, the focus shifts to the detection of faults in helicopter transmission systems. The aim of the investigation is to determine whether the acoustic signature can be used for fault detection and diagnosis. To investigate this, statistical change detection algorithms are applied to acoustic vibration data obtained from the main rotor gearbox of a Bell 206 helicopter, which is run at high load under test conditions.
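The breakpoint-selection criterion from Part I can be illustrated directly for a single sensor: grid-search the 1-bit quantiser threshold that maximises the mutual information between the quantised output and the hidden state, given Gaussian signal noise. The parameters and grid-search approach below are illustrative, not the thesis's derivation.

```python
import numpy as np
from scipy.stats import norm

def best_breakpoint(p_state, mu, sigma, thresholds):
    """1-bit quantiser breakpoint maximising I(quantised signal; state).

    p_state: (2,) stationary probabilities of the two hidden states;
    mu, sigma: (2,) per-state Gaussian signal means and noise std devs;
    thresholds: candidate breakpoints to search over.
    """
    best_t, best_mi = None, -np.inf
    for t in thresholds:
        # p(quantised output = 1 | state) under each state's Gaussian
        p1_given_s = 1.0 - norm.cdf(t, loc=mu, scale=sigma)
        p1 = float(np.dot(p_state, p1_given_s))        # marginal p(output = 1)
        mi = 0.0
        for s in (0, 1):
            for pq_s, pq in ((p1_given_s[s], p1),
                             (1.0 - p1_given_s[s], 1.0 - p1)):
                if pq_s > 0.0:
                    mi += p_state[s] * pq_s * np.log2(pq_s / pq)
        if mi > best_mi:
            best_t, best_mi = t, mi
    return best_t, best_mi

# Example: two well-separated signal levels in unit-variance noise.
t, mi = best_breakpoint(np.array([0.5, 0.5]), np.array([0.0, 2.0]),
                        np.array([1.0, 1.0]), np.linspace(-2, 4, 301))
```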
