271 |
An Analog Architecture for Auditory Feature Extraction and Recognition
Smith, Paul Devon 22 November 2004 (has links)
Speech recognition systems have been implemented using a wide range of signal processing techniques, including neuromorphic/biologically inspired approaches and digital signal processing (DSP) techniques. Neuromorphic and biologically inspired techniques, such as silicon cochlea models, are based on fairly simple yet highly parallel computational units, while DSP is based on block transforms and statistical or error-minimization methods.
Essential to each of these techniques is the first stage of extracting meaningful information from the speech signal, known as feature extraction. This can be done using biologically inspired techniques such as silicon cochlea models, or techniques that begin with a model of speech production and then try to separate the vocal tract response from an excitation signal. Even within each of these approaches there are multiple techniques, including cepstrum filtering, which falls under the class of homomorphic signal processing, and FFT-based predictive approaches. The underlying reality is that multiple techniques have attacked the speech recognition problem, yet the problem is still far from solved. The techniques shown to have the best recognition rates use cepstrum coefficients for feature extraction and hidden Markov models (HMMs) for pattern recognition.
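As a concrete illustration of the cepstral feature extraction just described, the following Python sketch computes a real cepstrum for one speech frame (window, FFT, log magnitude, inverse FFT); the frame length, window choice and number of retained coefficients are illustrative assumptions, not parameters taken from this thesis.

```python
import numpy as np

def real_cepstrum(frame, n_coeffs=13):
    """Real cepstrum of one speech frame: window -> FFT -> log|.| -> IFFT.
    Low-order coefficients approximate the vocal-tract (spectral envelope)
    contribution; higher-order ones carry excitation information."""
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed)
    log_mag = np.log(np.abs(spectrum) + 1e-10)   # avoid log(0)
    cepstrum = np.fft.irfft(log_mag)
    return cepstrum[:n_coeffs]                   # keep the low-"quefrency" terms

# Toy usage: a 25 ms frame at 16 kHz (400 samples) of synthetic data.
frame = np.random.randn(400)
features = real_cepstrum(frame)
```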
The presented research develops an analog system based on programmable analog array technology that can perform the initial stages of auditory feature extraction and recognition before passing information to a digital signal processor. The goal is a low-power system that can be fully contained on one or more integrated circuit chips. Results show that it is possible to realize advanced filtering techniques such as cepstrum filtering and vector quantization in analog circuitry. Previous applications of analog signal processing have focused on vision, cochlea models, anti-aliasing filters and other single-component uses, and classic designs have relied heavily on op-amps as the core building block. This research also presents a novel design for a hidden Markov model (HMM) decoder that exploits the inherent properties of subthreshold transistors and floating-gate technology to create low-power computational blocks.
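For reference, the decoding computation that an HMM decoder performs is, in digital form, the Viterbi algorithm; a minimal log-domain sketch is given below as a point of comparison with the analog blocks described above (model sizes and probabilities are arbitrary placeholders, not the thesis design).

```python
import numpy as np

def viterbi(log_A, log_B, log_pi, obs):
    """Most likely state sequence of a discrete HMM, computed in the log domain.
    log_A[i, j]: log transition prob i -> j; log_B[i, o]: log emission prob;
    log_pi[i]: log initial prob; obs: sequence of observation indices."""
    n_states, T = log_A.shape[0], len(obs)
    delta = np.full((T, n_states), -np.inf)
    psi = np.zeros((T, n_states), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # score of every i -> j move
        psi[t] = np.argmax(scores, axis=0)              # best predecessor per state
        delta[t] = scores[psi[t], np.arange(n_states)] + log_B[:, obs[t]]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):                       # backtrack
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Toy usage: 2 states, 3 observation symbols (placeholder probabilities).
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
states = viterbi(np.log(A), np.log(B), np.log(pi), [0, 1, 2, 2])
```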
|
272 |
Turkish Large Vocabulary Continuous Speech Recognition By Using Limited Audio Corpus
Susman, Derya 01 March 2012 (has links) (PDF)
Speech recognition in the Turkish language is a challenging problem in several respects, most of which are related to the morphological structure of the language. Since Turkish is an agglutinative language, many words can be generated from a single stem by adding suffixes. This characteristic increases the number of out-of-vocabulary (OOV) words, which degrades the performance of a speech recognizer dramatically. Turkish also allows words to be ordered freely, which makes it difficult to build robust language models.
In this thesis, the existing models and approaches which address the problem of Turkish LVCSR (Large Vocabulary Continuous Speech Recognition) are explored. Different recognition units (words, morphs, stems and endings) are used in generating the n-gram language models; 3-gram and 4-gram language models are built for each recognition unit.
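To make the language-modeling step concrete, the sketch below estimates a simple 3-gram model with add-one smoothing over morph sequences; the example tokens and the smoothing choice are illustrative assumptions rather than the models actually trained in the thesis.

```python
from collections import Counter

def train_trigram(sequences):
    """Count trigrams and their bigram contexts over token (e.g. morph) sequences."""
    tri, bi = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>", "<s>"] + seq + ["</s>"]
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi

def trigram_prob(tri, bi, vocab_size, w1, w2, w3):
    """P(w3 | w1, w2) with add-one (Laplace) smoothing."""
    return (tri[(w1, w2, w3)] + 1) / (bi[(w1, w2)] + vocab_size)

# Toy morph-segmented corpus (hypothetical tokens).
corpus = [["ev", "+ler", "+de"], ["ev", "+de"], ["okul", "+lar", "+da"]]
tri, bi = train_trigram(corpus)
vocab = {m for seq in corpus for m in seq} | {"</s>"}
p = trigram_prob(tri, bi, len(vocab), "ev", "+ler", "+de")
```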
Since speech recognition relies on machine learning, the performance of the recognizer depends on the sufficiency of the audio data used in acoustic model training. However, it is difficult to obtain rich audio corpora for the Turkish language. In this thesis, existing approaches are applied to the problem of Turkish LVCSR using a limited audio corpus. We also propose several data selection approaches to improve the robustness of the acoustic model.
|
273 |
Discovering Discussion Activity Flows in an On-line Forum Using Data Mining Techniques
Hsieh, Lu-shih 22 July 2008 (has links)
In the Internet era, more and more courses are taught through a course management system (CMS) or learning management system (LMS). In an asynchronous virtual learning environment, an instructor needs to be aware of the progress of discussions in forums and may intervene if necessary in order to facilitate students' learning. This research proposes a discussion forum activity flow tracking system, called FAFT (Forum Activity Flow Tracer), to automatically monitor the discussion activity flow of threaded forum postings in a CMS/LMS. As CMS/LMS platforms become popular for facilitating learning activities, the proposed FAFT can help instructors identify students' interaction types in discussion forums.
FAFT adopts modern data/text mining techniques to discover the patterns of forum discussion activity flows, which instructors can use to facilitate online learning activities. FAFT consists of two subsystems: activity classification (AC) and activity flow discovery (AFD). A posting can be perceived as a type of announcement, questioning, clarification, interpretation, conflict, or assertion. AC adopts a cascade model to classify the various activity types of posts in a discussion thread. The empirical evaluation of the classified types from a repository of postings in earth science online courses in a senior high school shows that AC can effectively facilitate the coding process, and that the cascade model can deal with the imbalanced distribution of discussion postings.
AFD adopts a hidden Markov model (HMM) to discover the activity flows. A discussion activity flow can be presented as an HMM diagram that an instructor can use to predict which discussion activity flow type a discussion thread may follow. The empirical results of the HMM from an online forum in an earth science subject in a senior high school show that FAFT can effectively predict the type of a discussion activity flow. Thus, the proposed FAFT can be embedded in a course management system to automatically predict the activity flow type of a discussion thread and, in turn, reduce teachers' load in managing online discussion forums.
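As a rough illustration of the flow-discovery idea, the sketch below estimates a Markov transition matrix over the labeled activity types from coded threads; the activity sequences are hypothetical, and a full HMM as used in AFD would add hidden states and emission probabilities on top of this.

```python
import numpy as np

ACTIVITIES = ["announcement", "questioning", "clarification",
              "interpretation", "conflict", "assertion"]
IDX = {a: i for i, a in enumerate(ACTIVITIES)}

def transition_matrix(threads, smoothing=1.0):
    """Maximum-likelihood transition probabilities (with Laplace smoothing)
    between consecutive activity types in coded discussion threads."""
    n = len(ACTIVITIES)
    counts = np.full((n, n), smoothing)
    for thread in threads:
        for a, b in zip(thread, thread[1:]):
            counts[IDX[a], IDX[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Hypothetical coded threads.
threads = [["questioning", "clarification", "assertion"],
           ["announcement", "questioning", "interpretation", "assertion"]]
P = transition_matrix(threads)
next_dist = P[IDX["questioning"]]   # what tends to follow a question
```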
|
274 |
Lietuvių kalbos atpažinimas, panaudojant Julius programinę įrangą / Speech Recognition of Lithuanian Using Julius Software
Braubartas, Ernestas 29 September 2008 (has links)
The topic of this master's thesis in signal technology is relevant because conventional means of entering information are no longer sufficient: searching for and processing information, and controlling complex devices and programs, would be far more convenient if computers and various devices understood human speech. Similar systems have been developed around the world for many years, but Lithuanian speech recognition systems are still at an early stage of development. The thesis examines the recognition of isolated Lithuanian words by dividing them into categories and using hidden Markov models. The aim of the research is to investigate how categorizing Lithuanian words affects recognition accuracy; the recognition of single words and of word sequences is also studied. The acoustic model was built with the HTK toolkit, which is based on hidden Markov models, and the word categories were described in Backus-Naur form. Experiments were carried out with the Julius software tools and its category-based recognizer, Julian. The best results were obtained when recognizing single words grouped into categories, with recognition accuracy reaching 91%. When recognizing word sequences not grouped into categories, accuracy reached only 51%. Recognition accuracy for Microsoft Office Word 2003 menu-control commands reached 82%.
|
275 |
Structural priors for multiobject semi-automatic segmentation of three-dimensional medical images via clustering and graph cut algorithms
Kéchichian, Razmig 02 July 2013 (links) (PDF)
We develop a generic graph cut-based semi-automatic multiobject image segmentation method, principally for use in routine medical applications ranging from tasks involving few objects in 2D images to fairly complex near whole-body 3D image segmentation. The flexible formulation of the method allows its straightforward adaptation to a given application. In particular, the graph-based vicinity prior model we propose, defined as shortest-path pairwise constraints on the object adjacency graph, can be easily reformulated to account for the spatial relationships between objects in a given problem instance. The segmentation algorithm can be tailored to the runtime requirements of the application and the online storage capacities of the computing platform by an efficient and controllable Voronoi tessellation clustering of the input image, which achieves a good balance between cluster compactness and boundary adherence criteria. Comprehensive qualitative and quantitative evaluation and comparison with the standard Potts model confirm that the vicinity prior model brings significant improvements in the correct segmentation of distinct objects of identical intensity, the accurate placement of object boundaries and the robustness of segmentation with respect to clustering resolution. Comparative evaluation of the clustering method against competing ones confirms its benefits in terms of runtime and quality of the produced partitions. Importantly, compared to voxel-level segmentation, the clustering step improves both the overall runtime and the memory footprint of the segmentation process by up to an order of magnitude, virtually without compromising segmentation quality.
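For context, a standard multilabel Potts graph cut energy has the form below; the vicinity prior described above can be thought of as replacing the uniform pairwise penalty with a label-pair-dependent cost derived from shortest-path distances on the object adjacency graph. The second formula is a hedged paraphrase of that idea, not the exact weighting used in the thesis.

```latex
% Standard Potts energy over image elements p and neighboring pairs (p,q):
\[
E(x) \;=\; \sum_{p} D_p(x_p) \;+\; \lambda \sum_{(p,q)\in\mathcal{N}} \mathbf{1}[\,x_p \neq x_q\,].
\]
% A vicinity-style prior generalizes the pairwise term to a cost V(x_p, x_q)
% that grows with the distance between labels on the object adjacency graph:
\[
E(x) \;=\; \sum_{p} D_p(x_p) \;+\; \lambda \sum_{(p,q)\in\mathcal{N}} V(x_p, x_q),
\qquad V(a,b) \propto d_{\mathrm{adj}}(a,b).
\]
```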
|
276 |
A Markovian approach to distributional semantics
Grave, Edouard 20 January 2014 (links) (PDF)
This thesis, which is organized in two independent parts, presents work on distributional semantics and on variable selection. In the first part, we introduce a new method for learning good word representations using large quantities of unlabeled sentences. The method is based on a probabilistic model of sentences that combines a hidden Markov model with a syntactic dependency tree. The latent variables, which correspond to the nodes of the dependency tree, aim at capturing the meanings of the words. We develop an efficient algorithm to perform inference and learning in those models, based on online EM and approximate message passing. We then evaluate our models on intrinsic tasks such as predicting human similarity judgements or word categorization, and on two extrinsic tasks: named entity recognition and supersense tagging. In the second part, we introduce, in the context of linear models, a new penalty function to perform variable selection in the case of highly correlated predictors. This penalty, called the trace Lasso, uses the trace norm of the selected predictors, which is a convex surrogate of their rank, as the criterion of model complexity. The trace Lasso interpolates between the $\ell_1$-norm and $\ell_2$-norm. In particular, it is equal to the $\ell_1$-norm if all predictors are orthogonal and to the $\ell_2$-norm if all predictors are equal. We propose two algorithms to compute the solution of least-squares regression regularized by the trace Lasso, and perform experiments on synthetic datasets to illustrate the behavior of the trace Lasso.
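For readers unfamiliar with the penalty, the trace Lasso can be written as follows, assuming the design matrix X has unit-norm columns; this restates the definition and the interpolation property mentioned above for convenience.

```latex
% Trace Lasso penalty for a weight vector w with design matrix X (unit-norm columns):
\[
\Omega(w) \;=\; \left\| X \,\mathrm{Diag}(w) \right\|_{*},
\]
% where ||.||_* is the trace (nuclear) norm. If the columns of X are orthonormal,
% X Diag(w) has singular values |w_i|, so Omega(w) = ||w||_1; if all columns are
% identical, X Diag(w) has rank one and Omega(w) = ||w||_2.
```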
|
277 |
Speech Enhancement Using Nonnegative Matrix Factorization and Hidden Markov Models
Mohammadiha, Nasser January 2013 (links)
Reducing interference noise in a noisy speech recording has been a challenging task for many years, and it has a variety of applications, for example in hands-free mobile communications, speech recognition, and hearing aids. Traditional single-channel noise reduction schemes, such as Wiener filtering, do not work satisfactorily in the presence of non-stationary background noise. Alternatively, supervised approaches, where the noise type is known in advance, lead to higher-quality enhanced speech signals. This dissertation proposes supervised and unsupervised single-channel noise reduction algorithms. We consider two classes of methods for this purpose: approaches based on nonnegative matrix factorization (NMF) and methods based on hidden Markov models (HMMs). The contributions of this dissertation can be divided into three main (overlapping) parts.
First, we propose NMF-based enhancement approaches that use temporal dependencies of the speech signals. In standard NMF, the important temporal correlations between consecutive short-time frames are ignored. We propose both continuous and discrete state-space nonnegative dynamical models, which are used to describe the dynamics of the NMF coefficients or activations. We derive optimal minimum mean squared error (MMSE) or linear MMSE estimates of the speech signal using the probabilistic formulations of NMF. Our experiments show that using temporal dynamics in the NMF-based denoising systems improves performance greatly. Additionally, this dissertation proposes an approach to learn the noise basis matrix online from the noisy observations. This relaxes the assumption of an a priori specified noise type and enables us to use the NMF-based denoising method in an unsupervised manner. Our experiments show that the proposed approach with online noise basis learning considerably outperforms state-of-the-art methods in different noise conditions.
Second, this thesis proposes two methods for NMF-based separation of sources with similar dictionaries. We suggest a nonnegative HMM (NHMM) for babble noise that is derived from a speech HMM. In this approach, speech and babble signals share the same basis vectors, whereas the activations of the basis vectors differ between the two signals over time. We derive an MMSE estimator for the clean speech signal using the proposed NHMM. The objective evaluations and a subjective listening test show that the proposed babble model and the final noise reduction algorithm noticeably outperform conventional methods. Moreover, the dissertation proposes another solution to separate a desired source from a mixture with arbitrarily low artifacts.
Third, an HMM-based algorithm to enhance the speech spectra using super-Gaussian priors is proposed. Our experiments show that speech discrete Fourier transform (DFT) coefficients have super-Gaussian rather than Gaussian distributions, even if we limit the speech data to come from a specific phoneme. We derive a new MMSE estimator for the speech spectra that uses super-Gaussian priors. The results of our evaluations using the developed noise reduction algorithm support the super-Gaussianity hypothesis.
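As a simplified illustration of the NMF-based denoising idea, without the temporal dynamics or probabilistic formulations developed in the thesis, the Python sketch below learns speech and noise basis matrices offline and builds a Wiener-style mask on the magnitude spectrogram; the ranks, iteration counts and placeholder spectrograms are illustrative assumptions.

```python
import numpy as np

def nmf_update(V, W, H, update_W=True, n_iter=200, eps=1e-10):
    """Multiplicative updates for V ~= W @ H (Euclidean cost). If update_W is
    False, the basis W is kept fixed and only the activations H are learned."""
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def random_init(n_rows, n_cols, seed=0):
    return np.random.default_rng(seed).random((n_rows, n_cols)) + 1e-3

# --- Training (placeholder arrays stand in for real |STFT| spectrograms) ---
S_train = np.abs(np.random.randn(257, 400))          # clean-speech magnitudes
N_train = np.abs(np.random.randn(257, 400))          # noise magnitudes
W_s, _ = nmf_update(S_train, random_init(257, 40), random_init(40, 400))
W_n, _ = nmf_update(N_train, random_init(257, 20), random_init(20, 400))

# --- Enhancement: fixed concatenated basis, learn activations for the mixture ---
Y = np.abs(np.random.randn(257, 100))                # noisy magnitudes
W = np.hstack([W_s, W_n])
_, H = nmf_update(Y, W, random_init(W.shape[1], Y.shape[1]), update_W=False)
speech_hat = W_s @ H[:40]                            # speech part of the model
mask = speech_hat / (W @ H + 1e-10)                  # Wiener-style gain
S_enhanced = mask * Y                                # enhanced magnitude spectrogram
```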
|
278 |
A new approach in survival analysis with longitudinal covariates
Pavlov, Andrey 27 April 2010 (links)
In this study we look at the problem of analysing survival data in the presence of
longitudinally collected covariates. New methodology for analysing such data has
been developed through the use of hidden Markov modeling. Special attention has
been given to the case of large information volume, where a preliminary data reduction
is necessary. Novel graphical diagnostics have been proposed to assess goodness of fit
and significance of covariates.
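A generic building block behind such model-based diagnostics is the HMM likelihood itself; a minimal scaled forward-algorithm sketch is given below (discrete observations assumed; this is standard HMM machinery, not the specific models or diagnostics developed in the thesis).

```python
import numpy as np

def hmm_log_likelihood(A, B, pi, obs):
    """Log-likelihood of an observation sequence under a discrete HMM via the
    scaled forward algorithm. A: transition matrix, B[state, symbol]: emission
    probabilities, pi: initial distribution, obs: observed symbol indices."""
    alpha = pi * B[:, obs[0]]
    scale = alpha.sum()
    log_lik = np.log(scale)
    alpha /= scale
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # predict, then weight by emission
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha /= scale                    # rescale to avoid underflow
    return log_lik
```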
The methodology developed has been applied to the data collected on behaviors
of Mexican fruit flies, which were monitored throughout their lives. It has been found
that certain patterns in eating behavior may serve as an aging marker. In particular it
has been established that the frequency of eating is positively correlated with survival
times. / Thesis (Ph.D, Mathematics & Statistics) -- Queen's University, 2010-04-26
|
279 |
Optimal Control and Estimation of Stochastic Systems with Costly Partial Information
Kim, Michael J. 31 August 2012 (links)
Stochastic control problems that arise in sequential decision making applications typically assume that information used for decision-making is obtained according to a predetermined sampling schedule. In many real applications however, there is a high sampling cost associated with collecting such data. It is therefore of equal importance to determine when information should be collected as it is to decide how this information should be utilized for optimal decision-making. This type of joint optimization has been a long-standing problem in the operations research literature, and very few results regarding the structure of the optimal sampling and control policy have been published. In this thesis, the joint optimization of sampling and control is studied in the context of maintenance optimization. New theoretical results characterizing the structure of the optimal policy are established, which have practical interpretation and give new insight into the value of condition-based maintenance programs in life-cycle asset management. Applications in other areas such as healthcare decision-making and statistical process control are discussed. Statistical parameter estimation results are also developed with illustrative real-world numerical examples.
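One ingredient of such joint sampling-and-control problems is the Bayesian belief update performed when the decision maker pays to observe the system; the sketch below shows this update for a simple deterioration chain (the states, matrices and observation model are hypothetical, and the thesis's structural results are not reproduced here).

```python
import numpy as np

def belief_update(belief, T, O, obs):
    """Bayesian belief update for a partially observed Markov chain: predict
    with transition matrix T, then condition on observation obs using the
    observation likelihoods O[state, obs]."""
    predicted = belief @ T
    posterior = predicted * O[:, obs]
    return posterior / posterior.sum()

# Toy deterioration model (hypothetical numbers): states = {good, degraded, failed}.
T = np.array([[0.90, 0.09, 0.01],
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]])
O = np.array([[0.8, 0.2],      # P(signal | state) when a costly sample is taken
              [0.4, 0.6],
              [0.1, 0.9]])
belief = np.array([1.0, 0.0, 0.0])
belief = belief_update(belief, T, O, obs=1)   # after sampling and seeing an "alarm"
```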
|