Global ETD Search

101	Noise robustness in automatic speech recognition / Chen, Chia-Ping. January 2004. Thesis (Ph. D.)--University of Washington, 2004. / Vita. Includes bibliographical references (p. 112-121).
102	Kernel eigenspace-based MLLR adaptation / Hsiao, Roger Wend Huu. January 2004. Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2004. / Includes bibliographical references (leaves 77-81). Also available in electronic version. Access restricted to campus users.
103	Robust speech recognition against unknown short-time noise / Chan, Arthur Yu Chung. January 2002. Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2002. / Includes bibliographical references (leaves 119-125). Also available in electronic version. Access restricted to campus users.
104	A study of some variations on the hidden Markov modelling approach to speaker independent isolated word speech recognition / Leung, Shun Tak Albert. January 1990. Thesis (M. Phil.)--University of Hong Kong, 1990.
105	Combining acoustic features and articulatory features for speech recognition / Leung, Ka Yee. January 2002. Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2002. / Includes bibliographical references (leaves 92-96). Also available in electronic version. Access restricted to campus users.
106	Word hypothesis from undifferentiated, errorful phonetic strings / Sellman, R. Thomas. January 1993. Thesis (M.S.)--Rochester Institute of Technology, 1993. / Typescript. Includes bibliographical references (leaves 81-83).
107	Language modeling for automatic speech recognition in telehealth / Zhang, Xiaojia. January 2005. Thesis (M.S.)--University of Missouri-Columbia, 2005. / The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on January 11, 2007). Vita. Includes bibliographical references.
108	Exploiting high-level knowledge resources for speech recognition with applications to interactive voice response systems / Balakrishna, Mithun. January 2007. Thesis (Ph.D.)--University of Texas at Dallas, 2007. / Includes vita. Includes bibliographical references (leaves 155-162).
109	A speaker recognition solution for identification and authentication / Adamski, Michal Jerzy. 26 June 2014. M.Com. (Informatics) / A certain degree of vulnerability exists in traditional knowledge-based identification and authentication access control, as a result of password interception and social engineering techniques. This vulnerability has warranted the exploration of additional identification and authentication approaches such as physical token-based systems and biometrics. Speaker recognition is one such biometric approach that is currently not widely used due to its inherent technological challenges, as well as a scarcity of comprehensive literature and complete open-source projects. This makes it challenging for anyone who wishes to study, develop and improve upon speaker recognition for identification and authentication. In this dissertation, we condense some of the available speaker recognition literature in a manner that provides a comprehensive overall picture of speaker identification and authentication to a wider range of interested audiences. A speaker recognition solution in the form of an open, user-friendly software prototype environment is presented, called SRIA (Speaker Recognition Identification Authentication). In SRIA, real users may enrol and perform speaker identification and authentication tasks. SRIA is intended as a platform for understanding speaker recognition and for further research and development. Automatic speech recognition ; Biometric identification
110	Statistical models for noise-robust speech recognition / van Dalen, Rogier Christiaan. January 2011. A standard way of improving the robustness of speech recognition systems to noise is model compensation. This replaces a speech recogniser's distributions over clean speech by ones over noise-corrupted speech. For each clean speech component, model compensation techniques usually approximate the corrupted speech distribution with a diagonal-covariance Gaussian distribution. This thesis looks into improving on this approximation in two ways: firstly, by estimating full-covariance Gaussian distributions; secondly, by approximating corrupted-speech likelihoods without any parameterised distribution. The first part of this work is about compensating for within-component feature correlations under noise. For this, the covariance matrices of the computed Gaussians should be full instead of diagonal. The estimation of off-diagonal covariance elements turns out to be sensitive to approximations. A popular approximation is the one that state-of-the-art compensation schemes, like VTS compensation, use for dynamic coefficients: the continuous-time approximation. Standard speech recognisers contain both static coefficients, computed per time slice, and dynamic coefficients, which represent signal changes over time and are normally computed from a window of static coefficients. To remove the need for the continuous-time approximation, this thesis introduces a new technique. It first compensates a distribution over the window of statics, and then applies the same linear projection that extracts dynamic coefficients. It introduces a number of methods that address the correlation changes that occur in noise within this framework. The next problem is decoding speed with full covariances. This thesis re-analyses the previously introduced predictive linear transformations, and shows how they can model feature correlations at low and tunable computational cost.
The second part of this work removes the Gaussian assumption completely. It introduces a sampling method that, given speech and noise distributions and a mismatch function, in the limit calculates the corrupted speech likelihood exactly. For this, it transforms the integral in the likelihood expression, and then applies sequential importance resampling. Though it is too slow to use for recognition, it enables a more fine-grained assessment of compensation techniques, based on the KL divergence to the ideal compensation for one component. The KL divergence proves to predict the word error rate well. This technique also makes it possible to evaluate the impact of approximations that standard compensation schemes make. 620 Speech recognition ; Noise-robustness
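The projection step mentioned in the abstract of record 110 rests on a general property of Gaussians: a linear map P sends N(mu, Sigma) to N(P mu, P Sigma P^T). The sketch below illustrates only that property and is not the thesis's implementation. It assumes a single one-dimensional feature, a three-frame window, and the common regression-style delta weights; the function names delta_projection and project_gaussian are hypothetical.

```python
import numpy as np

def delta_projection(window: int = 1) -> np.ndarray:
    """Build a 2 x (2*window+1) matrix mapping a window of statics to [static, delta].

    Assumption: deltas use the common regression weights n / sum(n^2).
    """
    taps = np.arange(-window, window + 1, dtype=float)
    static_row = np.zeros(taps.size)
    static_row[window] = 1.0               # select the centre frame
    delta_row = taps / np.sum(taps ** 2)   # weighted slope across the window
    return np.vstack([static_row, delta_row])

def project_gaussian(mean: np.ndarray, cov: np.ndarray, proj: np.ndarray):
    """Push a Gaussian N(mean, cov) through the linear map proj."""
    return proj @ mean, proj @ cov @ proj.T

if __name__ == "__main__":
    # Hypothetical Gaussian over three consecutive static coefficients.
    mean = np.array([1.0, 2.0, 4.0])
    cov = np.array([[1.0, 0.6, 0.2],
                    [0.6, 1.0, 0.6],
                    [0.2, 0.6, 1.0]])
    new_mean, new_cov = project_gaussian(mean, cov, delta_projection(1))
    print(new_mean)  # mean of [static, delta]
    print(new_cov)   # full 2 x 2 covariance of [static, delta]
```

Because the covariance over the window is carried through the projection, any correlation between static and dynamic coefficients appears as off-diagonal entries of the result, which is why full rather than diagonal covariances arise naturally in this setting.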
