211 |
Efficient methods for rapid UBM training (RUT) for robust speaker verification / Chandrasekaran, Aravind, January 2008
Thesis (M.S.)--University of Texas at Dallas, 2008. / Includes vita. Includes bibliographical references (leaves 36-37)
|
212 |
Automatic speech recognition for electronic warfare verbal reports / Moore, D. W., January 1994
Report (M.S.)--Virginia Polytechnic Institute and State University, 1994. / Abstract. Includes bibliographical references (leaves 71-73). Also available via the Internet.
|
213 |
Continuous speech recognition in a noisy environment using multiple microphones / Νόκας, Γεώργιος Ν., 19 July 2010
- / -
|
214 |
Structured deep neural networks for speech recognition / Wu, Chunyang, January 2018
Deep neural networks (DNNs) and deep learning approaches yield state-of-the-art performance in a range of machine learning tasks, including automatic speech recognition. The multi-layer transformations and activation functions in DNNs, or related network variations, allow complex and difficult data to be well modelled. However, the highly distributed representations associated with these models make it hard to interpret the parameters. The whole neural network is commonly treated as a "black box". The behaviours of activation functions and the meanings of network parameters are rarely controlled in standard DNN training. Though sensible performance can be achieved, the lack of interpretation of network structures and parameters makes better regularisation and adaptation of DNN models challenging. In regularisation, parameters have to be regularised universally and indiscriminately. For instance, the widely used L2 regularisation encourages all parameters towards zero. In adaptation, a large number of independent parameters must be re-estimated. Adaptation schemes in this framework cannot be performed effectively when adaptation data are limited. This thesis investigates structured deep neural networks. Special structures are explicitly designed and given desired interpretations to improve DNN regularisation and adaptation. For regularisation, parameters can be separately regularised based on their functions. For adaptation, parameters can be adapted in groups or partially adapted according to their roles in the network topology. Three forms of structured DNNs are proposed in this thesis. The contributions of these models are presented as follows. The first contribution of this thesis is the multi-basis adaptive neural network. This form of structured DNN introduces a set of parallel sub-networks with restricted connections. The design of restricted connectivity allows different aspects of data to be explicitly learned. Sub-network outputs are then combined, and this combination module is used as the speaker-dependent structure that can be robustly estimated for adaptation. The second contribution of this thesis is the stimulated deep neural network. This form of structured DNN relates and smooths activation functions in regions of the network. It aids the visualisation and interpretation of DNN models and also has the potential to reduce over-fitting. Novel adaptation schemes can be performed on it, taking advantage of the smoothness property that the stimulated DNN offers. The third contribution of this thesis is the deep activation mixture model. This form of structured DNN also encourages the outputs of activation functions to form a smooth surface. The output of one hidden layer is explicitly modelled as the sum of a mixture model and a residual model. The mixture model forms an activation contour, and the residual model depicts fluctuations around this contour. The smoothness yielded by the mixture model helps to regularise the overall model and allows novel adaptation schemes.
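The contrast the abstract draws between uniform L2 regularisation and regularising parameters separately by function can be illustrated with a minimal sketch. This is not the thesis's method: the group names (`shared`, `speaker`), the penalty strengths, and all function names below are hypothetical, chosen only to show that a uniform penalty pulls every weight towards zero at the same rate, while a grouped penalty lets each group be treated differently.

```python
def l2_penalty(weights, strength):
    """Uniform L2 penalty: 0.5 * strength * sum of squared weights."""
    return 0.5 * strength * sum(w * w for w in weights)


def grouped_l2_penalty(groups, strengths):
    """Group-wise L2: each named parameter group gets its own strength.

    `groups` maps a (hypothetical) group name to its list of weights;
    `strengths` maps the same names to penalty strengths.
    """
    return sum(l2_penalty(ws, strengths[name]) for name, ws in groups.items())


def penalty_only_step(weights, strength, lr):
    """One gradient step on the L2 penalty alone.

    The gradient of 0.5 * strength * w**2 is strength * w, so every
    weight shrinks towards zero by the same relative amount.
    """
    return [w - lr * strength * w for w in weights]


# Illustrative values only (assumptions, not taken from the thesis).
groups = {"shared": [0.5, -1.0], "speaker": [2.0]}
all_weights = [w for ws in groups.values() for w in ws]

uniform = l2_penalty(all_weights, strength=0.1)
grouped = grouped_l2_penalty(groups, {"shared": 0.1, "speaker": 0.01})
shrunk = penalty_only_step([2.0, -1.0], strength=0.1, lr=0.5)
```

With the grouped penalty, the hypothetical `speaker` weights are penalised ten times less than the `shared` ones, which is the kind of function-dependent treatment that an interpretable network structure makes possible.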
|
215 |
An investigation into the practical implementation of speech recognition for data capturing / Van der Walt, Craig, January 1993
Thesis (Master Diploma (Technology))--Cape Technikon, Cape Town, 1993 / A study into the practical implementation of speech recognition for the purposes of data capturing within Telkom SA is described. As demand for data capturing is increasing, a more efficient method of capturing is sought. The technology relating to speech recognition is examined, and practical guidelines for selecting a speech recognition system are described. These guidelines are used to show how commercially available systems can be evaluated. Specific tests on a selected speech recognition system are described, relating to the accuracy and adaptability of the system. The results obtained illustrate why, at present, speech recognition systems are not advisable for the purpose of data capturing. The results also demonstrate how the selection of keywords can affect system performance. Areas of further research are highlighted relating to recognition performance and vocabulary selection.
|
216 |
Bayesian and predictive techniques for speaker adaptation / Ahadi-Sarkani, Seyed Mohammad, January 1996
No description available.
|
217 |
Hardware implementation of an automatic speaker recognition system using artificial neural networks / Moonasar, Viresh, January 2002
Submitted in fulfillment of the academic requirements for the degree of Master of Technology in Electrical Engineering in the Department of Electronic Engineering, Faculty of Engineering, ML Sultan Technikon of Durban in South Africa, March 2002. / The use of speaker recognition technology in interactive voice response and electronic commerce systems has been limited. This is due to the lack of research attention and published results compared with other areas of speech recognition technology.
|
218 |
Development of tests and preprocessing algorithms for evaluation and improvement of speech recognition units / Wasmeier, Hans, January 1986
This study considered the evaluation of commercially available isolated-word, speaker-dependent speech recognition units, and preprocessing techniques that may be used for improving their performance. The problem was considered in three separate stages.
A series of tests was designed to exercise an isolated-word, speaker-dependent speech recognition unit. These tests provided a sound basis for determining a given unit's strengths and weaknesses. This knowledge permits a more informed decision on the best recognition device for a given price range. This knowledge may also be used in the design of a robust vocabulary and the creation of guidelines for best performance. The test vocabularies were based on the forty English phonemes identified by Rabiner and Schafer [28], and the test variations were representative of common variations that may be expected in normal use.
A digital archive system was implemented for storing the voice input of test subjects. This facility provided a database for an investigation of preprocessing techniques. It also permits the testing of different speech recognition units with the same voice input, providing a platform for device comparison.
Several speech preprocessing and performance improvement techniques were then investigated. Specifically, two types of time normalization, the enhancement of low-energy phonemes, and a change in training technique were investigated. These techniques permit a more accurate analysis of the failure mechanism of the speech recognition unit. They may also provide the basis for a speech preprocessor design which could be placed in front of a commercial speech recognition unit.
A commercially available speech recognition unit, the NEC SR100, was used as a measure of the effectiveness of the tests and of the improvements. Results of the study indicated that the designed tests and the preprocessing and performance improvement techniques investigated were useful in identifying the speech recognition unit's weaknesses. Also, depending on the economics of implementation, it was found that preprocessing may provide a cost-effective solution to some of the recognition unit's shortcomings. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
|
219 |
Speaker-independent access to a large lexicon / Mathan, Luc Stefan, January 1987
No description available.
|
220 |
Integration of multiple feature sets for reducing ambiguity in automatic speech recognition / MomayyezSiahkal, Parya, January 2008
No description available.
|