Global ETD Search

1	The Continuous Speech Recognition System Base on Hidden Markov Models with One-Stage Dynamic Programming Algorithm. Hsieh, Fang-Yi 03 July 2003 In this work, a continuous-speech, speaker-independent Mandarin digit recognition system was designed, based on Hidden Markov Models (HMM) with the One-Stage Dynamic Programming Algorithm. To fit this architecture to the performance of the hardware, various speech feature parameters were defined to optimize the process. In addition, the "State Duration" and the "Tone Transition Property Parameter" were extracted from the temporal information in speech to improve the recognition rate. Experiments on the test database show that this new idea, the one-stage dynamic programming algorithm with "state duration" and "tone transition property parameter", improves the recognition rate by 18% compared with the conventional algorithm. For speaker-independent connected-word recognition, the system achieves a recognition rate of 74%. For speaker-independent isolated-word recognition, the recognition rate is higher than 96%. For speaker-dependent connected-word recognition, the recognition rate is 92%. Hidden Markov Models; Continuous Speech Recognition; One-Stage Dynamic Programming Algorithm
2	Turkish Large Vocabulary Continuous Speech Recognition By Using Limited Audio Corpus Susman, Derya 01 March 2012 Speech recognition in Turkish is a challenging problem from several perspectives. Most of the challenges are related to the morphological structure of the language. Since Turkish is an agglutinative language, many words can be generated from a single stem by adding suffixes. This characteristic increases the number of out-of-vocabulary (OOV) words, which degrade the performance of a speech recognizer dramatically. Turkish also allows free word order, which makes it difficult to build robust language models. In this thesis, the existing models and approaches that address the problem of Turkish LVCSR (Large Vocabulary Continuous Speech Recognition) are explored. Different recognition units (words, morphs, stems and endings) are used in generating the n-gram language models, and 3-gram and 4-gram language models are built for each recognition unit. Since speech recognition relies on machine learning, the performance of the recognizer depends on whether enough audio data is available for acoustic model training. However, it is difficult to obtain rich audio corpora for Turkish. In this thesis, existing approaches are applied to Turkish LVCSR using a limited audio corpus. Several data selection approaches are also proposed in order to improve the robustness of the acoustic model. QA Computer Software 76.75-76.765
3	A Study On Language Modeling For Turkish Large Vocabulary Continuous Speech Recognition Bayer, Ali Orkan 01 June 2005 This study focuses on large vocabulary continuous speech recognition for Turkish. Continuous speech recognition for Turkish cannot be performed accurately because of the agglutinative nature of the language, which degrades the performance of the classical language models used in the field. In this thesis, acoustic models using different parameters are first constructed and tested. Then, three types of n-gram language models are built: class-based models, stem-based models, and stem-end-based models. Two-pass recognition is performed using the Hidden Markov Model Toolkit (HTK), testing the system first with the bigram models and then with the trigram models. The study finds that trigram models over stems and endings give better results, since their coverage of the vocabulary is better.
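The coverage argument in this abstract can be illustrated with a small sketch. It is not taken from the thesis: the segmentation table, the function names `to_units` and `oov_rate`, and the toy word lists are all assumptions made for illustration. It compares the out-of-vocabulary rate of a tiny test set when the recognition units are whole words and when they are stems plus endings.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical toy segmentation table: surface word -> (stem, ending).
# A real system would use a morphological analyzer; these entries are made up.
TOY_SEGMENTS: Dict[str, Tuple[str, str]] = {
    "evler": ("ev", "+ler"),
    "evlerde": ("ev", "+lerde"),
    "evde": ("ev", "+de"),
    "okullar": ("okul", "+lar"),
    "okulda": ("okul", "+da"),
    "okullarda": ("okul", "+larda"),
}


def to_units(words: Iterable[str], use_stem_endings: bool) -> List[str]:
    """Map words to recognition units: whole words, or stem + ending pairs."""
    units: List[str] = []
    for word in words:
        if use_stem_endings and word in TOY_SEGMENTS:
            units.extend(TOY_SEGMENTS[word])
        else:
            units.append(word)
    return units


def oov_rate(train_units: Iterable[str], test_units: Iterable[str]) -> float:
    """Fraction of test units that never occur in the training units."""
    vocab = set(train_units)
    test = list(test_units)
    if not test:
        return 0.0
    return sum(1 for unit in test if unit not in vocab) / len(test)


train_words = ["evler", "evde", "okullarda", "okulda"]
test_words = ["evlerde", "okullar"]

word_oov = oov_rate(to_units(train_words, False), to_units(test_words, False))
stem_oov = oov_rate(to_units(train_words, True), to_units(test_words, True))
print(f"word units: {word_oov:.2f}, stem+ending units: {stem_oov:.2f}")
```

In this toy setup both test words are unseen as whole words, while their stems already appear in the training data, so the stem-and-ending vocabulary covers more of the test units.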
4	Automatic Speech Recognition System Continually Improving Based on Subtitled Speech Data Kocour, Martin January 2019 Nowadays, large-vocabulary speech recognition systems achieve fairly high accuracy. Behind their results, however, there are often tens or even hundreds of hours of manually annotated training data. Such data are often not readily available, or do not exist at all for the required language. A possible solution is to use commonly available but lower-quality audiovisual data. This thesis deals with a technique for processing exactly such data and with their use for training acoustic models. It also discusses the possible use of these data for continual improvement of the models, since these data are practically inexhaustible. For these purposes, a new approach to data selection was designed as part of the thesis.
5	Хијерархијско кластеровање модела Гаусових смеша у апликацијама за континуално препознавање говора / Hijerarhijsko klasterovanje modela Gausovih smeša u aplikacijama za kontinualno prepoznavanje govora / Hierarchical Clustering of Gaussian Mixture Models in Applications for Continuous Speech Recognition Popović Branislav 17 July 2012 The dissertation presents a novel split-and-merge algorithm for hierarchical clustering of Gaussian mixture models. The algorithm aims to improve on the locally optimal solution determined by the initial constellation. It is initialized with locally optimal parameters obtained using a baseline approach similar to k-means, and it moves closer to the global optimum of the target clustering function by iteratively splitting and merging the clusters of Gaussian components produced by the baseline algorithm. The algorithm is further improved by introducing model selection in order to obtain the best possible trade-off between recognition accuracy and computational load in a Gaussian selection task applied within an actual recognition system. The proposed method is tested both on artificial data and in the framework of Gaussian selection performed within a real continuous speech recognition system. In both cases an improvement over the baseline method was observed. Additional improvements of the Gaussian selection algorithm through the choice of an optimal set of system parameters are also discussed.
6	Mining of Textual Data from the Web for Speech Recognition Kubalík, Jakub January 2010 The initial goal of this project was to study language modeling for speech recognition and techniques for obtaining textual data from the Web. The text introduces the basic techniques of speech recognition and describes in more detail language models based on statistical methods. In particular, the work deals with criteria for evaluating the quality of language models and speech recognition systems. The text then describes data mining models and techniques, especially information retrieval. Next, the problems associated with obtaining data from the Web are presented and, in contrast, the Google search engine is introduced. Part of the project was the design and implementation of a system for obtaining text from the Web, which is described in detail. The main goal of the work, however, was to verify whether data obtained from the Web can bring any benefit to speech recognition. The described techniques therefore try to find the optimal way to use data obtained from the Web to improve sample language models as well as models deployed in real recognition systems.
