Global ETD Search

1	Programinė įranga kompiuterio valdymui balsu / Software for computer control by voice Ringelienė, Živilė 24 September 2008 The thesis presents a software prototype for controlling a Web browser by voice. The prototype consists of two parts: an isolated-word recognition system based on hidden Markov models (HMMs), and a program that implements browser control by voice commands and is integrated into the word recognition system. The prototype is a speaker-independent recognition system for Lithuanian words (voice commands) and can recognize 71 voice commands of one or two words each: 1 command to launch the browser, 54 commands for browser control, and 16 commands to open various user-predefined websites. Taking into account various factors that affect recognition (amount of training data, number of Gaussian mixture components, speaker gender, use of different hardware for recognition), different sets of acoustic models of Lithuanian voice commands were created and trained. An experimental investigation examined how using these sets in the Lithuanian word recognition system affects word recognition accuracy. The results showed that the prototype system achieves 98% word recognition accuracy. The prototype system can be used in secondary schools as a visual aid for teaching speech recognition to senior pupils in informatics, physics, psychology, and mathematics lessons. Informatics Engineering; Speech recognition; Hidden Markov models; Voice control; Speech technologies
2	Odhad přesnosti řečových technologií na základě měření signálové kvality a obsahové bohatosti audia / Estimation of accuracy of speech technologies based on signal quality and audio content richness Nezval, Jiří January 2020 This thesis presents a theoretical analysis of the origin of speech, introduces applications of speech technologies, and explains the contemporary approach to phonetic transcription of speech recordings. It also describes metrics for assessing the quality of audio recordings, which are split into two classes: signal quality metrics and content richness metrics. The first goal of the practical section is to create a statistical model that predicts the accuracy of machine transcription of speech recordings from measurements of their quality. The second goal is to evaluate which individual metrics are most essential for predicting the accuracy of machine transcription.
3	Vizualizace výstupu z řečových technologií pro potřeby kontaktních center / Vizualization of Outputs from Speech Technologies for Contact Centers Zhezhela, Oleksandr January 2014 The thesis focuses on the visualisation of data mined by speech processing technologies. Several methods of speech data extraction were studied, and technologies for this task were analysed. The variety of metadata that can be mined from speech was defined. Existing standards and processes of call centres were also examined. Requirements for the user interface were gathered and analysed. On that basis, and after consultation with call centre employees, a concept for speech data visualisation was defined and implemented. The resulting solutions were integrated into Speech Analytics Server (SPAS).

