Global ETD Search

1	Accounting for Individual Speaker Properties in Automatic Speech Recognition Elenius, Daniel January 2010
In this work, speaker characteristic modeling has been applied in the fields of automatic speech recognition (ASR) and automatic speaker verification (ASV). In ASR, a key problem is that acoustic mismatch between training and test conditions degrades classification performance. In this work, a child exemplifies a speaker not represented in the training data, and methods to reduce the spectral mismatch are devised and evaluated. To reduce the acoustic mismatch, predictive modeling based on spectral speech transformation is applied. Following this approach, a model suitable for a target speaker who is not well represented in the training data is estimated and synthesized by applying vocal tract predictive modeling (VTPM). In this thesis, the traditional static modeling on the utterance level is extended to dynamic modeling. This is accomplished by also operating on sub-utterance units, such as phonemes, phone realizations, sub-phone realizations and sound frames.
Initial experiments show that adapting an acoustic model trained on adult speech significantly reduced the word error rate of ASR for children, but not to the level of a model trained on children’s speech. Multi-speaker-group training provided an acoustic model that performed recognition for both adults and children within the same model at almost the same accuracy as speaker-group dedicated models, with no added model complexity. In the analysis of the cause of errors, the body height of the child was shown to be correlated with word error rate.
A further result is that the computationally demanding iterative recognition process in standard VTLN can be replaced by synthetically extending the vocal tract length distribution in the training data. A multi-warp model is trained on the extended data and recognition is performed in a single pass. The accuracy is similar to that of the standard technique.
A concluding experiment in ASR shows that the word error rate can be reduced by extending a static vocal tract length compensation parameter into a temporal parameter track. A key component in reaching this improvement was a novel joint two-level optimization process. In this process, the track was determined as a composition of a static and a dynamic component, which were simultaneously optimized on the utterance and sub-utterance level respectively. This had the principal advantage of limiting the modulation amplitude of the track to what is realistic for an individual speaker. The recognition error rate was reduced by a relative 10% compared with that of a standard utterance-specific estimation technique.
The techniques devised and evaluated can also be applied to other speaker characteristic properties that exhibit a dynamic nature.
An excursion into ASV led to the proposal of a statistical speaker population model. The model represents an alternative approach for determining the reject/accept threshold in an ASV system, instead of the commonly used direct estimation on a set of client and impostor utterances. This is especially valuable in applications where a low false reject or false accept rate is required. In these cases, the number of errors is often too small to estimate a reliable threshold using the direct method. The results are encouraging but need to be verified on a larger database.
Pf-Star / KOBRA MAP MLLR VTLN speaker characteristics dynamic modeling child Information and language technology Informations- och språkteknologi
2	Större chans att klara det? : En specialpedagogisk studie av 10 ungdomars syn på hur datorstöd har påverkat deras språk, lärande och skolsituation. [A greater chance of making it? A special-education study of 10 young people's views on how computer support has affected their language, learning and school situation.] Hansson, Britt January 2008
In this study, 10 young people were interviewed about their experiences of using a computer with speech synthesis and recorded books. They were asked in which situations the tools had been useful, or had been experienced as inhibiting, in their learning and school situation. Because of major difficulties at school, the young people were lent a laptop by the school, which they used both at home and at school. Together with parents and teachers, they received guidance at the municipality's Skoldatatek. The starting point of the study, from a sociocultural perspective, is that language develops when it is used. Schools are to offer an up-to-date education, and pupils with difficulties at school are entitled to support. How this support should be designed can create a dilemma for the individual school: support aimed directly at the individual can be perceived as treating school difficulties as a problem residing in the pupil, which must not occur in “a school for all”. Given this dilemma, it was important to investigate the young people's experiences of support, development and obstacles, in order to understand whether these cause stigmatization and exclusion. The results showed that the young people felt more motivated with their computer tools, which had compensated for their difficulties and suited their different learning styles. The young people said they had become more confident writers and readers thanks to increased language use. Their accounts also make clear the need for support from teachers and parents. The results indicate that alternative tools for learning could contribute to greater goal attainment in a school for all, with pedagogical diversity.
computer support special education Skoldatatek alternative tools computer use compensation Information and language technology Informations- och språkteknologi
3	Automatic speaker verification on site and by telephone: methods, applications and assessment Melin, Håkan January 2006
Speaker verification is the biometric task of authenticating a claimed identity by analyzing a spoken sample of the claimant's voice. The present thesis deals with various topics related to automatic speaker verification (ASV) in the context of its commercial applications, characterized by co-operative users, user-friendly interfaces, and requirements for small amounts of enrollment and test data. A text-dependent system based on hidden Markov models (HMM) was developed and used to conduct experiments, including a comparison between visual and aural strategies for prompting claimants for randomized digit strings. It was found that aural prompts lead to more errors in spoken responses and that visually prompted utterances performed marginally better in ASV, given that enrollment data were visually prompted. High-resolution flooring techniques were proposed for variance estimation in the HMMs, but results showed no improvement over the standard method of using target-independent variances copied from a background model. These experiments were performed on Gandalf, a Swedish speaker verification telephone corpus with 86 client speakers. A complete on-site application (PER), a physical access control system securing a gate in a reverberant stairway, was implemented based on a combination of the HMM system and a system based on Gaussian mixture models. Users were authenticated by saying their proper name and a visually prompted, random sequence of digits, after having enrolled by speaking ten utterances of the same type. An evaluation was conducted with the 54 out of 56 clients who succeeded in enrolling. Semi-dedicated impostor attempts were also collected. An equal error rate (EER) of 2.4% was found for this system, based on a single attempt per session and after retraining the system on PER-specific development data. On parallel telephone data collected using a telephone version of PER, an EER of 3.5% was found with landline telephones and around 5% with mobile telephones. Impostor attempts in this case were same-handset attempts. Results also indicate that the distributions of false reject and false accept rates over target speakers are well described by beta distributions. A state-of-the-art commercial system was also tested on PER data, with performance similar to that of the baseline research system.
QC 20100910 speaker recognition speaker verification speech technology biometrics access control speech corpus variance estimation Information and language technology Informations- och språkteknologi
4	Fria och öppna programvaror inom kommunal verksamhet : Vägen mot öppna standarder? / Free- and open source software in municipalities : The way towards open standards? Hanson, Malin, Larsson, Mikael January 2009
This report examines attitudes within municipalities towards open source software and open standards, and whether open source software may be a way to achieve open standards. The aim has been to find out whether open source software and open standards could solve the lock-in problems that municipalities have with proprietary software. The study was conducted as an exploratory, inductive and qualitative study, with in-depth interviews of subjectively selected informants as the data collection method. A literature review of relevant books and articles was also carried out. Economic factors affecting municipalities' use of open source software have not been considered in this study. The informants are all IT managers in Swedish municipalities, and the key informants were selected subjectively on the basis of their expertise in the subject. The conclusions drawn were that municipalities find it difficult to define standards and open standards, and that they do not automatically see the connection between open standards and open source software. They also see different areas as being of interest for standardization.
Standard open standard open Software open Source lock-in problems Information and language technology Informations- och språkteknologi
