  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Multilingual Articulatory Features for Speech Recognition

Ore, Brian M. 18 April 2007 (has links)
No description available.
2

Cross-lingual automatic speech recognition using tandem features

Lal, Partha January 2011 (has links)
Automatic speech recognition requires many hours of transcribed speech recordings in order for an acoustic model to be effectively trained. However, recording speech corpora is time-consuming and expensive, so such quantities of data exist only for a handful of languages — there are many languages for which little or no data exist. Given that there are acoustic similarities between different languages, it may be fruitful to use data from a well-supported source language for the task of training a recogniser in a target language with little training data. Since most languages do not share a common phonetic inventory, we propose an indirect way of transferring information from a source language model to a target language model. Tandem features, in which class-posteriors from a separate classifier are decorrelated and appended to conventional acoustic features, are used to do that. They have the advantage that the language used to train the classifier, typically a Multilayer Perceptron (MLP), need not be the same as the target language being recognised. Consistent with prior work, positive results are achieved for monolingual systems in a number of different languages. Furthermore, improvements are also shown for the cross-lingual case, in which the tandem features were generated using a classifier not trained for the target language. We examine factors which may predict the relative improvements brought about by tandem features for a given source and target pair. We also examine some cross-corpus normalization issues that naturally arise in multilingual speech recognition and validate our solution in terms of recognition accuracy and a mutual information measure. The tandem classifier in the work up to this point in the thesis has been a phoneme classifier. Articulatory features (AFs), represented here as a multi-stream, discrete, multivalued labelling of speech, can be used as an alternative task. The motivation for this is that, since AFs are a set of physically grounded categories that are not language-specific, they may be more suitable for cross-lingual transfer. Then, using either phoneme or AF classification as our MLP task, we look at training the MLP using data from more than one language — again, we hypothesise that AF tandem features will result in greater improvements in accuracy. We also examine performance where only limited amounts of target language data are available, and see how our various tandem systems perform under those conditions.
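The tandem-feature construction described in this abstract (per-frame class posteriors from an MLP, log-compressed, decorrelated, and appended to conventional acoustic features) can be sketched in a few lines. The Python snippet below is a minimal illustrative sketch only, not the thesis's implementation: the function name, the PCA-based decorrelation, and the dimensionalities (39-dimensional acoustic features, 45 classes, 25 retained components) are assumptions made for the example.

```python
import numpy as np

def tandem_features(acoustic_feats, mlp_posteriors, n_components=25):
    """Append decorrelated MLP log-posteriors to conventional acoustic features.

    acoustic_feats : (T, D) array of per-frame acoustic features, e.g. MFCCs
    mlp_posteriors : (T, K) array of per-frame class posteriors from an MLP
    """
    # Log-compress the posteriors before decorrelation (a common choice)
    logp = np.log(mlp_posteriors + 1e-10)

    # Decorrelate with PCA: centre the log-posteriors, then project onto
    # the leading eigenvectors of their covariance matrix
    centred = logp - logp.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    decorrelated = centred @ eigvecs[:, order]

    # Tandem features: conventional features with projected posteriors appended
    return np.concatenate([acoustic_feats, decorrelated], axis=1)

# Toy usage: 100 frames of 39-dim acoustic features and 45-class posteriors
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((100, 39))
posteriors = rng.dirichlet(np.ones(45), size=100)
print(tandem_features(mfcc, posteriors).shape)  # (100, 64)
```

Because the posteriors come from a separately trained classifier, the acoustic model consuming these concatenated features never needs the classifier's training language to match the target language, which is the property the cross-lingual experiments rely on.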
3

Articulation modelling of vowels in dysarthric and non-dysarthric speech

Albalkhi, Rahaf 25 May 2020 (has links)
People with motor function disorders that cause dysarthric speech have difficulty using state-of-the-art automatic speech recognition (ASR) systems. These systems are developed based on non-dysarthric speech models, which explains their poor performance when used by individuals with dysarthria. Thus, a solution is needed to compensate for the poor performance of these systems. This thesis examines the possibility of quantizing vowels of dysarthric and non-dysarthric speech into codewords, regardless of inter-speaker variability, in a way that can be implemented on machines with limited processing capability. I show that it is possible to model all possible vowels and vowel-like sounds that a North American speaker can produce if the frequencies of the first and second formants are used to encode these sounds. The proposed solution is aligned with the use of neural networks and hidden Markov models to build an acoustic model in conventional ASR systems. A secondary finding of this study is the feasibility of reducing the set of the ten most common vowels in North American English to only eight vowels. / Graduate / 2021-05-11
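The formant-based coding idea described in this abstract can be illustrated with a small sketch: each vowel token is represented by its first two formant frequencies (F1, F2) and mapped to the nearest entry of a small codebook. The Python snippet below is a hypothetical illustration; the eight codeword labels and their (F1, F2) centres are rough textbook values, not the codebook developed in the thesis.

```python
import numpy as np

# Illustrative (F1, F2) centres in Hz for a small eight-entry vowel codebook.
# These are approximate textbook values, not the thesis's actual codewords.
CODEBOOK = {
    "iy": (270, 2290),   # as in "beet"
    "ih": (390, 1990),   # as in "bit"
    "eh": (530, 1840),   # as in "bet"
    "ae": (660, 1720),   # as in "bat"
    "aa": (730, 1090),   # as in "father"
    "ao": (570, 840),    # as in "bought"
    "uh": (440, 1020),   # as in "book"
    "uw": (300, 870),    # as in "boot"
}

def quantize_vowel(f1, f2):
    """Map a measured (F1, F2) pair to the nearest codeword in the codebook."""
    labels = list(CODEBOOK)
    centres = np.array([CODEBOOK[label] for label in labels], dtype=float)
    dists = np.linalg.norm(centres - np.array([f1, f2], dtype=float), axis=1)
    return labels[int(np.argmin(dists))]

# Toy usage: a vowel measured at F1 ~ 500 Hz, F2 ~ 1800 Hz quantizes to "eh"
print(quantize_vowel(500, 1800))
```

A lookup of this kind needs only a handful of distance computations per frame, which is consistent with the abstract's aim of running on machines with limited processing capability.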
4

Parole, langues et disfluences : une étude linguistique et phonétique du bégaiement / Speech, languages and disfluencies : a linguistic and phonetic study of stuttering

Didirkova, Ivana 24 November 2016 (has links)
Stuttering is a speech fluency disorder characterized, among other things, by an increased presence of disfluencies that impair the intelligibility of the utterance. The aim of this thesis is to study stuttering-like disfluencies (SLDs) produced by persons who stutter (PWS), both in a reading task and in spontaneous speech. More specifically, our first objective is to verify whether morphological and phonetic elements can explain the occurrence of these disfluencies; our second objective is to observe articulatory events before and during SLDs. For the studies dealing with the linguistic and phonetic elements most problematic for PWS, 10 French-speaking and 10 Slovak-speaking PWS were recorded while reading a text and while speaking spontaneously in their mother tongue. The studies on speech motor events taking place before and during SLDs were carried out using electromagnetic articulography (EMA) data acquired from 4 French-speaking subjects (2 PWS and 2 normally fluent control subjects) in a reading task. Our results show that voiceless consonants and stops were among the most problematic elements for PWS to produce. The morphological study reveals that the more morphemes a word contains, the higher the risk of an SLD appearing; this result must be correlated with the number of syllables in the word. As for the second pair of studies, which address the motor level of stuttered speech, our data show that similar articulatory events take place in the supraglottic cavity during disfluencies perceived acoustically as blocks and as prolongations. Finally, a disruption of coarticulatory gestures was observed during the production of certain disfluencies.
5

Discriminative Articulatory Feature-based Pronunciation Models with Application to Spoken Term Detection

Prabhavalkar, Rohit Prakash 27 September 2013 (has links)
No description available.
