  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
181

Ukazatele identity mluvčího v oblasti temporálních modulací řečového signálu / Speaker identity indicators in the domain of the temporal modulation of the speech signal

Weingartová, Lenka January 2011
This diploma thesis aims to contribute to the field of speaker recognition in the domain of temporal changes in the speech signal. After a brief introduction to forensic phonetics, it outlines the approaches and factors that help or hinder successful recognition. The focus then shifts to the temporal structure of speech and the approaches to its analysis currently in use. The practical section of the thesis consists of an experiment designed to assess the contribution of certain temporal measures to speaker recognition. The variables used are %V (the proportion of vocalic intervals within a sentence), ΔV and ΔC (the standard deviation of the duration of vocalic/consonantal intervals within a sentence), VarcoV and VarcoC (the previous variables normalised for average interval duration) and the Pairwise Variability Indices, both vocalic and consonantal, raw and normalised. Besides these, another variable is used to capture the local articulation rate, and especially final deceleration in the utterances: LAR (the inverse of the distance between successive midpoints of the vocalic intervals). Whereas the first-mentioned variables are not very successful in distinguishing the speakers, LAR seems very well suited for capturing speaker idiosyncrasies, although...
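The temporal measures named in this abstract can be sketched in a few lines. The following is only an illustration under stated assumptions, not the thesis's own code: interval durations are taken to be plain lists of seconds, the standard deviation is the population form, the Varco and normalised PVI values are scaled by 100, and all function names are hypothetical.

```python
from statistics import mean, pstdev

def percent_v(vocalic, consonantal):
    """%V: share of total duration taken by vocalic intervals, in percent."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total

def delta(durations):
    """Delta-V or Delta-C: standard deviation of the interval durations.
    Assumption: population standard deviation."""
    return pstdev(durations)

def varco(durations):
    """VarcoV or VarcoC: standard deviation normalised by the mean duration (x100)."""
    return 100.0 * pstdev(durations) / mean(durations)

def raw_pvi(durations):
    """Raw PVI: mean absolute difference between successive interval durations."""
    return mean(abs(a - b) for a, b in zip(durations, durations[1:]))

def normalised_pvi(durations):
    """Normalised PVI: each successive difference is divided by the pair's mean (x100)."""
    return 100.0 * mean(
        abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])
    )

def local_articulation_rate(midpoints):
    """LAR: inverse of the distance between successive vocalic-interval midpoints."""
    return [1.0 / (b - a) for a, b in zip(midpoints, midpoints[1:])]
```

A falling sequence of LAR values towards the end of an utterance would correspond to the final deceleration the abstract mentions.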
182

Bénéfices et limites des représentations en facteur de variabilité totale pour la reconnaissance du locuteur / Benefits and limits of the total variability factor representation for speaker recognition

Bousquet, Pierre-Michel 23 May 2014
The field of automatic speaker recognition covers the techniques used to discriminate between speakers on the basis of their voice utterances; it belongs to the family of biometric identity authentication procedures. In recent years speaker recognition has taken a significant step forward with a new representation of the voice utterance, referred to as the i-vector. This representation is based on the Gaussian mixture model paradigm and has the distinguishing feature of being a low-dimensional vector compared with previous representations, yet highly discriminative with respect to the speaker. The work presented in this thesis belongs to this new context. Focused on this representation, it aims to understand and assess its assumptions, key points, behaviour and limits. We first carried out a statistical analysis of the representation, studying the effect and relative importance of the successive steps in building and using it. This analysis helped to better understand its characteristics, and also revealed defects of the representation that led us to develop new transformations in this space. The goal of these techniques is to move the data towards theoretical models with better discriminative power. We identify and demonstrate a number of properties induced by these transformations, which justify their use. In terms of performance, these techniques reduce by around 50% the error rates of systems based on i-vectors and Gaussian assumptions, making it possible to reach the best detection rates in the field within the Gaussian probabilistic framework. A general evaluation of the components of the method is then detailed. By comparison with alternative methods, it identifies the fundamental approaches that give the concept the status of a paradigm. We show the primacy of certain strategic steps in the processing chain, including the transformations we introduced, and their relative independence from the methods and assumptions adopted. Limits of the solution are uncovered and set out in a study of "anisotropy", which reveals some lack of compliance of i-vector distributions with Gaussian assumptions and puts into perspective the ability of the approach to produce an optimal global linear parameterisation of the variabilities. Alongside these investigations, we took part in the exploration of a new model, an alternative to the most usual representation of voice utterances. Designed by J.F. Bonastre, it produces vectors in the form of binary keys and provides the means to compare them, following a semi-parametric route. This exploration contributed to the improvement of that model and opened new directions; it was also helpful to our evaluation of the i-vector concept. Finally, some adaptations of i-vector solutions to special cases are described: we propose new variants to handle decisions on short-duration utterances (one of the current challenges of the field) and on utterances with an a priori mismatch (for example in recording medium, duration or language). Taken together, this work aims to better delineate the most promising avenues of research around this new representation of the human voice.
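The abstract does not name the transformations it introduces, so the sketch below should not be read as the author's method. It only illustrates, as an assumption, a widely used family of i-vector preprocessing steps (centering and length normalisation) followed by cosine scoring, with i-vectors represented as plain lists of floats and all function names hypothetical.

```python
import math

def center(vectors):
    """Subtract the mean vector so the set of i-vectors is centred on the origin."""
    dims = len(vectors[0])
    mean_vec = [sum(v[k] for v in vectors) / len(vectors) for k in range(dims)]
    return [[v[k] - mean_vec[k] for k in range(dims)] for v in vectors]

def length_normalise(v):
    """Scale a vector to unit Euclidean norm (projects it onto the unit sphere)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_score(a, b):
    """Cosine similarity between two i-vectors; higher suggests the same speaker."""
    a, b = length_normalise(a), length_normalise(b)
    return sum(x * y for x, y in zip(a, b))
```

In a verification setting, a score of this kind would be compared with a threshold tuned on development data to accept or reject the claimed identity.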
183

Pronunciation support for Arabic learners

Alsabaan, Majed Soliman K. January 2015
The aim of the thesis is to find out whether providing feedback to Arabic language learners will help them improve their pronunciation, particularly of words involving sounds that are not distinguished in their native languages. In addition, it aims to find out, if possible, what type of feedback will be most helpful. In order to achieve this aim, we developed a computational tool with a number of component sub tools. These tools involve the implementation of several substantial pieces of software. The first task was to ensure the system we were building could distinguish between the more challenging sounds when they were produced by a native speaker, since without that it will not be possible to classify learners’ attempts at these sounds. To this end, a number of experiments were carried out with the hidden Markov model toolkit (the HTK), a well known speech recognition toolkit, in order to ensure that it can distinguish between the confusable sounds, i.e. the ones that people have difficulty with. The developed computational tool analyses the differences between the user’s pronunciation and that of a native speaker by using grammar of minimal pairs, where each utterance is treated as coming from a family of similar words. This provides the ability to categorise learners’ errors - if someone is trying to say cat and the recogniser thinks they have said cad then it is likely that they are voicing the final consonant when it should be unvoiced. Extensive testing shows that the system can reliably distinguish such minimal pairs when they are produced by a native speaker, and that this approach does provide effective diagnostic information about errors. The tool provides feedback in three different sub-tools: as an animation of the vocal tract, as a synthesised version of the target utterance, and as a set of written instructions. The tool was evaluated by placing it in a classroom setting and asking 50 Arabic students to use the different versions of the tool. 
Each student had a thirty minute session with the tool, working their way through a set of pronunciation exercises at their own pace. The results of this group showed that their pronunciation does improve over the course of the session, though it was not possible to determine whether the improvement is sustained over an extended period. The evaluation was done from three points of view: quantitative analysis, qualitative analysis, and using a questionnaire. Firstly, the quantitative analysis gives raw numbers telling whether a learner had improved their pronunciation or not. Secondly, the qualitative analysis shows a behaviour pattern of what a learner did and how they used the tool. Thirdly, the questionnaire gives feedback from learners and their comments about the tool. We found that providing feedback does appear to help Arabic language learners, but we did not have enough data to see which form of feedback is most helpful. However, we provided an informative analysis of behaviour patterns to see how Arabic students used the tool and interacted with it, which could be useful for more data analysis.
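The minimal-pair idea described in this abstract (a learner aiming at cat is recognised as cad, so the final consonant is probably being voiced) can be sketched as follows. This is a toy illustration, not the thesis's HTK-based system: the feature table, the phoneme-list representation of words and the `diagnose` function are all assumptions made for the example.

```python
# Hypothetical articulatory feature table for a handful of consonants.
FEATURES = {
    "t": {"voiced": False, "place": "alveolar", "manner": "stop"},
    "d": {"voiced": True, "place": "alveolar", "manner": "stop"},
    "k": {"voiced": False, "place": "velar", "manner": "stop"},
    "g": {"voiced": True, "place": "velar", "manner": "stop"},
    "s": {"voiced": False, "place": "alveolar", "manner": "fricative"},
    "z": {"voiced": True, "place": "alveolar", "manner": "fricative"},
}

def diagnose(target, recognised):
    """Compare the target word with the recogniser's output, both as phoneme lists.

    Returns a description of the single differing phoneme, or None when the two
    words are identical, do not form a minimal pair, or use unknown phonemes.
    """
    if len(target) != len(recognised):
        return None
    diffs = [(i, t, r) for i, (t, r) in enumerate(zip(target, recognised)) if t != r]
    if len(diffs) != 1:
        return None
    position, expected, heard = diffs[0]
    if expected not in FEATURES or heard not in FEATURES:
        return None
    changed = sorted(
        name for name in FEATURES[expected]
        if FEATURES[expected][name] != FEATURES[heard][name]
    )
    return {
        "position": position,
        "expected": expected,
        "heard": heard,
        "changed_features": changed,
    }
```

Calling `diagnose(["k", "æ", "t"], ["k", "æ", "d"])` reports that only the voicing of the final consonant differs, which is the kind of diagnostic a feedback tool could turn into an instruction for the learner.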
184

The collaborative role of an ESL support teacher in a secondary school : supporting ESL students and content teachers utilizing integrated language and content instruction

Konnert, Michele Rand 05 1900
This research project was conducted with social studies and English teachers and ESL students in mainstream classes at a secondary school in Richmond, B.C. over a seven-month period from September 1998 to March 1999. As an action researcher, I solved problems through team work and through following a cyclical process of 1. strategic planning, 2. action, 3. observation, evaluation and self-evaluation, and 4. critical and self-critical reflection on the cycle (McNiff, Lomax, & Whitehead, 1996). The findings included in this study are a definition of the ESL support role, effectiveness of the ESL support program, teacher collaboration, application of the ILC approach and the Knowledge Framework (Mohan, 1986), challenges and issues for content teachers and ESL students, and the dual role as support teacher and researcher. First, with regard to a definition of the ESL support role, ESL support teachers were viewed by myself and the administration as language development specialists who act as consultants, with a focus on co-teaching and individual instruction. Colleagues perceived the ESL support team as ESL trained teachers who must prove their effectiveness through action, rather than words, in content teachers' classrooms. ESL students viewed the ESL support teachers as a welcome support or unwelcome intruders. Second, with regard to the effectiveness of the ESL support program, the administration and I felt that the program provided exceptional support services to content teachers and ESL students. ESL students also felt that the ESL support program was very helpful. Colleagues, however, were initially skeptical of the program, but eventually valued the support. Third, collaboration increased over time as ESL support specialists worked in cooperative relationships with content teachers. Fourth, the ILC approach was selectively, and at times superficially, implemented in content courses. 
Also, the Knowledge Framework was the most successful teaching method for ESL support of content teachers and ESL students. Fifth, there were many challenges for content teachers, ESL learners, and ESL support specialists. One challenge was the lack of English spoken by our student population. Another concern was the appearance of passivity of ESL students. Also, assessment and evaluation of ESL students was very difficult for content teachers. Thus, content instructors needed to learn alternate assessment and evaluation strategies for their ESL learners. In addition, teachers wondered about their ESL students' comprehension and exam preparation. Lastly, tensions inevitably arose from the dual role as teacher and researcher. / Education, Faculty of / Language and Literacy Education (LLED), Department of / Graduate
185

NOT YET-Constructions in the Swedish Skellefteå Dialect / Inte än-konstruktioner i Skelleftemålet

Zachrisson, Jill January 2020
Expressions such as not yet, already, still and no longer belong to a category called Phasal Polarity, and express phase, polarity and speaker expectations. In European languages, these often appear as phasal adverbs. However, in the Skellefteå dialect, spoken in Västerbotten in northern Sweden, another type of construction is also used to express not yet. The construction consists of the auxiliary hɶ ‘have’ together with the supine form of the lexical verb prefixed by the negative prefix o-, for example I hɶ oskrive breve ‘I haven’t written the letter yet’. I will refer to this construction as the o-construction. Constructions meaning not yet have lately been referred to as nondum (from Latin nondum 'not yet') (Veselinova & Devos, forthcoming) and appear to be widely used in grammaticalized forms in, for example, Austronesian and Bantu languages. The o-construction in the Skellefteå dialect is mentioned in existing descriptions but has not been documented in detail. The aim of this study is to collect data and analyze the use of this construction. Data were collected through interaction with speakers of the Skellefteå dialect, using questionnaires and direct elicitation. The results show that the o-construction occurs in the dialect to express NOT YET, but only in specific contexts, where certain conditions must be met. It tends to occur with telic predicates and an omniscient narrator, and a high probability that the event will materialize in the near future increases the chance that the o-construction is used. This stands in contrast with more grammaticalized nondums in Austronesian and Bantu languages, where these expressions have a more general meaning and wider applicability.
186

Nízkofrekvenční reprosoustava s ozvučnicí z alternativních materiálů / Audio speaker system with baffle made from alternative materials

Hüttl, Ondřej January 2008
This thesis first analyses and describes materials suitable for the construction of loudspeaker enclosures and then presents the design of a loudspeaker system with an enclosure made of synthetic stone. Electroacoustic drivers are described with a focus on their frequency range, operating principle and construction in order to explain how they function; drivers based on the electrodynamic principle are described in detail. Materials used for loudspeaker enclosure production, both traditional and alternative, are analysed and described. Specific modulus, density and stiffness were taken as objective evaluation criteria. The evaluation indicates that plastic, aluminium and stone-based materials have appropriate properties. The function and types of loudspeaker enclosures are described, with the function, properties and design of a vented-box loudspeaker system covered in detail. The last part of the thesis is the design of a two-way vented-box loudspeaker system with an enclosure made of synthetic stone. The behaviour of a driver built into the enclosure was simulated, and compensation circuits and crossovers were designed and simulated. The enclosure was designed with strong emphasis on minimizing its negative effect on the final sound quality.
187

Neuronové sítě při klasifikaci mluvčích / Neural networks in speaker classification

Svoboda, Libor January 2008
This work focuses on neural networks for speaker recognition. It deals with the problems of speech signal processing and introduces several types of neural network. As part of the work, a database of recordings from speakers of various sexes and ages was created, and training and test sets were drawn from it. Four classifiers were then proposed: one based on a Gaussian mixture model and three based on neural networks. The system was tested and analysed on classification by age, by gender and, finally, by both criteria combined. Attention is also paid to the choice of suitable features for each classification task. The work concludes with the results of the analysis for the individual groups and features, identifying the most suitable features and the most successful classifier for each classification task.
188

Ukázkový systém na rozpoznávání mluvčích / Demonstration System for Speaker Recognition

Šústek, Martin January 2008
My diploma thesis deals with the problem of speaker recognition. The text describes the basic theory of the problem as well as the model and implementation of a speaker recognition system. The system is designed to recognize up to three speakers. The theory covers voice processing and the calculation of parameters for speaker recognition. The program is written in Matlab as a standalone application and has a Czech and an English interface.
189

Segmentace mluvčích s využitím statistických metod klasifikace / Speaker Segmentation using statistical methods of classification

Adamský, Aleš January 2011
The thesis discusses in detail some concepts of speech and prosody that can contribute to building a speech corpus for speaker segmentation. The Elan multimedia annotator used for labelling is also described. The theoretical part highlights some frequently used speech features such as MFCC, PLP and LPC, and deals with the currently most popular speech segmentation methods. Some classification algorithms are also mentioned. The practical part describes the implementation of a Bayesian information criterion (BIC) algorithm in a system for automatic speaker segmentation. Different speech features were used to classify speaker change points in speech. The test results were evaluated graphically using the receiver operating characteristic (ROC) and its quantitative indices. MFCC and HFCC proved to be the best speech features for this system.
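A Bayesian information criterion test for a speaker change point can be sketched as below. This is a simplified illustration rather than the thesis's implementation: it assumes each segment is modelled by a single Gaussian with diagonal covariance (so the penalty counts 2d parameters), that frames are lists of feature values such as MFCCs, and that `penalty_weight` and `margin` are hypothetical tuning parameters.

```python
import math

def log_det_diag(frames):
    """Log-determinant of a diagonal covariance: sum of per-dimension log variances."""
    n = len(frames)
    total = 0.0
    for k in range(len(frames[0])):
        column = [f[k] for f in frames]
        mu = sum(column) / n
        var = sum((x - mu) ** 2 for x in column) / n
        total += math.log(var + 1e-8)  # small floor avoids log(0)
    return total

def delta_bic(frames, i, penalty_weight=1.0):
    """Delta-BIC for splitting `frames` at index i; positive values favour a change."""
    n = len(frames)
    dims = len(frames[0])
    left, right = frames[:i], frames[i:]
    gain = 0.5 * (
        n * log_det_diag(frames)
        - len(left) * log_det_diag(left)
        - len(right) * log_det_diag(right)
    )
    # Two models instead of one add `2 * dims` parameters (means and variances).
    penalty = 0.5 * penalty_weight * (2 * dims) * math.log(n)
    return gain - penalty

def best_change_point(frames, margin=5, penalty_weight=1.0):
    """Return the split index with the highest positive Delta-BIC, or None."""
    scores = {
        i: delta_bic(frames, i, penalty_weight)
        for i in range(margin, len(frames) - margin + 1)
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

In practice a test of this kind is usually run over a sliding or growing window of the recording, and the penalty weight is tuned on labelled data, for example with the ROC analysis the abstract mentions.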
190

Speaker Recognition Based on Long Temporal Context

Fér, Radek January 2014
This thesis deals with the extraction of suitable features for speaker recognition from longer time spans. After introducing current techniques for extracting such features, we propose and describe a new method that operates at the time scale of phonemes and uses the well-known i-vector technique. Considerable effort was devoted to finding a suitable representation of temporal features that could make speaker recognition systems more robust, in particular by modelling prosody. Our approach does not explicitly model any specific temporal parameters of speech; instead, it uses the co-occurrence of speech frames as a source of temporal features. We test and analyse this technique on the NIST SRE 2008 speech database. Unfortunately, the results show that the technique does not bring the expected improvement for speaker recognition. We discuss and analyse this finding at the end of the thesis.
