Global ETD Search

31	Určení věku a pohlaví mluvčích / Establishing speaker's age and sex Rendek, Tomáš January 2010 (has links) This work deals with recognition of a speaker's age and gender. It begins by introducing practical uses of this application and discussing the available solutions. The theoretical part of the thesis specifies the feature extraction and reduction methods and the speech databases used in the experiments. The practical part describes the recognizer implemented in the Emotional tool and, in two chapters, the individual experiments. Regarding speaker gender estimation, we focused on the impact of the emotional state and the speaker's age on the classification process. The two remaining experiments were dedicated to general gender estimation using two different classifiers, GMM and k-NN. These two classifiers were used in age estimation as well. In this case, four age groups were formed and two different feature sets, segmental and suprasegmental, were exploited.
32	Vowel perception in severe noise Swanepoel, Rikus 05 March 2013 (has links) A model that can accurately predict speech recognition for cochlear implant (CI) listeners is essential for the optimal fitting of cochlear implants. By implementing a CI acoustic model that mimics CI speech processing, the challenge of predicting speech perception in cochlear implants can be simplified. As a first step in predicting the recognition of speech processed through an acoustic model, vowel perception in severe speech-shaped noise was investigated in the current study. The aim was to determine the acoustic cues that listeners use to recognize vowels in severe noise and make suggestions regarding a vowel perception predictor. It is known that formants play an important role in quiet, while in severe noise the role of formants is still unknown. The relative importance of F1 and F2 is also of interest, since the masking of noise is not always evenly distributed over the vowel spectrum. The problem was addressed by synthesizing vowels consisting of either detailed spectral shape or formant information. F1 and F2 were also suppressed to examine the effect in severe noise. The synthetic stimuli were presented to listeners in quiet and signal-to-noise ratios of 0 dB, -5 dB and -10 dB. Results showed that in severe noise, vowels synthesized according to the whole-spectrum were recognized significantly better than vowels containing only formants. Multidimensional scaling and FITA analysis indicated that formants were still perceived and extracted by the human auditory system in severe noise, especially when the vowel spectrum consisted of the whole spectral shape. Although F1 and F2 vary in importance in listening conditions of quiet and less noisy conditions, the role of the two cues appears to be similar in severe noise. 
It was suggested that not only the availability of formants, but also details of the vowel spectral shape can help to predict vowel recognition in severe noise to a certain degree. / Dissertation (MEng)--University of Pretoria, 2010. / Electrical, Electronic and Computer Engineering / unrestricted Formant Multidimensional scaling Acoustic model Cochlear implant Speech-shaped noise Acoustic cues UCTD
33	The Relationship Between Acoustic and Kinematic Measures of Diphthong Production Jang, Gwi-Ok 29 June 2010 (has links) (PDF) The purpose of this study was to examine the correlation between acoustic and kinematic measures of diphthong production in 11 individuals with multiple sclerosis (MS) and 11 neurologically healthy control speakers. The participants produced four diphthongs: /ɔɪ/, /aʊ/, /aɪ/, /eɪ/. These sounds were spoken in a sentence context. Their speech audio signal was recorded with a microphone and their tongue movements were recorded with a magnetic tracking system. The first and second formants (F1 and F2) were computed with acoustic analysis software, and these signals were time-aligned with the vertical and anteroposterior magnet movement records. Pearson correlations between F1 and the magnet's vertical movement and between F2 and anteroposterior movement were computed for the individual diphthongs. The results of this study revealed an often non-linear relationship between the acoustic and kinematic measures. The degree to which the formant measures predicted the lingual movements varied across speakers and also during the on-glide, transition, and off-glide phases of the diphthongs. The findings of this study suggest that the relationship between formants and tongue movements is more complex than would be predicted from the theoretical origins of F1 and F2. Thus, researchers should be aware that acoustic parameters might not always accurately reflect the physical movements of articulators. acoustic analysis kinematic analysis multiple sclerosis MS formant tongue movement diphthong Communication Sciences and Disorders
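As a rough illustration of the analysis summarized in entry 33, the sketch below computes a Pearson correlation between a formant track and a time-aligned articulator movement record. It assumes two equal-length, already time-aligned series; the function name `pearson_r` and the sample values are hypothetical and are not data or code from the thesis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each sample from its series mean
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented, illustrative samples (assumption): F1 in Hz and vertical
# magnet position in mm, sampled at the same time points.
f1_hz = [700.0, 650.0, 600.0, 520.0, 450.0, 400.0]
tongue_height_mm = [2.0, 3.1, 4.5, 6.0, 7.2, 8.0]

r = pearson_r(f1_hz, tongue_height_mm)
print(f"r = {r:.3f}")
```

A single coefficient like this only captures a linear relationship, which is why the often non-linear relationship reported in the abstract would show up as weaker or varying correlations across speakers and diphthong phases.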
34	CROSS-RACIAL STUDIES OF HUMAN VOCAL TRACT DIMENSIONS AND FORMANT STRUCTURES Hao, Jianping 19 August 2002 (has links) No description available. Health Sciences, Speech Pathology Race Vocal Tract Dimension Formant Frequency Genoter
35	Vowel Production Abilities Of Haitian American Children Wallen, Stacey V. 12 September 2008 (has links) No description available. Acoustics Linguistics Multicultural Education Speech Therapy Haitian Kreyol vowels formant frequencies
36	The Influence of Human Facial and Vocal Features on Social Perceptions of Attractiveness, Dominance, and Leadership Ability Tigue, Cara 11 1900 (has links) Research shows that human facial and vocal features influence social perceptions of attractiveness and dominance. In general, more feminine facial and vocal features are perceived as more attractive in women and more masculine facial and vocal features are perceived as more attractive in men. More masculine facial and vocal features are generally perceived as more dominant in both women and men. Given that attractiveness and dominance closely relate to inter- and intra-sexual selection, respectively, and that leaders can influence an individual’s fitness, humans likely possess evolved mechanisms for assessing leadership ability. Thus, in prior work, facial and vocal features have been related to perceptions of leadership ability. In this dissertation, I address three previously unanswered questions. First, how do vocal acoustics influence perceptions of leaders and voting preferences? Second, how do vocal acoustics influence perceptions of leaders in different social contexts? Third, how do different methods of stimuli presentation influence the results of studies on face and voice perception? Herein, I demonstrate that participants prefer to vote for lower pitched men’s voices, and that it is unclear precisely how women’s voice pitch influences voting preferences. I also show that the influence of voice pitch on perceptions of leaders depends on the social context. Third, I establish that several methods of stimuli presentation are equally valid to use in studies on face and voice perception. Overall, the studies in this dissertation demonstrate that facial and vocal features influence perceptions of attractiveness, dominance, and leadership ability in a potentially adaptive manner. 
/ Thesis / Doctor of Philosophy (PhD) face voice attractiveness dominance leadership masculine feminine pitch formant perception acoustic vote politics war peace
37	Suivi de formants par analyse en multirésolution / Formant tracking by Multiresolution Analysis Jemâa, Imen 19 February 2013 (has links) Nos travaux de recherches présentés dans ce manuscrit ont pour objectif, l'optimisation des performances des algorithmes de suivi des formants. Pour ce faire, nous avons commencé par l'analyse des différentes techniques existantes utilisées dans le suivi automatique des formants. Cette analyse nous a permis de constater que l'estimation automatique des formants reste délicate malgré l'emploi de diverses techniques complexes. Vue la non disponibilité des bases de données de référence en langue arabe, nous avons élaboré un corpus phonétiquement équilibré en langue arabe tout en élaborant un étiquetage manuel phonétique et formantique. Ensuite, nous avons présenté nos deux nouvelles approches de suivi de formants dont la première est basée sur l'estimation des crêtes de Fourier (maxima de spectrogramme) ou des crêtes d'ondelettes (maxima de scalogramme) en utilisant comme contrainte de suivi le calcul de centre de gravité de la combinaison des fréquences candidates pour chaque formant, tandis que la deuxième approche de suivi est basée sur la programmation dynamique combinée avec le filtrage de Kalman. Finalement, nous avons fait une étude exploratrice en utilisant notre corpus étiqueté manuellement comme référence pour évaluer quantitativement nos deux nouvelles approches par rapport à d'autres méthodes automatiques de suivi de formants. Nous avons testé la première approche par détection des crêtes ondelette, utilisant le calcul de centre de gravité, sur des signaux synthétiques ensuite sur des signaux réels de notre corpus étiqueté en testant trois types d'ondelettes complexes (CMOR, SHAN et FBSP). Suite à ces différents tests, il apparaît que le suivi de formants et la résolution des scalogrammes donnés par les ondelettes CMOR et FBSP sont meilleurs qu'avec l'ondelette SHAN. 
Afin d'évaluer quantitativement nos deux approches, nous avons calculé la différence moyenne absolue et l'écart type normalisée. Nous avons fait plusieurs tests avec différents locuteurs (masculins et féminins) sur les différentes voyelles longues et courtes et la parole continue en prenant les signaux étiquetés issus de la base élaborée comme référence. Les résultats de suivi ont été ensuite comparés à ceux de la méthode par crêtes de Fourier en utilisant le calcul de centre de gravité, de l'analyse LPC combinée à des bancs de filtres de Mustafa Kamran et de l'analyse LPC dans le logiciel Praat. D'après les résultats obtenus sur les voyelles /a/ et /A/, nous avons constaté que le suivi fait par la méthode ondelette avec CMOR est globalement meilleur que celui des autres méthodes Praat et Fourier. Cette méthode donne donc un suivi de formants (F1, F2 et F3) pertinent et plus proche de suivi référence. Les résultats des méthodes Fourier et ondelette sont très proches dans certains cas puisque toutes les deux présentent moins d'erreurs que la méthode Praat pour les cinq locuteurs masculins ce qui n'est pas le cas pour les autres voyelles où il y a des erreurs qui se présentent parfois sur F2 et parfois sur F3. D'après les résultats obtenus sur la parole continue, nous avons constaté que dans le cas des locuteurs masculins, les résultats des deux nouvelles approches sont notamment meilleurs que ceux de la méthode LPC de Mustafa Kamran et ceux de Praat même si elles présentent souvent quelques erreurs sur F3. Elles sont aussi très proches de la méthode par détection de crêtes de Fourier utilisant le calcul de centre de gravité. Les résultats obtenus dans le cas des locutrices féminins confirment la tendance observée sur les locuteurs / The research presented in this thesis aims to optimize the performance of formant tracking algorithms. We began by analyzing the existing techniques used in automatic formant tracking. 
This analysis showed that automatic formant estimation remains difficult despite the use of complex techniques. Because no reference database was available for Arabic, we developed a phonetically balanced Arabic corpus with manual phonetic and formant labeling. We then presented our two new automatic formant tracking approaches. The first is based on the estimation of Fourier ridges (local maxima of the spectrogram) or wavelet ridges (local maxima of the scalogram), using as a tracking constraint the center of gravity of a set of candidate frequencies for each formant; the second is based on dynamic programming combined with Kalman filtering. Finally, we carried out an exploratory study using our manually labeled corpus as a reference to evaluate our two new approaches quantitatively against other automatic formant tracking methods. We tested the first approach, based on wavelet ridge detection with the center-of-gravity calculation, on synthetic signals and then on real signals from our database, testing three types of complex wavelets (CMOR, SHAN and FBSP). These tests indicate that the formant tracking and scalogram resolution given by the CMOR and FBSP wavelets are better than those of the SHAN wavelet. To evaluate our two approaches quantitatively, we calculated the mean absolute difference and the normalized standard deviation. We ran several tests with different speakers (male and female) on various long and short vowels and on continuous speech, taking the labeled signals from our database as the reference. The formant tracking results were compared with those of the Fourier ridge method using the center-of-gravity calculation, the LPC analysis combined with filter banks of Mustafa Kamran, and the LPC analysis integrated in the Praat software. For the vowels /a/ and /A/, we found that formant tracking by the wavelet method with CMOR is generally better than the Praat and Fourier methods. 
This method therefore gives formant tracks (F1, F2 and F3) that are relevant and closer to the reference. The results of the Fourier and wavelet methods are very similar in some cases, since both show fewer errors than the Praat method for the five male speakers; this is not the case for the other vowels, where errors sometimes appear on F2 and sometimes on F3. On continuous speech, we found that for male speakers the results of the two new approaches are notably better than those of Mustafa Kamran's LPC method and those of Praat, even though they often show a few errors on F3. They are also very close to the Fourier ridge method using the center-of-gravity calculation. The results obtained for female speakers confirm the trend observed for the male speakers. Parole Acoustique Représentation temps-fréquence Crêtes de Fourier Spectrogramme Crêtes d'ondelettes Scalogramme Suivi de formant Centre de gravité Programmation dynamique Filtrage de Kalman Speech Acoustic Time-frequency representation Fourier ridges Wavelet ridges Spectrogram Scalogram Formant tracking Centre of gravity Dynamic programming Kalman filtering 006.454 414
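The quantitative comparison described in entry 37 can be sketched as follows: a minimal example that scores one tracked formant against a manually labeled reference using the mean absolute difference and the standard deviation of the per-frame errors. The function name `tracking_error`, the frame values, and the use of a plain (unnormalized) population standard deviation are assumptions; the abstract does not specify the exact normalization used in the thesis.

```python
import statistics


def tracking_error(tracked_hz, reference_hz):
    """Return (mean absolute difference, std of absolute differences) in Hz."""
    if len(tracked_hz) != len(reference_hz) or not tracked_hz:
        raise ValueError("need two non-empty, equal-length formant tracks")
    # Per-frame absolute deviation from the reference track
    abs_diffs = [abs(t - r) for t, r in zip(tracked_hz, reference_hz)]
    mean_abs = statistics.fmean(abs_diffs)
    # Population standard deviation of the per-frame errors (assumed choice)
    std = statistics.pstdev(abs_diffs)
    return mean_abs, std


# Invented example frames (assumption): an F1 track from some tracker
# versus a hand-labeled reference, both in Hz.
tracked_f1 = [510.0, 495.0, 520.0, 480.0]
reference_f1 = [500.0, 500.0, 500.0, 500.0]

mean_abs, std = tracking_error(tracked_f1, reference_f1)
print(f"mean |diff| = {mean_abs:.2f} Hz, std = {std:.2f} Hz")
```

Running the same function on the output of each tracker (for example wavelet ridges, Fourier ridges, LPC) against the same reference gives directly comparable per-formant scores.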
38	L’empreinte du septénaire : mise en discours et énonciation, Genèse 1-11 et Apocalypse 5-8 / The footprint of the septenary : discursivization and enunciation, Genesis 1-11 and 5-8 Apocalypse Giroud, Jean-Claude 16 April 2014 (has links) De quelle mise en discours procède l’ensemble des récits pluriels qui narrent les « origines » dans le livre de la Genèse de la littérature biblique pour composer ce qui est présenté comme un « cycle » ordonné ? Cette question conduit à formuler l’hypothèse selon laquelle le premier récit de création ou « septénaire des jours » constitue, tel un paradigme, un système apte à ordonner le déploiement des récits suivants (Gn 2 à 11). Surtout, l’originalité de ce système est de mettre en place des mécanismes d’orientation vers une instance d’énonciation, constituant ainsi une « empreinte ». Et la mise en discours des textes qui suivent dispose, par les parcours discursifs, des « formants-signifiants » d’ordre « figural » propres à rappeler cette empreinte et à fonctionner comme autant d’indicateurs de l’instance d’énonciation. Enfin, au terme du livre biblique, le septénaire des « sceaux » (Apocalypse 5-8) vient réexposer le paradigme, récapituler le « figural », renouveler l’indication de l’instance d’énonciation et redéfinir l’orientation vers cette instance. La mise en discours opère ainsi, par le figural, un véritable « nouage » entre les grandeurs figuratives et l’instance d’énonciation. En son figural, la figure devient achoppement balisant la lecture, orientant le lecteur pour le conduire au plus près des postures des sujets de l’instance d’énonciation. / What procedure of discursivization makes it possible to present the ensemble of varied narratives recounting “origins,” in the book of Genesis of the biblical literature, as an orderly “cycle”? 
This question leads to the formulation of the hypothesis that the first story of creation, the “septenary of days,” constitutes, as a paradigm, a system capable of structuring the deployment of the following narratives (Gn 2 – 11). The originality of this system consists, above all, in putting in place mechanisms of orientation toward an instance of enunciation which constitutes a “footprint.” Then, the discursivization of the subsequent texts, by means of the discursive path, employs “formant-signifiers” of the “figural” order, likely to recall this footprint and to function as so many indicators of the instance of enunciation. Finally, at the end of the biblical literature, the “septenary of seals” in the book of Revelation (Apocalypse 5-8) reexposes the paradigm, recapitulates the figural, renews the indication of the instance of enunciation, and redefines the orientation toward this instance. Thus, by means of the figural, discursivization ties a veritable “knot” between figurative values and instance of enunciation. In its figural, the figure becomes a stumbling stone along the path of the reading which orients the reader, leading him as close as possible to the positions of the subjects of the instance of enunciation. Sémiotique Mise en discours Enonciation Figure Figural Parcours figuratif Parcours discursif Formant Signifiant Paradigme Littérature biblique Semiotics Discursivization Enunciation Figure Figural Figurative path Discursive path Formant Signifier Paradigm Biblical literature
39	Harmonický posun výšky tónu / Harmonic pitch shifting Cihelková, Tereza January 2011 (has links) This thesis describes the design and implementation of an audio effect for pitch shifting of monophonic singing signals. The effect can generate two pitch-shifted voices from the input signal in real time while preserving formants. The amount of shift can be controlled via a MIDI controller. The effect is implemented as a VST module in the form of a dynamic-link library. This work also includes a theoretical introduction to the related DSP techniques.
40	Mécanismes laryngés et voyelles en voix chantée. Dynamique vocale, phonétogrammes de paramètres glottiques et spectraux, transitions de mécanismes. Lamesch, Sylvain 18 January 2010 (has links) (PDF) This thesis addresses the influence of the vowel on the laryngeal mechanisms (M1 and M2) in the singing voice. We observed that singers associate /a/ with M1 and /i/ with M2. We then looked for physiological and acoustic correlates by studying the influence of vowels on the phonetogram limits, on several source and spectral parameters, and on the transitions between mechanisms. The upper limit of the phonetogram is 10 dB higher for /a/ than for /i/ in M1, but not in M2. The M2 phonetogram is therefore shifted toward lower levels, relative to that of M1, for /a/ but not for /i/. This shift is due in part to the difference in open quotient values between M1 and M2. In addition, the amplitude of the electroglottographic signal increases with intensity and is larger for /i/ than for /a/, revealing glottal differences in vowel production at the same pitch and intensity. The links between vowels and the vertical position of the larynx depend on the singers' vocal expertise. The distribution of spectral energy is studied by computing the ratio (ER) of the energy in the singer's formant band (FB2) or in the high frequencies (FB3) to the total energy. A singer's formant as intense in M2 as in M1 can be obtained. ER(FB2) can saturate at high levels, depending on the vowel, the mechanism and vocal expertise. ER(FB3) is lower in M2 than in M1. The frequency interval of M1->M2 jumps increases with intensity but not with pitch. This is not observed in the M2->M1 direction. The frequency at which the transition is triggered is lower for /i/ than for /a/. 
singing voice laryngeal mechanism register vowel phonetogram glottal source singer's formant frequency jump
