1 |
Robust Formant tracking for Continuous Speech with Speaker Variability / Robust Formant tracking for Continuous Speech
Mustafa, Kamran 12 1900
Exposure to loud sounds can damage the inner ear, degrading the neural response to speech and to formant frequencies in particular. This may reduce the intelligibility of speech. An amplification scheme for hearing aids, called Contrast Enhanced Frequency Shaping (CEFS), may improve speech perception for ears with sound-induced hearing damage. CEFS takes into account across-frequency distortions introduced by the impaired ear and requires accurate and robust formant frequency estimates to allow dynamic, speech-spectrum-dependent amplification of speech in hearing aids. Several algorithms have been developed for extracting formant information from speech signals; however, most of them are either not robust in real-life noise environments or not suitable for real-time implementation. The algorithm proposed in this thesis extracts formants from continuous speech by using a time-varying adaptive filterbank to track and estimate individual formant frequencies. The formant tracker incorporates an adaptive voicing detector and a gender detector for robust formant extraction from continuous speech, for both male and female speakers, in the presence of background noise. Thorough testing of the algorithm on various speech sentences has shown promising results over a wide range of SNRs for various types of background noise, such as additive white Gaussian noise (AWGN), single and multiple competing background speakers, and various other environmental sounds. / Thesis / Master of Applied Science (MASc)
|
2 |
Odhad formantových kmitočtů pomocí strojového učení / Estimation of formant frequencies using machine learning
Káčerová, Erika January 2019
This Master's thesis deals with formant extraction. A set of Matlab scripts is created to generate values of the first three formant frequencies from speech recordings using Praat and Snack (WaveSurfer). Mel-frequency cepstral coefficients and linear predictive coefficients are extracted from the audio files and added to a database. This database is then used to train a neural network. Finally, the designed neural network is tested.
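The linear predictive coefficients mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the autocorrelation method with the Levinson-Durbin recursion, which the abstract does not confirm as the method used in the thesis; the function names `autocorrelation`, `lpc`, and `resonance_hz` are hypothetical and chosen only for this illustration.

```python
import math


def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one signal frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Levinson-Durbin recursion (assumed method, not taken from the thesis).

    Returns [1, a1, ..., a_order] for the prediction-error filter
    A(z) = 1 + a1*z^-1 + ... + a_order*z^-order.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:
            break  # silent frame: nothing left to predict
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def resonance_hz(a1, a2, sample_rate):
    """Centre frequency of the complex pole pair of 1 + a1*z^-1 + a2*z^-2."""
    if a2 <= 0.0:
        raise ValueError("a2 must be positive for a complex pole pair")
    radius = math.sqrt(a2)
    cos_angle = -a1 / (2.0 * radius)
    if abs(cos_angle) > 1.0:
        raise ValueError("poles are real, not a resonance")
    return math.acos(cos_angle) * sample_rate / (2.0 * math.pi)
```

For a second-order model the single resonance can be read directly from the two coefficients, as `resonance_hz` does. Estimating three formants from real speech would need a higher order and a polynomial root finder, which this sketch leaves out.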
|
3 |
Organogel à base d'un dérivé de la L-alanine pour la libération prolongée de leuprolide : étude pharmacocinétique et pharmacodynamique chez le rat [Organogel based on an L-alanine derivative for the sustained release of leuprolide: a pharmacokinetic and pharmacodynamic study in the rat]
Plourde, François January 2006
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
|
4 |
An acoustically-driven vocal tract model for stop consonant production
Story, Brad H., Bunton, Kate 03 1900
The purpose of this study was to further develop a multi-tier model of the vocal tract area function in which the modulations of shape to produce speech are generated by the product of a vowel substrate and a consonant superposition function. The new approach consists of specifying input parameters for a target consonant as a set of directional changes in the resonance frequencies of the vowel substrate. Using calculations of acoustic sensitivity functions, these "resonance deflection patterns" are transformed into time-varying deformations of the vocal tract shape without any direct specification of the location or extent of the consonant constriction along the vocal tract. The configurations of constrictions and expansions generated by this process were shown to be physiologically realistic and to produce speech sounds that are easily identifiable as the target consonants. This model is a useful enhancement for area function-based synthesis and can serve as a tool for understanding how the vocal tract is shaped by a talker during speech production. (C) 2016 Elsevier B.V. All rights reserved.
|
5 |
Effects of Nasalance on the Acoustics of the Tenor Passaggio and Head Voice
Perna, Nicholas K. 21 April 2008
PERNA, NICHOLAS (D.M.A., Vocal Pedagogy and Performance). Effects of Nasalance on the Acoustical Properties of the Tenor Passaggio and Head Voice (May 2008). Abstract of a doctoral essay at the University of Miami. Doctoral essay supervised by Professor David Alt and Professor Rachel L. Lebon. No. of pages in text: 73.
This study aims to measure the effect that nasality has on the acoustical properties of the tenor passaggio and head voice. Not to be confused with forward resonance, nasality is defined here as nasalance, the reading of a Nasometer, or the percentage of nasal and oral airflow during phonation. A previous study by Peer Birch et al. showed that professional tenors used higher percentages of nasalance through their passaggio. The authors hypothesized that tenors used nasalance to make slight timbral adjustments as they ascended through the passaggio. Other well-respected authors, including Richard Miller and William McIver, have claimed that teaching registration is the most important component of training young tenors. It therefore seemed logical to measure the acoustic effects of nasalance on the tenor passaggio and head voice. Eight professional operatic tenors participated as subjects, performing numerous vocal exercises that demonstrated various registration events. These examples were recorded and analyzed using a Nasometer and Voce Vista Pro software. Tenors generally showed an increase in nasalance during an ascending B-flat major scale on the vowels [i] and [u]. Perhaps the most revealing result was that six of seven tenors showed at least a 5-10% increase in nasalance on the note after their primary register transition on the vowel [a]. It is suggested that this phenomenon receive further empirical scrutiny because, if true, pedagogues could use nasalance as a tool for helping a young tenor ascend through his passaggio.
|
6 |
Examining Listeners' Ability to Perceive Vowel-Inherent Spectral Changes
Chiddenton, Kathleen 22 March 2013
One family of theories regarding vowel perception suggests that onset and offset formant frequencies are important for identification and that the shape of the transitions themselves is not otherwise perceptually important. The present study determined just-noticeable differences in deviations from linear formant trajectories. Diphthong-like stimuli were manipulated by inserting a point of inflection into the otherwise linear transition. Several parameters were manipulated, including vowel duration, location of the inflection point in time, and fundamental frequency. Data from the first experiment indicate that listeners are largely insensitive to deviations from linearity of the formant trajectory but that large enough deviations can eventually be detected. The size of these deviations seems to depend on the range of onset-offset formant frequencies. However, a second experiment, in which only the first half of each stimulus was presented, thereby affecting the frequency range of the stimuli, gave different results. Results from these experiments are presented along with several hypotheses.
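The stimulus manipulation described in the abstract above, a single inflection point inserted into an otherwise linear formant transition, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the inflection was parameterised, so the function name `inflected_trajectory`, the frame-based time axis, and the `deviation_hz` offset from the straight line are all hypothetical.

```python
def inflected_trajectory(onset_hz, offset_hz, n_frames,
                         inflection_frac, deviation_hz):
    """Piecewise-linear formant track with one inflection point.

    Assumed parameterisation (not taken from the study): the inflection
    sits at inflection_frac of the duration (strictly between 0 and 1)
    and is displaced by deviation_hz from the straight onset-offset line.
    """
    if n_frames < 3:
        raise ValueError("need at least 3 frames")
    if not 0.0 < inflection_frac < 1.0:
        raise ValueError("inflection_frac must lie strictly between 0 and 1")
    last = n_frames - 1
    # Snap the inflection to an interior frame.
    knee = min(max(round(inflection_frac * last), 1), last - 1)
    # Value of the straight line at the knee, plus the deviation.
    knee_hz = onset_hz + (offset_hz - onset_hz) * (knee / last) + deviation_hz
    track = []
    for i in range(n_frames):
        if i <= knee:
            # First segment: onset to inflection point.
            track.append(onset_hz + (knee_hz - onset_hz) * (i / knee))
        else:
            # Second segment: inflection point to offset.
            frac = (i - knee) / (last - knee)
            track.append(knee_hz + (offset_hz - knee_hz) * frac)
    return track
```

With `deviation_hz` set to zero the track reduces to the plain linear transition, so a just-noticeable-difference procedure could vary that one parameter while onset, offset, and duration stay fixed.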
|
7 |
Multimodal Targets in Speech Production: Acoustic, Articulatory and Dynamic Evidence from Formant Perturbation
Neufeld, Chris 05 December 2013
This thesis presents evidence from a formant perturbation experiment that supports the hypothesis that speech targets are multimodal. A real-time auditory feedback perturbation is used to gradually shift English speakers' formants from the vowel /E/ towards /I/. Most speakers compensate at the level of acoustics, adjusting their production towards /ae/ such that they hear themselves producing the correct vowel. Subjects' articulation is tracked with electromagnetic articulography. The articulatory data show that subjects tend to produce marginal /E/s at the level of articulation: they remain within the normal articulatory bounds for that vowel while adjusting the position of individual articulators enough to compensate acoustically for the perturbation. The higher-order relationship between speed and curvature is shown to differ across vowel phonemes. However, this measure remains constant under formant perturbation. These findings are argued to show that phonemic targets are multimodal, having acoustical, kinematic, and dynamic components.
|
9 |
Voice Parameters That Result in Identification or Misidentification of Biological Gender in Male-to-Female Transgender Veterans
King, Robert S., Brown, George R., McCrea, Christopher R. 01 May 2012
The objective of this study was to examine the voice parameters of male-to-female (MtF) transgender veterans and biological females that can result in identification or misidentification of biological gender. Twenty-one MtF transgender veterans and 9 cisgender females were enrolled. The interaction of speaking fundamental frequency (SFo) and formant (resonatory) frequencies in gender discrimination was investigated. The results indicated that an average SFo above 180 Hz and a speaking pitch range of approximately 140 to 300 Hz appear to be the most powerful acoustic features or markers in the perception of a female voice in a biological male (M. L. Brown & Rounsley, 1996). An SFo of approximately 170 Hz appears to be the lower limit that would result in a biological male being perceived as having a female voice by most listeners. A slight elevation in the second (F2) and third (F3) formants was noted but does not appear to have a significant influence on the perception of a female voice in biological males. Female voices appear to be perceived as male by most listeners if the average SFo is at or below 165 Hz, the low SFo is below 130 Hz, and a low F3 is exhibited. No evidence was found that jitter (frequency perturbation) and shimmer (amplitude perturbation) affect the perception of a female or male voice in a biological male. The results support previous research showing that elevated pitch is the strongest acoustic marker in the perception of a female voice in biological males.
|
10 |
Patterns of anticipatory coarticulation in adults and typically developing children
Boucher, Kurtt R. 26 June 2007
Coarticulation is the kinematic and spectral overlap between adjacent sounds during speech production. Coarticulation patterns in typical adults are well established; however, the manner in which coarticulation develops in children is still unclear. Research has provided conflicting views, showing that children exhibit more, less, or an equal degree of coarticulation compared with adult speakers. Given these divergent findings in the literature, the purpose of the present study is to further investigate anticipatory coarticulation in typically developing children between the ages of three and six years. This study focuses on the acoustic characteristics of an unstressed vowel, the schwa, prior to a series of real words. Results indicate that children exhibit adult-like patterns of coarticulation even at a relatively young age. However, the degree of anticipatory coarticulation depends on the phonemic context, with greater differences evident in a fricative context and smaller differences when the schwa is followed by a stop consonant.
|