Global ETD Search

21	Pronunciation support for Arabic learners Alsabaan, Majed Soliman K. January 2015 (has links) The aim of the thesis is to find out whether providing feedback to Arabic language learners helps them improve their pronunciation, particularly of words involving sounds that are not distinguished in their native languages. It also aims to find out, if possible, which type of feedback is most helpful. To this end, we developed a computational tool made up of several sub-tools, each a substantial piece of software. The first task was to ensure that the system could distinguish between the more challenging sounds when they were produced by a native speaker, since without that it would not be possible to classify learners’ attempts at these sounds. A number of experiments were therefore carried out with the hidden Markov model toolkit (HTK), a well-known speech recognition toolkit, to ensure that it can distinguish between the confusable sounds, i.e. the ones that people have difficulty with. The tool analyses the differences between the user’s pronunciation and that of a native speaker by using a grammar of minimal pairs, where each utterance is treated as coming from a family of similar words. This makes it possible to categorise learners’ errors: if someone is trying to say cat and the recogniser thinks they have said cad, then it is likely that they are voicing the final consonant when it should be unvoiced. Extensive testing shows that the system can reliably distinguish such minimal pairs when they are produced by a native speaker, and that this approach does provide effective diagnostic information about errors. The tool provides feedback through three different sub-tools: as an animation of the vocal tract, as a synthesised version of the target utterance, and as a set of written instructions.
The tool was evaluated by placing it in a classroom setting and asking 50 Arabic students to use the different versions of the tool. Each student had a thirty-minute session with the tool, working through a set of pronunciation exercises at their own pace. The results showed that the students’ pronunciation improved over the course of the session, though it was not possible to determine whether the improvement is sustained over an extended period. The evaluation was done from three points of view: quantitative analysis, qualitative analysis, and a questionnaire. Firstly, the quantitative analysis gives raw numbers showing whether a learner improved their pronunciation or not. Secondly, the qualitative analysis shows the behaviour pattern of what a learner did and how they used the tool. Thirdly, the questionnaire gives feedback from learners and their comments about the tool. We found that providing feedback does appear to help Arabic language learners, but we did not have enough data to see which form of feedback is most helpful. However, we provided an informative analysis of behaviour patterns showing how Arabic students used the tool and interacted with it, which could be useful for further data analysis.
22	CROSS-RACIAL STUDIES OF HUMAN VOCAL TRACT DIMENSIONS AND FORMANT STRUCTURES Hao, Jianping 19 August 2002 (has links) No description available. Health Sciences, Speech Pathology Race Vocal Tract Dimension Formant Frequency Genoter
23	Bandwidths of vocal tract resonances in physical models compared to transmission-line simulations Birkholz, Peter, Blandin, Remi, Kürbis, Steffen 10 January 2025 (has links) This study investigated how the bandwidths of resonances simulated by transmission-line models of the vocal tract compare to bandwidths measured from physical three-dimensional printed vowel resonators. Three types of physical resonators were examined: models with realistic vocal tract shapes based on Magnetic Resonance Imaging (MRI) data, straight axisymmetric tubes with varying cross-sectional areas, and two-tube approximations of the vocal tract with notched lips. All physical models had hard walls and closed glottis so the main loss mechanisms contributing to the bandwidths were sound radiation, viscosity, and heat conduction. These losses were accordingly included in the simulations, in two variants: A coarse approximation of the losses with frequency-independent lumped elements, and a detailed, theoretically more precise loss model. Across the examined frequency range from 0 to 5 kHz, the resonance bandwidths increased systematically from the simulations with the coarse loss model to the simulations with the detailed loss model, to the tube-shaped physical resonators, and to the MRI-based resonators. This indicates that the simulated losses, especially the commonly used approximations, underestimate the real losses in physical resonators. Hence, more realistic acoustic simulations of the vocal tract require improved models for viscous and radiation losses. info:eu-repo/classification/ddc/530 ddc:530
24	Investigation of the influence of the torso, lips and vocal tract configuration on speech directivity using measurements from a custom head and torso simulator Blandin, Rémi, Geng, Jingyan, Birkholz, Peter 10 January 2025 (has links) The human voice is a directional sound source. This property has been explored for more than 200 years, mainly using measurements of human participants. Some efforts have been made to understand the anatomical parameters that influence speech directivity, e.g., the mouth opening, diffraction and reflections due to the head and torso, the lips and the vocal tract. However, these parameters have mostly been studied separately, without being integrated into a complete model or replica. The aim of this work was to study the combined influence of the torso, the lips and the vocal tract geometry on speech directivity. For this purpose, a simplified head and torso simulator was built; this simulator made it possible to vary these parameters independently. It consisted of two spheres representing the head and the torso into which vocal tract replicas with or without lips could be inserted. The directivity patterns were measured in an anechoic room with a turntable and a microphone that could be placed at different angular positions. Different effects such as torso diffraction and reflections, the correlation of the mouth dimensions with directionality, the higher-order modes and the increase in directionality due to the lips were confirmed and further documented. Interactions between the different parameters were found. It was observed that torso diffraction and reflections were enhanced by the presence of the lips, that they could be modified or masked by the effect of higher-order modes and that the lips tend to attenuate the effect of higher-order modes. info:eu-repo/classification/ddc/530 ddc:530
25	How to precisely measure the volume velocity transfer function of physical vocal tract models by external excitation Fleischer, Mario, Mainka, Alexander, Kürbis, Steffen, Birkholz, Peter 30 July 2018 (has links) (PDF) Recently, 3D printing has been increasingly used to create physical models of the vocal tract with geometries obtained from magnetic resonance imaging. These printed models allow measuring the vocal tract transfer function, which is not reliably possible in vivo for the vocal tract of living humans. The transfer functions enable the detailed examination of the acoustic effects of specific articulatory strategies in speaking and singing, and the validation of acoustic plane-wave models for realistic vocal tract geometries in articulatory speech synthesis. To measure the acoustic transfer function of 3D-printed models, two techniques have been described: (1) excitation of the models with a broadband sound source at the glottis and measurement of the sound pressure radiated from the lips, and (2) excitation of the models with an external source in front of the lips and measurement of the sound pressure inside the models at the glottal end. The former method is more frequently used and more intuitive due to its similarity to speech production. However, the latter method avoids the intricate problem of constructing a suitable broadband glottal source and is therefore more effective. It has been shown to yield a transfer function similar, but not exactly equal to the volume velocity transfer function between the glottis and the lips, which is usually used to characterize vocal tract acoustics. Here, we revisit this method and show both, theoretically and experimentally, how it can be extended to yield the precise volume velocity transfer function of the vocal tract. 
vocal tract transfer function volume velocity resonances piriform sinus vocal tract geometries TU Dresden Publishing Fund ddc:610 rvk:XA 10000
26	How to precisely measure the volume velocity transfer function of physical vocal tract models by external excitation Fleischer, Mario, Mainka, Alexander, Kürbis, Steffen, Birkholz, Peter 30 July 2018 (has links) Recently, 3D printing has been increasingly used to create physical models of the vocal tract with geometries obtained from magnetic resonance imaging. These printed models allow measuring the vocal tract transfer function, which is not reliably possible in vivo for the vocal tract of living humans. The transfer functions enable the detailed examination of the acoustic effects of specific articulatory strategies in speaking and singing, and the validation of acoustic plane-wave models for realistic vocal tract geometries in articulatory speech synthesis. To measure the acoustic transfer function of 3D-printed models, two techniques have been described: (1) excitation of the models with a broadband sound source at the glottis and measurement of the sound pressure radiated from the lips, and (2) excitation of the models with an external source in front of the lips and measurement of the sound pressure inside the models at the glottal end. The former method is more frequently used and more intuitive due to its similarity to speech production. However, the latter method avoids the intricate problem of constructing a suitable broadband glottal source and is therefore more effective. It has been shown to yield a transfer function similar, but not exactly equal to the volume velocity transfer function between the glottis and the lips, which is usually used to characterize vocal tract acoustics. Here, we revisit this method and show both, theoretically and experimentally, how it can be extended to yield the precise volume velocity transfer function of the vocal tract. info:eu-repo/classification/ddc/610 ddc:610
27	Análise da voz e do comportamento do trato vocal suplaglótico por meio visual, perceptivo-auditivo e acústico em mulheres disfônicas com diferentes configurações glóticas / Analysis of voice and supraglottic vocal behavior using visual, perceptual and acoustic methods in dysphonic women with different glottic configurations Nunes, Raquel Buzelin 26 August 2005 (has links) Objective: To analyze the voice and the behavior of the supraglottic vocal tract using visual assessment of vocal tract images and perceptual and acoustic analysis of the voice in dysphonic women with different glottic configurations. Method: The sample comprised 31 women, aged 20 to 45 years, with vocal complaints and voice disorders. Subjects underwent a brief assessment of the sensory-motor-oral system, videolaryngostroboscopy, nasofibrolaryngoscopy, and voice recording. First, the otorhinolaryngologist reviewed the medical reports and selected exams that showed similar laryngeal alterations, forming three distinct groups (bilateral nodules, mucosal cover lesions, and chinks). A VHS tape with the nasofibrolaryngoscopic exams was edited for visual analysis of the vocal tract, and a CD with the voices was edited for perceptual and acoustic analysis. Visual analysis was carried out by consensus by three speech therapists and three otorhinolaryngologists, who checked the following parameters: supraglottic constriction, vertical laryngeal mobility, pharyngeal constriction and tongue mobility. Perceptual analysis was carried out by the same speech therapists using the Vocal Profile Analysis Scheme (VPAS) and an assessment of pitch, loudness and resonance. Formant frequency values were extracted with the CSL software (Kay Elemetrics). Data from the visual, perceptual and acoustic analyses were analyzed statistically.
Results: The visual analysis did not show statistically significant differences that could distinguish the groups. Thus, the results were described for the total group of 31 subjects. Among the analyzed parameters, supraglottic constriction was present in 21 subjects. Vertical laryngeal mobility was average in 12 subjects and restricted in 16. Sixteen subjects presented pharyngeal constriction: nine circular and seven lateral. The position of the tongue remained neutral in 27 subjects. Analysis of vocal quality using VPAS revealed that the predominant findings were high larynx position, labiodentalization, stretched lips, closed jaw, advanced tongue tip, and phonatory adjustments of breathiness, harsh voice and hyperfunction. Cluster factor analysis showed that supraglottic adjustments were decisive in characterizing vocal quality. Pitch ranged from low to medium-low in the 28 subjects analyzed. Loudness was high in 23 subjects and resonance was low in 20 subjects. In the acoustic analysis, formant values were within the normal reference range. Conclusion: Vocal quality is not exclusively related to a specific glottic alteration, but rather to the behavior of the whole vocal tract, which shows the need to consider supraglottic adjustments when characterizing the vocal quality of dysphonic subjects. vocal tract voice quality dysphonia voice disorders CNPQ::CIENCIAS DA SAUDE::FONOAUDIOLOGIA
28	Time-Varying Modeling of Glottal Source and Vocal Tract and Sequential Bayesian Estimation of Model Parameters for Speech Synthesis January 2018 (has links) Speech is generated by articulators acting on a phonatory source. Identifying this phonatory source and the articulatory geometry are individually challenging and ill-posed problems, called speech separation and articulatory inversion, respectively. There is a trade-off between the decomposition and the recovered articulatory geometry because multiple mappings are possible between an articulatory configuration and the speech produced. Moreover, if measurements are obtained only from a microphone sensor, they lack any invasive insight and add a further challenge to an already difficult problem. A joint non-invasive estimation strategy that couples articulatory and phonatory knowledge would lead to better articulatory speech synthesis. In this thesis, a joint estimation strategy for speech separation and articulatory geometry recovery is studied. Unlike previous periodic/aperiodic decomposition methods that use stationary speech models within a frame, the proposed model presents a non-stationary speech decomposition method. A parametric glottal source model and an articulatory vocal tract response are represented in a dynamic state space formulation. The unknown parameters of the speech generation components are estimated using sequential Monte Carlo methods under some specific assumptions. The proposed approach is compared with other glottal inverse filtering methods, including iterative adaptive inverse filtering, state-space inverse filtering, and the quasi-closed phase method. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2018 Electrical engineering Acoustic to articulatory inversion articulatory synthesis blind deconvolution Glottal inverse filtering speech synthesis Vocal tract estimation
29	An Acoustically Oriented Vocal-Tract Model ITAKURA, Fumitada, TAKEDA, Kazuya, YEHIA, Hani C. 20 August 1996 (has links) No description available. articulatory-to-acoustic inverse problem singular value decomposition independent component analysis factor analysis formant frequencies vocal-tract log-area function
30	Theoretical Framework for Modeling Ingressive Phonation Brougham, Michael V Unknown Date No description available. Aeroacoustic coupling Modal analysis Aerodynamic state feedback Vocal tract and subglottal resonators Mucosal wave
