Global ETD Search

1	A comparative analysis of gaussian mixture models and i-vector for speaker verification under mismatched conditions Avila, Anderson Raymundo January 2014 Advisor: Prof. Dr. Francisco J. Fraga / Dissertation (master's) - Universidade Federal do ABC, Programa de Pós-Graduação em Engenharia da Informação, 2014. / Most speaker verification systems are based on Gaussian mixture models and, more recently, on the so-called i-vector. Both methods are affected by mismatched test-train conditions, which can be caused by vocal-effort variability, different speaking styles, or channel effects. In this work, we compared the impact of speech rate variation and room reverberation on both methods. We found that the performance degradation due to speech rate variation can be mitigated by adding fast speech samples to the training set, which decreased equal error rates for both Gaussian mixture models and i-vector. Regarding reverberation, we investigated the performance of both methods when three different reverberation compensation techniques are applied to overcome performance degradation. The results showed that having reverberant background models separated by different levels of reverberation can benefit both methods, with the i-vector providing the best performance in that scenario. Finally, the performance of two auditory-inspired features, mel-frequency cepstral coefficients and the so-called modulation spectrum features, is compared in the presence of room reverberation. For the speaker verification system considered in this work, modulation spectrum features are equally affected by reverberation time and have their performance degraded as the level of reverberation increases. / AUTOMATIC SPEAKER VERIFICATION; SPEECH RATE; REVERBERATION
2	Estudo da estruturação prosódica de repórteres da TV Universitária - Unicamp antes e após intervenção fonoaudiológica / Study of the prosodic organization of announcements of TV broadcasting from the Unicamp Universitary TV before and after broadcast training Constantini, Ana Carolina, 1985- 11 April 2010 Advisor: Plínio Almeida Barbosa / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Estudos da Linguagem / Summary: The main objective of this work was to study the rhythmic structuring of utterances by reporters from TV Universitária - Unicamp who underwent a speech-language therapy intervention for vocal improvement. The project was approved by the Research Ethics Committee of the FCM (CEP/FCM/UNICAMP 211/2005). The rhythmic structuring of these utterances was studied by analyzing the evolution of duration peaks of syllable-sized units (henceforth VV units), which comprise the acoustic segment running from one vowel onset to the immediately following vowel onset, including the consonants between them (Barbosa, 2006). VV units reveal information about the rhythmic structuring of utterances (Barbosa, 1994; 1996), which justifies studying these units in television news utterances in order to understand the rhythmic organization of this type of narration. Building on the analysis of the evolution of VV unit duration in television news speech, further objectives were set, such as the relationship between duration and the intonation contour of the read text, for whose notation we used the DaTo system (Lucente, 2007; 2008). Perceptual tests were then applied to investigate how listeners perceived the strategies used by the subjects studied. 
Another point studied was the relationship between syntax and prosody, based on Dependency Grammar (Tesnière, 1965) and its use on Brazilian Portuguese data (Barbosa, 2006). After analyzing the results, we show that the subjects make consistent changes at specific points of the narrated text in the post-intervention condition. The changes are characterized by an increase in the duration of VV units in specific phrasal stress positions and by an increase in the variation of the fundamental frequency contour to produce emphasis. These same emphases were judged by listeners in order to determine the perceptual repercussion of the changes observed in the acoustic analysis. Regarding the data correlating syntax and prosody, the most frequent dependency marks were IDF (a marker between the end of one sentence and the beginning of the next) and DfD (a marker found between a governor, for example a noun, and a governed element, for example a postposed adjective). The results show that after vocal training the subjects became more expressive when narrating the text and improved their knowledge of the prosodic parameters that can be used in their narrations to compose the phonostyle they wish to adopt / Abstract: The objective of this work was to study the rhythmic organization of TV broadcast announcements from the Unicamp university TV station. Two journalism students took part in vocal training workshops with a speech therapist. The workshops were intended to improve the phonoarticulatory aspects involved in journalistic announcement. The rhythmic organization of the announcements was studied by analyzing the duration of Vowel-to-Vowel units (VV units) throughout the utterances before and after the training. These units reveal the rhythmic organization of announcements in Brazilian Portuguese (henceforth BP) and comprise the acoustic segments running from one vowel onset to the next vowel onset (Barbosa, 1994; 1996; 2006). 
Besides duration, the relationship between the melodic contour and the duration of VV units (using the DaTo notation system by Lucente, 2007; 2008) was also investigated. Perceptual tests were applied to reveal how listeners would interpret the strategies the subjects used to read the announcement. Another aspect studied was the relationship between prosody and syntax according to Dependency Grammar theory (DG theory; Tesnière, 1965), as used by Barbosa (2006) to investigate this relationship in BP. Results showed that the durations of specific VV units related to emphatic expression were longer in the utterances produced after training. These changes were located in stressed syllable positions, where the fundamental frequency also varies more in cases of particular emphasis. As to the relation with syntax, the most common DG marks found were IDF (between utterances) and DfD (between a head and its right dependent). Results showed that after training the subjects became more expressive when reading the announcement and improved their knowledge of the prosodic aspects involved in journalistic phonostyle / Master's / Linguistics / Master in Linguistics / Prosody (Linguistics); TV broadcast; Phonostylistics; Speech rhythm; Vocal training
