11 |
Spektrální vlastnosti zdrojového signálu jako údaje o identitě mluvčího / Spectral properties of the source signal as speaker-specific cues
Vaňková, Jitka, January 2012
Despite continuous development in computer science and related disciplines, speaker identification remains one of the most challenging tasks in forensic phonetics, because our knowledge of how identity is reflected in the acoustic signal is still limited. The present study aims to contribute to the search for speaker-specific cues by examining spectral properties of the source signal. Specifically, it examines to what extent three short-term measures of spectral tilt, namely H1-H2, H1-A1 and H1-A3, can discriminate 16 Czech female speakers. It also addresses the influence of vowel quality, the stress status of the syllable and the position of the stress group in the utterance on the values of these measures. The results show that these parameters do have some discriminative power, though the contribution of individual parameters differs. The study indicates that speaker discrimination is most successful in stressed syllables and argues that individual vowels could differ in their usefulness for speaker identification. The results of LDA based on these short-term measures of spectral tilt were complemented with long-term measures, namely the alpha index, the Kitzing index and the Hammarberg index, which quantify the slope of the LTAS. The present study suggests that...
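To make the first of these measures concrete: H1-H2 is conventionally the level difference, in dB, between the first and second harmonics of the spectrum. The Python sketch below is only an illustration under stated assumptions, not the procedure used in the thesis. It assumes that f0 is already known, uses a synthetic two-harmonic signal whose length spans whole periods, and applies no correction for formant influence; the function names are hypothetical.

```python
import math

def harmonic_amplitude(signal, sample_rate, freq):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def h1_minus_h2(signal, sample_rate, f0):
    """Level difference in dB between the first and second harmonic."""
    h1 = harmonic_amplitude(signal, sample_rate, f0)
    h2 = harmonic_amplitude(signal, sample_rate, 2 * f0)
    return 20.0 * math.log10(h1 / h2)

# Synthetic test signal (an assumption, not speech data):
# first harmonic with amplitude 1.0, second harmonic with amplitude 0.5.
sample_rate = 16000
f0 = 200.0
n_samples = 800  # exactly 10 periods of 200 Hz at 16 kHz
signal = [
    1.0 * math.sin(2 * math.pi * f0 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
    for i in range(n_samples)
]

tilt_db = h1_minus_h2(signal, sample_rate, f0)
print(round(tilt_db, 2))  # amplitude ratio 2:1, i.e. about 6.02 dB
```

On real speech the harmonics would have to be located from an f0 estimate in a windowed spectrum, so this sketch shows only the arithmetic of the measure.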
|
12 |
Identifikace mluvčího v temporální doméně řeči / Speaker identification in the temporal domain of speech
Weingartová, Lenka, January 2015
This thesis aims to describe thoroughly the temporal characteristics of spoken Czech by means of phone durations and their changes under the influence of several prosodic and segmental factors, such as position in a higher unit (syllable, word or prosodic phrase), length of the higher unit, segmental environment, structure of the syllable and phrase-final lengthening. The speech material comes from a semi-spontaneous corpus of scripted dialogues comprising 4046 utterances by 34 speakers. The descriptions are then used to create a rule-based temporal model, which provides a baseline for analysing local articulation rate contours and their speaker-specificity. The results indicate that systematic speaker-specific differences can be found in the segmental domain as well as in the temporal contours. The speaker identification potential of articulation rate and global temporal features is also assessed. Keywords: temporal characteristics, temporal modelling, phone duration, speaker identification, Czech
|
13 |
Medidas de duração de consoantes oclusivas como vestígios de fala em análise acústico-instrumental forense de amostras com e sem uso de disfarce / Stop consonants term measures as remains in forensic acoustic instrumental analysis of disguised and normal speech
Carneiro, Denise de Oliveira, 24 August 2016
Attributing authorship to speech from environmental recordings and telephone interceptions, which can be evidence of crimes such as drug dealing, racketeering, kidnapping, sexual abuse, pedophilia and corruption, may present experts with several difficulties in obtaining acoustic measures. One of these difficulties is voice disguise. Because their voices may be recorded, disguise has become common among crime perpetrators. A recorded voice can serve as evidence following a speaker comparison examination (SCE), which brings together methodologies for determining whether two speech samples were produced by the same speaker. SCE is carried out through auditory-perceptual analysis, acoustic-instrumental analysis and automatic recognition. Although automatic verification technologies have already been developed, analyses without human intervention do not yet have sufficient support, either because of poor signal quality or because of the scarcity of speech samples in databases. Research that underpins the other modes of analysis is therefore essential. Acoustic-instrumental analysis uses computational tools for the quantitative and qualitative evaluation of speech, and biomedical engineering enables the development of technologies and instrumentation for speech signal analysis. In search of an acoustic parameter that is robust to voice disguise, this research used duration measurements of segment phases, which have been little explored in SCE. The Brazilian Portuguese voiceless stops [p, t, k] are produced in three distinct phases: occlusion, release and formant transition. The first two phases have acoustic correlates that stand out visually in the oscillogram: relative silence and aperiodic wave production.
In this research, the speech of 20 subjects, 10 male and 10 female, aged between 25 and 55 years, was analysed instrumentally while they read, with and without disguise, a text that simulated a criminal situation. Occlusion and release durations of the voiceless stops were measured, and the following phonological context was found to influence production time. Measures differed between the first and the second reading with disguise, indicating that speakers had difficulty maintaining the phonatory setting. Although measures differed between speech with and without disguise, the correlation between them was strong. Occlusion time appeared less susceptible to disguise for the syllables [pi, pu, te, tɛ], while release time showed greater susceptibility, except in [pi, te]. The results allow some of the analysed segments to be considered traces of authorship within a body of evidence.
|
15 |
Využití dlouhodobé formantové distribuce pro rozpoznatelnost mluvčího v různých akustických podmínkách / Using long-term formant distributions for speaker identification in various acoustic conditions
Lazárková, Dita, January 2015
The analysis of long-time formant (LTF) distributions is a relatively young but promising approach to speaker identification. It is a method of mapping the long-term behaviour of formants in the speech of individual speakers. Problems frequently encountered in practice are poor acoustic quality and very short duration of the analysed recordings. This work aims to present the historical development of forensic phonetics and the methods currently in use. The practical part deals with the usability of the LTF method in forensic practice, especially for recordings containing background noise. It was shown that noise appreciably affects the extracted LTF values and that, unfortunately, the change is not systematic. We therefore proposed several methods to compensate for the noise, so that recordings with and without noise can be compared. We also investigated the minimum recording duration necessary for the resulting values to be statistically reliable. This boundary is not exact and varies substantially from speaker to speaker. It is nevertheless apparent that recordings (vocalic streams) shorter than 15 s often provide incomplete information and therefore cannot be recommended for analysis. Keywords: LTF, long-time formant distribution, speaker identification, forensic phonetics, acoustic quality of...
|
16 |
O efeito do telefone celular no sinal da fala : uma análise fonético-acústica com implicações para a verificação de locutor em português brasileiro / The mobile phone effect over the speech signal : an acoustic-phonetic analysis with implications for speaker verification in Brazilian Portuguese
Passetti, Renata Regina, 1981-, 27 August 2018
Advisor: Plínio Almeida Barbosa / Master's dissertation - Universidade Estadual de Campinas, Instituto de Estudos da Linguagem
Previous issue date: 2015 / Abstract: This dissertation evaluates the effects of mobile phone transmission on the speech signal. It seeks to determine the degree of intra-speaker acoustic-phonetic modification that the band-pass filter of the telephone channel causes in a speaker's habitual voice, and the effects of telephone transmission on the Brazilian Portuguese oral vowels, by studying the acoustic parameters affected by this kind of transmission. The analyses investigated which acoustic cues are modified and which remain unaltered when speakers use a mobile phone, in comparison with direct recordings. The corpus consists of recordings of 10 male speakers made simultaneously in two conditions, via mobile phone and directly, by placing a microphone in front of the subjects while they spoke on the phone. The Brazilian Portuguese oral vowels were transcribed and segmented, and the ForensicDataEvaluator script (named ForensicDataTrecking in the Portuguese abstract) was used to extract the following acoustic parameters automatically: frequencies of the first three formants (F1, F2 and F3), median fundamental frequency (F0), spectral emphasis, fundamental frequency baseline and F0 inter-peak duration. The acoustic analyses investigated the effects of telephone transmission on the Brazilian Portuguese oral vowels, on the speakers and on the speakers' vowel space. The analyses were validated statistically. For the oral vowels, the results show changes of approximately 14% in the frequencies of the first and third formants in the telephone condition.
For the second formant, the scatter analysis showed that mobile phone band-pass filtering raises F2 in vowels with low F2 values and lowers it in vowels with high F2 values. Of the acoustic parameters investigated in the analysis of the effects on speakers, only the fundamental frequency baseline and F0 inter-peak duration showed no statistically significant difference between the two recording conditions. They are therefore robust to telephone transmission and can be considered effective parameters for forensic analysis. This analysis also revealed that telephone transmission affects speakers in different ways, which allowed them to be grouped differently depending on the parameter analysed.
The analysis of the telephone effect on the speakers' vowel space complemented the previous analyses. In general, the increase in F1 values in the mobile phone condition caused a global lowering of the vowel space. The decrease in F2 values for front vowels and the increase in F2 values for back vowels compressed the vowel space of most subjects. This rearrangement of the vowels has perceptual implications: the lowering and reduction of the vowel space placed the vowels closer to its central region, so that they may sound more open over the mobile phone / Master's / Linguistics / Master in Linguistics
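The vowel-space compression described in this abstract can be pictured with a small calculation. The Python sketch below is an assumption-laden illustration, not the dissertation's analysis: it measures the area of a triangle of corner vowels in the (F2, F1) plane with the shoelace formula, then applies a toy "telephone" transform that pulls F2 towards a centre value and raises F1. All formant values and transform parameters are invented for the example, and the function names are hypothetical.

```python
def polygon_area(points):
    """Shoelace formula; `points` are (x, y) vertices listed in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def simulate_phone(f2, f1, f2_centre=1400.0, f2_shrink=0.8, f1_gain=1.1):
    """Toy model (invented parameters): F2 moves towards a centre, F1 rises."""
    return (f2_centre + f2_shrink * (f2 - f2_centre), f1 * f1_gain)

# Invented (F2, F1) values in Hz for three corner vowels; not measured data.
direct = [(2200.0, 300.0), (1300.0, 750.0), (800.0, 320.0)]
phone = [simulate_phone(f2, f1) for f2, f1 in direct]

area_direct = polygon_area(direct)
area_phone = polygon_area(phone)
print(area_direct, area_phone, area_phone / area_direct)
```

Because the toy transform scales F2 by 0.8 and F1 by 1.1, the area shrinks by their product; with real data the change would have to be computed per speaker from measured formants.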
|
17 |
Método para reconhecimento de vogais e extração de parâmetros acústicos para analises forenses / Method for recognition of vowels and extraction of acoustic parameters for forensic analysis
Dresch, Andrea Alves Guimarães, 14 December 2015
Forensic speaker comparison examinations are complex and time-consuming when carried out manually. This work proposes a method for the automatic recognition of vowels, with feature extraction for acoustic analysis, intended as a support tool for these examinations. The proposal is based on formant measurement by LPC (Linear Predictive Coding), with frames selected by fundamental frequency detection, zero-crossing rate, bandwidth and continuity, and with the samples clustered by the k-means method. Experiments with samples from three different databases gave promising results: the regions corresponding to five of the Brazilian Portuguese vowels were located, which allows visualization of the behaviour of a speaker's vocal tract as well as detection of the segments corresponding to target vowels.
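The clustering step named in this abstract can be sketched as follows. This is a minimal Python illustration of plain k-means (Lloyd's algorithm) on (F1, F2) pairs, not the author's implementation. The LPC formant measurement and the frame-selection criteria are omitted, the formant values are invented, only two clusters are used for brevity (the thesis locates five vowels), and the initial centroids are supplied by hand so that the result is deterministic.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Invented (F1, F2) frames in Hz: three /i/-like and three /a/-like frames.
frames = [
    (290.0, 2250.0), (310.0, 2180.0), (305.0, 2210.0),
    (740.0, 1320.0), (760.0, 1280.0), (755.0, 1310.0),
]
centroids, labels = kmeans(frames, [frames[0], frames[3]])
print(labels)
```

In practice the initial centroids would be chosen randomly or by a seeding heuristic, and the two formant axes might need rescaling so that neither dominates the distance.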
|
19 |
Traitement neuronal des voix et familiarité : entre reconnaissance et identification du locuteur / Neural processing of voices and familiarity: between speaker recognition and identification
Plante-Hébert, Julien, 12 1900
The human ability to recognize and identify many individuals by their voices alone is unique and can be critical in criminal investigations. However, the lack of knowledge about how this capacity works casts a shadow over its applications in forensic phonetics. The main objective of this thesis is to characterize the processing of voices in the human brain and the parameters that influence it.
In a first experiment, event-related potentials (ERPs) were used to establish that intimately familiar voices are processed differently from unknown voices, even when the latter are frequently repeated. This experiment also served to define more clearly the notions of speaker recognition and speaker identification and the ERP components associated with them (the P2 and the LPC, respectively). An essential contrast between the recognition of intimately familiar voices (P2) and that of unknown but previously heard voices (N250) was also observed.
In addition to providing much-needed terminological clarification, this first study is the first to distinguish unambiguously between speaker recognition and identification in terms of ERPs. This is a major contribution, especially for applications of voice processing in forensic phonetics.
A second experiment focused on the effects of learning modality on the later identification of learned voices. ERPs to trained voices were analysed along with behavioural speaker identification responses following a learning phase in which participants were trained on voices in three modalities: audio only, audiovisual and audiovisual interactive.
Although the ERP responses to the trained voices showed effects on the same components (P2 and LPC) across the three training conditions, the magnitude of these responses varied. The analysis of these components revealed a face overshadowing effect (FOE), in which encoding of voice information is impaired. This well-documented effect appeared as a smaller LPC for voices learned in the audiovisual condition than for voices learned in the audio-only condition. Simulated interaction during learning, however, produced a larger LPC response than the passive audiovisual condition, which appeared to reduce the FOE.
Overall, the data from both experiments are congruent and indicate that the P2 and the LPC are reliable electrophysiological markers of speaker recognition and identification. The implications of these findings for current voice processing models and for the field of forensic phonetics are discussed.
|