Perceptual normalization of inter- and intra-talker variations in tone categorization (声调感知中话者间及话者内差异的归一化; Sheng diao gan zhi zhong hua zhe jian ji hua zhe nei cha yi de gui yi hua)

How humans achieve constancy in the perception of color, visual objects, and auditory objects despite tremendous variation is a fundamental question in cognitive neuroscience. An important way that the brain tackles variation is by relying on context, which provides a reference for the perception of an object. In speech perception, it is critical to achieve phonetic constancy despite the inter- and intra-talker variation in speech signals. According to the context-dependent normalization mechanism, listeners adapt to a talker’s phonetic space via the context (i.e., the sounds neighboring the speech sound to be recognized). This contextually built phonetic space serves as a reference for compensating for talker variation. In this thesis, I have examined the question of phonetic constancy using lexical tones as a case study.

Firstly, in a cross-linguistic study, I found that the structure of phonological inventories influences the categorization of multi-talker tone stimuli. Mandarin listeners correctly categorized multi-talker stimuli without contexts, whereas Cantonese listeners were misled by acoustic variation between talkers, a difference attributable to the existence of multiple level tones with a similar F0 contour in Cantonese. This finding has implications for understanding the structure of phonological inventories in terms of resistance to talker variability.

Secondly, I found that Cantonese listeners could resolve the ambiguity of level tones by adapting to talker-specific pitch references via a context. Speech and nonspeech contexts carrying the same F0 information contribute unequally to talker adaptation. Nonspeech contexts have a minimal effect, whereas speech contexts, whether meaningful or not, facilitate adaptation, and congruent semantic content further enlarges the facilitatory effect.

As for the neural locus of context-dependent normalization, I found normalization effects in the N400 time window (250-500 ms). This indicates that speech contexts facilitate retrieval of semantic memory by providing talker-specific references for accurately assessing the phonetic property of a word. When implemented in a top-down manner, context-dependent normalization occurs no later than the phonemic level of processing (Phonological Mapping Negativity, 220-350 ms). These EEG studies, though exploratory, are among the first to examine the neural processes of context-dependent normalization.

Thirdly, I proposed a hybrid model of mental representations to reconcile two opposing views. In this model, the lower level holds encountered exemplars of speech sounds from different talkers, and the higher level holds abstract representations that reflect the general similarity of speech sounds across talkers. I found initial evidence for this model, such as a significant correlation between identification accuracy and the typicality of a talker’s pitch range in the population distribution, which suggests that higher-level representations are shaped by the global distribution of talkers’ vocal characteristics in a community. This model needs to be tested further in studies on language learning that examine the dynamic development of talker-specific and abstract representations for new phonological categories.

In conclusion, this thesis has implications for understanding the structure of phonological inventories in the world’s languages; it also sheds light on the mechanisms and neural processes of context-dependent normalization and on the hybrid nature of mental representations. Many unresolved questions remain to be examined in future studies.

Detailed summary in vernacular field only.
Zhang, Caicai.
Thesis (Ph.D.) Chinese University of Hong Kong, 2014.
Includes bibliographical references (leaves 173-194).
Abstracts also in Chinese.

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_1077701
Date: January 2014
Contributors: Zhang, Caicai (author); Wang, William S.-Y., 1933- (thesis advisor); Chinese University of Hong Kong Graduate School, Division of Linguistics (degree granting institution)
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, bibliography
Format: electronic resource, remote, 1 online resource (xvii, 194 leaves) : illustrations (some color), computer, online resource
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
