11

Learning language from ambiguous perceptual context

Chen, David Lieh-Chiang 05 July 2012 (has links)
Building a computer system that can understand human languages has been one of the long-standing goals of artificial intelligence. Currently, most state-of-the-art natural language processing (NLP) systems use statistical machine learning methods to extract linguistic knowledge from large, annotated corpora. However, constructing such corpora can be expensive and time-consuming due to the expertise required to annotate the data. In this thesis, we explore alternative ways of learning that do not rely on direct human supervision. In particular, we draw inspiration from the fact that humans are able to learn language through exposure to linguistic input in the context of a rich, relevant, perceptual environment. We first present a system that learned to sportscast RoboCup simulation games by observing how humans commentate a game. Using the simple assumption that people generally talk about events that have just occurred, we pair each textual comment with the set of events it could be referring to. By applying an EM-like algorithm, the system simultaneously learns a grounded language model and aligns each description to the corresponding event. The system does not use any prior language knowledge and was able to learn to sportscast in both English and Korean. Human evaluations of the generated commentaries indicate that they are of reasonable quality and in some cases even on par with those produced by humans. For the sportscasting task, while each comment could be aligned to one of several events, the level of ambiguity was low enough that we could enumerate all the possible alignments. However, it is not always possible to restrict the set of possible alignments to such a limited number. Thus, we present another system that allows each sentence to be aligned to one of exponentially many connected subgraphs without explicitly enumerating them. The system first learns a lexicon and uses it to prune the nodes in the graph that are unrelated to the words in the sentence. By only observing how humans follow navigation instructions, the system was able to infer the corresponding hidden navigation plans and to parse previously unseen instructions in new environments for both English and Chinese data. With the rise in popularity of crowdsourcing, we also present results on collecting additional training data using Amazon's Mechanical Turk. Since our system only needs supervision in the form of language used in relevant contexts, it is easy for virtually anyone to contribute training data.
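The EM-style alignment at the heart of the sportscasting system can be illustrated with a short sketch: each comment is paired with the set of events that occurred just before it, and the algorithm alternates between estimating which candidate event each comment refers to and re-estimating a simple word-given-event model. The data layout, add-one smoothing, and variable names below are illustrative assumptions, not the thesis's actual implementation.

```python
from collections import defaultdict

def em_align(comments, n_iters=20, smoothing=1.0):
    """comments: list of (words, candidate_event_types) pairs."""
    vocab = {w for words, _ in comments for w in words}
    event_types = {e for _, cands in comments for e in cands}
    # start with uniform word-given-event distributions
    p_word = {e: {w: 1.0 / len(vocab) for w in vocab} for e in event_types}

    for _ in range(n_iters):
        counts = {e: defaultdict(float) for e in event_types}
        for words, cands in comments:
            # E-step: score each candidate event by how well it explains the words
            scores = []
            for e in cands:
                s = 1.0
                for w in words:
                    s *= p_word[e][w]
                scores.append(s)
            total = sum(scores) or 1.0
            # accumulate fractional counts weighted by the alignment posterior
            for e, s in zip(cands, scores):
                for w in words:
                    counts[e][w] += s / total
        # M-step: re-estimate smoothed word-given-event probabilities
        for e in event_types:
            denom = sum(counts[e].values()) + smoothing * len(vocab)
            for w in vocab:
                p_word[e][w] = (counts[e][w] + smoothing) / denom
    return p_word

# toy usage: two commentary lines, each ambiguous between two candidate events
data = [(["purple3", "passes", "to", "purple5"], ["pass", "kick"]),
        (["purple5", "shoots"], ["kick", "pass"])]
model = em_align(data)
```

In the full system the word-given-event model is a grounded language model over structured event representations rather than a bag-of-words table, but the alternation between soft alignment and model re-estimation follows the same pattern.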
12

米飯的感知及其在中文及日文的語言表達 / The Perception of Rice and Its Linguistic Expression in Chinese and Japanese

謝明哲, Hsieh Ming Che Unknown Date (has links)
Senses are undoubtedly important to people because they allow us to experience our world, and we would face difficulties if any of them were absent. However, senses are not equally important in linguistic expression. Expressing odors appears to be difficult in some languages, because people often rely on concrete objects to describe smells, as in cǎo de weìdào 'the smell of grass.' For visual expressions, by contrast, we use color terms rather than concrete objects: when describing the sky, we choose lán 'blue' rather than tiānkōng de yánsè 'the color of the sky.' Is it possible that different languages emphasize different senses? If so, what causes these differences? Observing how people express their feelings toward food is an appropriate way to study sensory expressions, because tasting food involves multiple senses: vision, smell, taste, and mouthfeel. This study collects sensory expressions about food from Chinese and Japanese speakers, including both cooking experts and novices. Because rice serves as the staple food in Chinese and Japanese cultures, it was chosen as the theme of the interviews. If all senses were equally important across cultures, they should be expressed in similar ways. Our results suggest that Chinese speakers mainly emphasize mouthfeel, while Japanese speakers mainly emphasize vision in their sensory expressions. Differences also emerge when comparing experts with novices: experts focus mainly on vision, while novices turn first to taste. Speakers also use different cognitive strategies for different senses, but both experts and novices rely on evaluative expressions. Although the senses are physiologically identical for all people, sensory expressions differ between Chinese and Japanese. This study therefore suggests that cultural and social factors influence the sensory expressions of a language.
13

The role of vowel hyperarticulation in clear speech to foreigners and infants

Kangatharan, Jayanthiny January 2015 (has links)
Research on clear speech has shown that the type of clear speech produced can vary depending on the speaker, the listener and the medium. Although prior research has suggested that clear speech is more intelligible than conversational speech for normal-hearing listeners in noisy environments, it is not known which acoustic features of clear speech are most responsible for enhanced intelligibility and comprehension. This thesis investigated the acoustic characteristics of clear speech produced for foreigners and infants, with the aim of assessing the utility of these features in enhancing speech intelligibility and comprehension. The results of Experiment 1 showed that native speakers produced an exaggerated vowel space in natural interactions with foreign-accented listeners compared to native-accented listeners. The results of Experiment 2 indicated that native speakers exaggerated vowel space and pitch to infants compared to clear read speech. Experiments 3 and 4 focused on speech perception and used transcription and clarity-rating tasks. Experiment 3 used speech directed at foreigners and showed that speech to foreign-accented listeners was rated as clearer than speech to native-accented listeners. Experiment 4 used speech directed at infants and showed that native speakers rated infant-directed speech as clearer than clear read speech. In the fifth and final experiment, naturally elicited clear speech addressed to foreign-accented interlocutors was used in speech comprehension tasks with native and non-native listeners of varying English proficiency. Speech with an expanded vowel space improved listeners' comprehension in both quiet and noise conditions. The results are discussed in terms of Lindblom's (1990) Hyper- and Hypo-articulation (H&H) theory, an influential framework of speech production and perception.
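Vowel-space expansion of the kind measured in Experiments 1 and 2 is commonly quantified as the area of the polygon formed by the corner vowels in F1/F2 space. The sketch below is a minimal illustration of that idea using the shoelace formula; the corner-vowel set, the Hz units and the toy values are assumptions, not the thesis's own measurements or analysis pipeline.

```python
def vowel_space_area(corner_formants):
    """corner_formants: list of (F1, F2) means in Hz, ordered around the polygon."""
    n = len(corner_formants)
    area = 0.0
    for i in range(n):
        f1_a, f2_a = corner_formants[i]
        f1_b, f2_b = corner_formants[(i + 1) % n]
        # shoelace formula, treating (F2, F1) as (x, y) coordinates
        area += f2_a * f1_b - f2_b * f1_a
    return abs(area) / 2.0

# hypothetical corner vowels /i a u/: foreigner-directed vs. native-directed means
fds = [(300, 2400), (750, 1300), (320, 800)]
nds = [(340, 2250), (700, 1350), (360, 900)]
print(vowel_space_area(fds) > vowel_space_area(nds))  # expanded space -> larger area
```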
14

Sound change and social meaning : the perception and production of phonetic change in York, Northern England

Lawrence, Daniel January 2018 (has links)
This thesis investigates the relationship between social meaning and linguistic change. An important observation about spoken languages is that they are constantly changing: the way we speak differs from generation to generation. A second important observation is that spoken utterances convey social as well as denotational meaning: the way we speak communicates something about who we are. How, if at all, are these two characteristics of spoken languages related? Many sociolinguistic studies have argued that the social meaning of linguistic features is central to explaining the spread of linguistic innovations. A novel form might be heard as more prestigious than the older form, or it may become associated with specific social stereotypes relevant to the community in which the change occurs. It is argued that this association between a linguistic variant and social meaning leads speakers to adopt or reject the innovation, inhibiting or facilitating the spread of the change. In contrast, a number of scholars have argued that social meaning is epiphenomenal to many linguistic changes, which are instead driven by an automatic process of convergence in face-to-face interaction. The issue such arguments raise is that many studies proposing a role for social meaning in the spread of linguistic innovations rely on production data as their primary source of evidence. Observing the variable adoption of innovations across different groups of speakers (e.g. by gender, ethnicity, or socioeconomic status), a researcher might draw on their knowledge of the social history of the community under study to infer the role of social meaning in that change. In many cases, however, the observed patterns of variation could equally be explained by the social structure of the community, which constrains who speaks to whom. Are linguistic changes facilitated and inhibited by social meaning? Or does social meaning rather arise as a consequence of linguistic change, without necessarily influencing the change itself? This thesis explores these questions through a study of vocalic change in York, Northern England, focusing on the fronting and diphthongization of the tense back vowels /u/ and /o/. It presents a systematic comparison of the social meanings listeners assign to innovations (captured using perceptual methods), their social attitudes with regard to those meanings (captured through sociolinguistic interviews), and their use of those forms in production (captured through acoustic analysis). It is argued that evidence of a consistent relationship between these factors would support the proposal that social meaning plays a role in linguistic change. The results of this combined analysis of sociolinguistic perception, social attitudes and speech production provide clear evidence of diachronic /u/ and /o/ fronting in this community, and show that variation in these two vowels is associated with a range of social meanings in perception. These meanings are underpinned by the notion of 'Broad Yorkshire' speech, a socially recognized speech register linked to notions of authentic local identity and social class. Monophthongal /o/, diphthongal /u/, and back variants of both vowels are shown to be associated with this register, implying that a speaker who adopts an innovative form will likely be heard as less 'Broad'.
However, there is no clear evidence that speakers' attitudes toward regional identity or social class influence their adoption of innovations, nor that their ability to recognise the social meaning of fronting in perception is related to their production behaviour. The fronting of /u/ is spreading in a socially uniform manner in production, unaffected by any social factor tested except age. The fronting of /o/ is conditioned by social network structure: speakers with more diverse social networks are more likely to adopt the innovative form, while speakers with closer social ties to York are more likely to retain a back variant. These findings demonstrate that York speakers hear back forms of /u/ and /o/ as more 'local' and 'working class' than fronter realizations, and express strong attitudes toward the values and practices associated with regional identity and social class. However, these factors do not appear to influence their adoption of linguistic innovations in any straightforward manner, contrasting with the predictions of an account of linguistic change in which social meaning plays a central role in facilitating or inhibiting the propagation of innovations. On the basis of these results, the thesis argues that many linguistic changes may spread through the production patterns of a speech community without the direct influence of social meaning, and it advocates the combined analysis of sociolinguistic perception, social attitudes and speech production in future work.
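As a rough illustration of the apparent-time evidence behind the claim of diachronic /u/ fronting, the sketch below regresses speakers' mean normalized F2 for /u/ on year of birth; a positive slope indicates that younger speakers produce fronter (higher-F2) variants. The field names and toy values are hypothetical and do not reproduce the thesis's acoustic analysis.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# hypothetical speakers: (year of birth, mean normalized F2 of /u/)
speakers = [(1940, 1150), (1955, 1230), (1968, 1310), (1982, 1420), (1996, 1510)]
birth_years, f2_means = zip(*speakers)
print(ols_slope(birth_years, f2_means))  # > 0 suggests fronting in apparent time
```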
15

Teaching mobile robots to use spatial words

Dobnik, Simon January 2009 (has links)
The meaning of spatial words can only be evaluated by establishing a reference to the properties of the environment in which the word is used. For example, in order to evaluate what is to the left of something, or how fast is fast in a given context, we need to evaluate properties such as the position of objects in the scene, their typical function and behaviour, the size of the scene, and the perspective from which the scene is viewed. Rather than hand-coding the semantic rules that define spatial expressions, we developed a system in which such rules are learned from descriptions produced by human commentators together with the information a mobile robot has about itself and its environment. We concentrate on two scenarios and the words used in them. In the first scenario, the robot is moving in an enclosed space and the descriptions refer to its motion ('You're going forward slowly' and 'Now you're turning right'). In the second scenario, the robot is static in an enclosed space that contains real-size objects such as desks, chairs and walls. Here we are primarily interested in prepositional phrases that describe relationships between objects ('The chair is to the left of you' and 'The table is further away than the chair'). The perspective can be varied by changing the location of the robot. Following the learning stage, which is performed offline, the system is able to use this domain-specific knowledge to generate new descriptions in new environments, or to 'understand' these expressions by providing feedback to the user, either linguistically or by performing motion actions. If a robot can be taught to 'understand' and use such expressions in a manner that seems natural to a human observer, then we can be reasonably sure that we have captured at least something important about their semantics. Two kinds of evaluation were performed. First, the performance of the machine-learning classifiers was evaluated on independent test sets using 10-fold cross-validation. Classifier performance (accuracy, the Kappa coefficient (κ), ROC and Precision-Recall graphs) is compared across (a) the machine-learning algorithms used to build the classifiers, (b) the conditions under which the learning datasets were created, and (c) the method by which the data was structured into examples or instances for learning. Second, with the additional knowledge required to build a simple dialogue interface, the classifiers were tested live against human evaluators in a new environment. The results show that the system is able to learn the semantics of spatial expressions from low-level robotic data. For example, a group of human evaluators judged that the live system generated a correct description of motion in 93.47% of cases (averaged over four categories) and a correct description of object relations in 59.28% of cases.
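The classifier evaluation described above can be sketched, under stated assumptions, with standard tooling: a single spatial word treated as a binary classification problem over robot-derived geometric features, scored with 10-fold cross-validation on accuracy and Cohen's kappa. The feature set, synthetic labels and choice of a decision tree are illustrative and are not taken from the thesis.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.metrics import make_scorer, cohen_kappa_score

rng = np.random.default_rng(0)

# toy instances: [angle to landmark (degrees), distance (metres)];
# label 1 means the description 'X is to the left of Y' applies
angles = rng.uniform(-180, 180, size=500)
dists = rng.uniform(0.2, 5.0, size=500)
X = np.column_stack([angles, dists])
y = ((angles > 30) & (angles < 150)).astype(int)  # crude geometric rule standing in for human judgements

clf = DecisionTreeClassifier(max_depth=4, random_state=0)
acc_scores = cross_val_score(clf, X, y, cv=10)  # accuracy per fold
kappa_scores = cross_val_score(clf, X, y, cv=10, scoring=make_scorer(cohen_kappa_score))
print(acc_scores.mean(), kappa_scores.mean())
```

In the reported work the instances come from logged robot sensor data paired with human descriptions, and several learning algorithms and instance-construction methods are compared rather than a single classifier.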
