This thesis addresses learning through social interaction. It consists of experiments that focus on word acquisition through imitation, and a formalism that aims to provide stronger theoretical foundations. The formalism is designed to encompass essentially any situation where a learner tries to figure out what a teacher wants it to do, by interaction or observation. It groups learners that interpret a broad range of information sources under the same theoretical framework. A teacher's demonstration, the teacher's eye gaze during a reproduction attempt, and a teacher's spoken comment are all treated as the same type of information source. They can all tell the imitator what the demonstrator wants it to do, and they all need to be interpreted in some way. By including them all under the same framework, the formalism can describe any agent that is trying to figure out what a human wants it to do. This allows us to see parallels between existing lines of research, and it provides a framing that makes new avenues of research visible. The concept of informed preferences is introduced to deal with cases such as "the teacher would like the learner to perform an action, but if the teacher knew the consequences of that action, it would prefer another action" or "the teacher is very happy with the end result after the learner has cleaned the apartment, but if the teacher knew that the cleaning produced a lot of noise that disturbed the neighbors, it would not like the cleaning strategy". The success of a learner is judged according to the informed teacher's opinion of what would be best for the uninformed version. A series of simplified setups is also introduced, showing how a toy world setup can be reduced to a crisply defined inference problem with a mathematically defined success criterion (any learner architecture-setup pair has a numerical success value). An example experiment is presented where a learner concurrently estimates the task and what the evaluative comments of a teacher mean. 
This experiment shows how the idea of learning to interpret information sources can be used in practice. The first of the learning from demonstration experiments investigates a learner, specifically an imitator, that can learn an unknown number of tasks from unlabeled demonstrations. The imitator has access to a set of demonstrations, but it must infer the number of tasks and determine which demonstration corresponds to which task (there are no symbols or labels attached to the demonstrations). The demonstrator is attempting to teach the imitator a rule where the task to perform depends on the 2D position of an object. The object's 2D position is set at a random location within one of four different, well-separated rectangles, and the location indicates which task should be performed. Three different coordinate systems were available, and each task was defined in one of them (for example "move the hand to the object and then draw a circle around it"). To deal with this setup, a local version of Gaussian Mixture Regression (GMR), called Incremental Local Online Gaussian Mixture Regression (ILO-GMR), was used. A small, fixed number of Gaussians is fitted to local data, the fit informs the policy, and new local points are then gathered. Three other experiments extend the types of contexts to include the actions of another human, making the investigation of language learning possible (a word is learnt by imitating how the demonstrator responds to someone uttering the word). The robot is presented with a setup containing two humans: a demonstrator (who performs hand movements) and an interactant (who might perform some form of communicative act). The interactant's behavior is treated as part of the context, and the demonstrator's behavior is assumed to be an appropriate response to this extended context. 
Two experiments explore the simultaneous learning of linguistic and non-linguistic tasks (one demonstration could show the appropriate response to an interactant's speech utterance, and another demonstration could show the appropriate response to an object position). The imitator is not given access to any symbolic information about which word or hand sign was used, and must infer how many words were spoken, how many times linguistic information was present, and which demonstrations were responses to which word. Another experiment explores more advanced types of linguistic conventions and demonstrator actions (simple word order grammar in the interactant's communicative acts, and the imitation of internal cognitive operations performed by the demonstrator as a response). Since a single general imitation learning mechanism can deal with the acquisition of all the different types of tasks, this opens up the possibility that there might not be a need for a separate language acquisition system. Being able to learn a language is certainly very useful when growing up in a linguistic community, but this selection pressure cannot be used to explain how the linguistic community arose in the first place. It will be argued that a general imitation learning mechanism is both useful in the absence of language, and will result in language given certain conditions such as shared intentionality and the ability to infer the intentions and mental states of others (all of which can be useful to develop in the absence of language). It will also be argued that the general tendency to adopt normative rules is a central ingredient for language (not sufficient, and not necessary while adopting an already established language, but certainly very conducive for a community establishing linguistic conventions).
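The local regression step mentioned in the abstract (fit a small, fixed number of Gaussians to data near the current context, let the fit inform the policy, then gather new points) can be illustrated with a minimal sketch. This is not the ILO-GMR implementation from the thesis. It assumes a one-dimensional input and output, a brute-force nearest-neighbor search, and a hand-written EM loop, and the names `fit_gmm`, `gmr_predict`, `local_gmr`, `n_local` and `reg` are hypothetical.

```python
import math

def gauss1d(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gauss2d(x, y, c):
    """Density of a joint 2-D Gaussian over (input x, output y)."""
    det = c["sxx"] * c["syy"] - c["sxy"] ** 2
    dx, dy = x - c["mx"], y - c["my"]
    q = (c["syy"] * dx * dx - 2 * c["sxy"] * dx * dy + c["sxx"] * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

def weighted_moments(pts, w, n, reg):
    """Weighted mean and covariance of (x, y) points; reg keeps variances positive."""
    tot = max(sum(w), 1e-12)
    mx = sum(wi * x for wi, (x, _) in zip(w, pts)) / tot
    my = sum(wi * y for wi, (_, y) in zip(w, pts)) / tot
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, pts)) / tot + reg
    syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, pts)) / tot + reg
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, pts)) / tot
    return {"w": tot / n, "mx": mx, "my": my, "sxx": sxx, "syy": syy, "sxy": sxy}

def fit_gmm(points, k=2, iters=25, reg=1e-6):
    """Fit k joint Gaussians with EM (assumed setup, deterministic initialization)."""
    pts = sorted(points)
    n = len(pts)
    # Initialize each component from a contiguous chunk of the x-sorted points.
    comps = []
    for j in range(k):
        chunk = pts[j * n // k:(j + 1) * n // k]
        comps.append(weighted_moments(chunk, [1.0] * len(chunk), n, reg))
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x, y in pts:
            p = [c["w"] * gauss2d(x, y, c) for c in comps]
            s = sum(p)
            resp.append([v / s for v in p] if s > 0 else [1.0 / k] * k)
        # M-step: re-estimate weights, means and covariances.
        comps = [weighted_moments(pts, [r[j] for r in resp], n, reg) for j in range(k)]
    return comps

def gmr_predict(comps, x):
    """Gaussian Mixture Regression: blend the per-component conditional means E[y | x]."""
    h = [c["w"] * gauss1d(x, c["mx"], c["sxx"]) for c in comps]
    s = sum(h)
    if s <= 0:
        h, s = [1.0] * len(comps), float(len(comps))
    return sum(hj / s * (c["my"] + c["sxy"] / c["sxx"] * (x - c["mx"]))
               for hj, c in zip(h, comps))

def local_gmr(data, x_query, n_local=20, k=2):
    """Fit a small mixture to the n_local points nearest the query, then regress."""
    local = sorted(data, key=lambda p: abs(p[0] - x_query))[:n_local]
    return gmr_predict(fit_gmm(local, k), x_query)

# Toy data (an assumption for illustration): outputs follow y = 2x + 1.
data = [(i / 100, 2 * (i / 100) + 1) for i in range(101)]
y_hat = local_gmr(data, 0.5)
data.append((0.5, 2.0))  # a new local point is gathered; the next query refits locally
```

Because the mixture is refitted on nearby points at every query, adding a new data point only requires appending it to `data`. No global model has to be retrained, which is the property the "incremental" and "online" labels suggest.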
Identifier | oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00937615 |
Date | 10 December 2013 |
Creators | Cederborg, Thomas |
Publisher | Université Sciences et Technologies - Bordeaux I |
Source Sets | CCSD theses-EN-ligne, France |
Language | French |
Detected Language | English |
Type | PhD thesis |