1 |
Systematics of the Southern African larks (Alaudidae) : syringeal and vocalisation perspective
Nthangeni, Aluwani January 2021 (has links)
Thesis (M. Sc. (Zoology)) -- University of Limpopo, 2021 / The larks (Passeriformes, Passeri, Alaudidae) are small to medium-sized (10-23 cm) birds that are primarily terrestrial and cryptically plumaged, and hence difficult to encounter and recognise. The current taxonomic circumscription places these birds in a group comprising 21 genera and 98 species, with all the genera occurring in Africa, 13 also in Eurasia, and a single genus in Australia and the Americas. Until Alström et al. (2013), the lark family was distinguished morphologically by two unique and primitive features: i) the tarsus morphology (latiplantar and scutellate), with a flat posterior surface covered with prominent scales instead of being narrow and smooth as in other families, and ii) the syrinx (the voice-generating organ).
Although the structure of the lark syrinx has been studied, the literature reveals confusion about the presence or absence of the pessulus and about its level of development and size. To date, the work of Alström et al. (2013) remains the most comprehensive multi-locus phylogeny of the larks, in which three strongly supported major clades (clade A – hereafter the Alaudid, clade B – the Mirafrid, clade C – the Ammomanid) emerged, though with some uncertainty in parts of the tree. The aim of this study was to investigate the utility of syringeal and vocal characters in classifying the species of larks, to find out how syringeal and vocal characters evolved, and to identify characters that define clades. The gross morphology and histology of the syringes and the song strophes of larks and their putative outgroups were studied.
Gross morphologically and histologically, the larks were found to possess a typical syrinx classified as a ‘syrinx tracheo-bronchialis’, and a pessulus was observed in the larks and the outgroups studied. Differences in gross syringeal structure were observed across all three major clades (A, B and C), with regard to the presence or absence of the divided or double bronchial rings variably observed in clades A, B and C. In clades B and C, ossification is variably restricted to the centre of the bronchial rings, forming a serial pattern, while in clade A the bronchial rings are variably almost fully ossified without forming any serial pattern. A prominent oblique muscle-like structure running ventrally was observed only in clade C, in Chersomanes albofasciata. The syringeal histology, on the other hand, revealed differences in the shape of the pessulus (blunt, pointy or sharp), its position relative to bronchial rings 1, 2 and 3 (B1, B2 and B3 respectively), the length of the internal tympaniform membranes, and the connective tissue along the internal tympaniform membrane. The position of the pessulus was variably found to align with B2, to lie below B2, or to be positioned beyond B2. A one-way ANOVA clearly showed that, among the three clades (A, B and C) identified in Alström et al. (2013), a statistically highly significant difference (P < 0.01) exists between the song strophes of species in clades C and A. The species in clade A generally give song strophes defined by high maximum frequency, high peak frequency and broad frequency bandwidth. The species in clade B show a similar trend to those in clade A, possibly explaining the overlap between these clades and the statistically significant difference between clades A and C. These findings may support the phylogenetic findings of Alström et al. (2013) and of this study, wherein clades A and B share a sister relationship while clade C is placed basally. Clade C, on the other hand, comprises song strophes defined by low maximum frequency, lower peak frequency and narrow frequency bandwidth, and this clade differed significantly from clade A. Although not all of the species could be correctly classified to their respective clades based on the Discriminant Function Analysis partition plot, the largest proportion of correct classifications was for clade A (70%). In addition, a distinction among the clades was also observed in the presence or absence of wing clappings in the song strophes, these being either detached from or attached to the strophes. Clade B is the only one marked by the presence of wing clappings, particularly in the genus Mirafra, although they are also reported in Chersophilus duponti, which belongs to clade A but was not included in this study. With regard to the vocal phylogeny, the topology was highly unresolved and no relationships could be inferred. Tracing the evolution of eight vocal and five syringeal characters revealed that, among the 13 characters for which ancestral state reconstructions were performed, 12 are polymorphic; that is, they underwent multiple state changes, ranging from four to 18. Most character states were found to be plesiomorphic, mainly leading to clades whose ancestral nodes were defined largely by autapomorphic and symplesiomorphic states. These do not assist in explaining how the various characters evolved. In conclusion, the findings have shed some light on the general syringeal morphology and histological structure of larks, revealed that lark songs are not suitable for reconstructing the phylogeny, shed light on the evolution of the selected vocal and syringeal characters, and identified characters that define the three major clades of larks (the Alaudid, the Mirafrid and the Ammomanid).
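The clade comparison summarised in this abstract (a one-way ANOVA on song-strophe measurements, followed by discriminant classification) can be sketched in outline. All feature values below are invented for illustration, and a simple nearest-centroid rule stands in for the Discriminant Function Analysis used in the thesis:

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Invented peak-frequency samples (kHz) for species in each clade:
# clade A tends high, clade B overlaps with A, clade C tends low.
clade_a = rng.normal(6.0, 0.8, 20)
clade_b = rng.normal(5.5, 0.9, 20)
clade_c = rng.normal(3.0, 0.7, 20)

# One-way ANOVA across the three clades
f_stat, p_value = f_oneway(clade_a, clade_b, clade_c)
print(f"one-way ANOVA: F = {f_stat:.1f}, p = {p_value:.2e}")

# Nearest-centroid classification as a crude stand-in for DFA
samples = {"A": clade_a, "B": clade_b, "C": clade_c}
centroids = {k: v.mean() for k, v in samples.items()}

def classify(x):
    """Assign a measurement to the clade with the closest centroid."""
    return min(centroids, key=lambda k: abs(x - centroids[k]))

correct = sum(classify(x) == k for k, v in samples.items() for x in v)
rate = correct / 60
print(f"correct classifications: {rate:.0%}")
```

With these made-up distributions the A/B overlap produces most of the misclassifications while clade C separates cleanly, mirroring the pattern the abstract describes.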
|
2 |
Mother-pup recognition behaviour, pup vocal signatures and allosuckling in the New Zealand fur seal, Arctocephalus forsteri
Dowell, Sacha January 2005 (has links)
A recognition system is required between pinniped mothers and pups. For otariids this is especially important since females frequently leave their pups for foraging and must reunite on return. Pups must deal with these fasting periods during maternal absence and consequently may attempt to obtain allomaternal care from unrelated females. This research on the New Zealand fur seal (Arctocephalus forsteri) at Ohau Point, Kaikoura, New Zealand, quantified mother-pup recognition behaviour during reunions, the individuality of pup calls used by mothers to recognise their pup, and the occurrence of allosuckling as a possible recognition error by females and as a strategy employed by pups to gain allomaternal care during their mothers' absence. A combination of behavioural observations, morphometry, VHF radio telemetry, acoustics and DNA genotyping was employed to study these topics. Postpartum interaction behaviours between mothers and pups appeared to facilitate development of an efficient mother-pup recognition system, involving mainly vocal and olfactory cues that were utilised during reunions. Greater selective pressure on pups to reunite resulted in an asymmetry of searching behaviour between females and pups during reunions. The vocalisations of pups were stereotypic, especially in features of the fundamental frequency and the frequency of the lowest harmonic, which are likely to facilitate recognition of a pup by its mother. Pups attempted to steal milk from unrelated females more often during maternal absence and appeared to modify the intra-individual variation pattern of a feature of their vocal signatures over this period, which may assist attempts at allosuckling under nutritional stress. Fostering was demonstrated to occur despite costs to filial pups and possible costs to female reproductive success, and may be attributed to the development of erroneous recognition between females and non-filial pups, or to kin selection. 
This study provides a valuable contribution to the knowledge of recognition systems between pinniped mothers and pups, of alternative pup strategies under nutritional stress and of the rare occurrence of fostering in otariid pinnipeds.
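The notion of a stereotypic pup vocal signature can be illustrated numerically: an individually distinctive call feature shows low within-individual spread relative to between-individual spread. The fundamental-frequency values below are invented, and the simple ratio is only a crude stand-in for the signature analyses used in studies like this one:

```python
import statistics

# Invented fundamental-frequency measurements (Hz) for repeated calls
# of three hypothetical pups.
calls = {
    "pup1": [820, 815, 830, 825],
    "pup2": [640, 655, 648, 652],
    "pup3": [910, 905, 918, 912],
}

# Average spread of repeated calls within each pup
within = statistics.mean(statistics.stdev(v) for v in calls.values())

# Spread of the per-pup mean frequencies across pups
means = [statistics.mean(v) for v in calls.values()]
between = statistics.stdev(means)

# A ratio well above 1 suggests a usable individual vocal signature
print(f"between/within ratio: {between / within:.1f}")
```

A mother searching for her pup benefits exactly when this ratio is large: calls vary little from one rendition to the next but differ strongly between pups.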
|
3 |
Le lien perception-production en voix chantée : place des représentations motrices
Lévêque, Yohana 14 December 2012 (has links)
Un nombre croissant d'études révèle combien les processus cérébraux de production et de perception de l'action sont intriqués. En particulier, on sait maintenant que la perception de la parole induit l'activation de représentations motrices articulatoires chez l'auditeur. Dans ce travail, nous explorons la perception de la voix chantée, une action vocale non-linguistique. L'écoute d'une voix chantée provoque-t-elle une activation du système moteur ? Cette activité motrice est-elle plus forte pour la voix que pour un son musical non-biologique ? Ces questions sont abordées en utilisant de façon complémentaire deux protocoles comportementaux, une technique de lésion virtuelle par stimulation magnétique transcrâniale, l'étude des oscillations en EEG et celle de la variabilité métabolique en IRMf. Nos résultats montrent que la perception d'une voix chantée est effectivement associée à une activité du cortex sensorimoteur dans des tâches de répétition et de discrimination. De façon intéressante, les plus mauvais chanteurs ont montré la plus forte résonance motrice. Le système moteur pourrait, par la génération de modèles internes, faciliter le traitement des stimuli ou la préparation de la réponse vocale quand le traitement acoustique seul est insuffisant. L'ensemble des résultats présentés ici suggère que les interactions audiomotrices en perception de la voix humaine sont modulées par la dimension biologique du son et par le niveau d'expertise vocale des auditeurs. / A growing body of research reveals that action production and action perception interact. In particular, it has been shown that speech perception entails articulatory motor representations in the listener. In the present work, we investigate the perception of a singing voice, a stimulus that is not primarily linked to articulatory processes. Does listening to a singing voice induce activity in the motor system? Is this motor activity stronger for a voice than for a non-biological musical sound? 
Two behavioral tasks, a 'virtual lesion' paradigm using TMS, a study of brain oscillations with EEG, and an fMRI experiment carried out during my PhD have shed some light on these questions. Our results show that the perception of a singing voice is indeed associated with sensorimotor activity in repetition and discrimination tasks. Interestingly, the poorest singers displayed the strongest motor resonance. The motor system could facilitate the processing of the stimuli, or the preparation of the vocal response, through internal model generation when acoustic processing alone is not effective enough. The set of studies presented here thus suggests that audiomotor interactions in human voice perception are modulated by two factors: the biological dimension of the sound and the listener's vocal expertise. These results open new perspectives on our understanding of the auditory-vocal loop in speech and of sound perception in general.
|
4 |
Littérature et musique : Essai poétique d'une prose narrative musicalisée dans Ritournelle de la Faim de Jean-Marie Gustave le Clézio, Tous les matins du monde de Pascal Quignard, Les ruines de Paris de Jacques Réda et Jazz de Toni Morrison / Literature and music : poetic essay on a musicalised narrative prose in Tous les matins du monde by Pascal Quignard, Ritournelle de la Faim by Jean-Marie Gustave Le Clézio, Les ruines de Paris by Jacques Réda and Jazz by Toni Morrison
Ollende-Etsendji, Tracy 28 November 2014 (has links)
Les concepts de littérature et de musique ont toujours été liés depuis l'antiquité grecque. En effet, cette relation est le fruit de plusieurs affinités d'ordre esthétique qui les coordonnent dans la mesure où l’un structure et l’autre matérialise les données à exposer. Autrement dit, la littérature a toujours servi de structure ou de support au XXe siècle pour dire les phénomènes de langage qui incluent la mathématique, l’informatique et bien d’autres domaines. Aussi, l’esthétique musicale comportant essentiellement les codes rythmiques notamment les figures de silence, les codes acoustiques (notes de musique, instruments de musique), la fréquence des notes, leur hauteur et surtout leur amplitude; sous la forme musicale de la structure narrative, c’est-à-dire une articulation sémantique et discursive des mots et expressions qui cachent en réalité les codes de l’esthétique musicale, se révèle une sorte de débordement de la diégèse.... C'est donc ce travail de correspondance entre la discontinuité sémantique et discursive du texte du XXème siècle (contaminé par les codes de musique) et la constitution d'une partition musicale qui nous aidera à mieux cerner le lien entre littérature et musique... Notre étude prendra à cet effet appui sur quatre oeuvres : Tous les matins du monde de Pascal Quignard, Ritournelle de la faim de J.M.G. Le Clézio, Les ruines de Paris de Jacques Réda et Jazz de Toni Morrison.... / Since ancient Greece, the concepts of literature and music have always been linked. Indeed, this relationship is the fruit of several aesthetic affinities that coordinate them, insofar as one structures and the other materialises the data to be expressed. In other words, literature has always served in the twentieth century as a structure or support for expressing language phenomena, including mathematics, computing and many other fields. 
Also, musical aesthetics essentially comprises rhythmic codes (notably the “figures de silence”, or rests), acoustic codes (musical notes, musical instruments), the frequency of notes, and above all their pitch and amplitude; under the musical form of the narrative structure, that is to say a semantic and discursive articulation of words and phrases that actually conceals the codes of musical aesthetics, a kind of overflow of the literary text is revealed..... It is therefore this work of correspondence between the semantic and discursive discontinuity of the twentieth-century text (contaminated by the codes of music) and the constitution of a musical score that will help us better understand the link between literature and music.... Our study draws on four works: Tous les matins du monde by Pascal Quignard, Ritournelle de la Faim by J.M.G. Le Clézio, Les ruines de Paris by Jacques Réda and Jazz by Toni Morrison....
|
6 |
Identity information in bonobo vocal communication : from sender to receiver / L’information “identité individuelle” dans la communication vocale du bonobo : de l’émetteur au récepteur
Keenan, Sumir 14 October 2016 (has links)
L’information "identité individuelle" est essentielle chez les espèces fortement sociales car elle permet la reconnaissance individuelle et la différenciation des partenaires sociaux dans de nombreux contextes tels que les relations de dominance, les relations mère-jeunes, la défense territoriale, ou encore participe à la cohésion et coordination de groupe. Chez de nombreuses espèces, le canal audio est l’une des voies les plus efficaces de communication dans des environnementscomplexes et à longue distance. Les vocalisations sont empreintes de caractéristiques acoustiques propres à la voix de chaque individu. La combinaison entre ces signatures vocales individuelles et la connaissance sociale accumulée sur les congénères peut grandement favoriser la valeur sélective des animaux, en facilitant notamment les prises de décisions sociales les plus adaptées. Le but de ma recherche est d’étudier le codage et décodage de l’information "identité individuelle" du système vocal de communication du bonobo, Pan paniscus. Premièrement, nous avons recherché la stabilité des signatures vocales des cinq types de cris les plus courants du répertoire du bonobo. Nous avons trouvé que, bien que ces cinq types de cris aient le potentiel de coder l’information individuelle, les cris les plus forts émis dans des contextes d’excitation intense et de communication à longue distance ont les signatures vocales individuelles les plus marquées. Deuxièmement, nous avons étudié l’effet de la familiarité sociale et des liens de parenté sur les caractéristiquesacoustiques qui codent l’information individuelle dans un type de cri "bark". Nous avons mis en évidence l’existence d’une forte convergence vocale. Les individus apparentés et familiers, et indépendamment l’un de l’autre, présentent plus desimilarités vocales qu’entre des individus non apparentés et non familiers. 
Enfin, dans une troisième étude, nous avons testé la capacité des bonobos à utiliser l’information "identité individuelle" codée dans les vocalisations pour discriminer la voix d’anciens partenaires sociaux avec qui ils ne vivent plus. Par une série d’expériences de repasse, nous avons démontré que les bonobos étaient capables de reconnaître la voix d’individus familiers sur la seule base de l’acoustique, et cela même après des années de séparation. L’ensemble de ce travail de thèse montre que le codage et décodage de l’information "identité individuelle" chez le bonobo est un système dynamique, sujet à modification avec l’environnement social mais suffisamment fiable pour permettre la reconnaissance individuelle au cours du temps. En conclusion cette étude participe à une meilleure compréhension du système de communication vocale chez un primate non-humain forestier, au réseau social unique et complexe / Identity information is vital for highly social species as it facilitates individual recognition and allows for differentiation between social partners in many contexts, such as dominance hierarchies, territorial defence, mating and parent-offspringidentification and group cohesion and coordination. In many species vocalisations can be the most effective communication channel through complex environments and over long-distances and are encoded with the stable features of an individual’s voice. Associations between these individual vocal signatures and accumulated social knowledge about conspecifics can greatly increase an animal’s fitness, as it facilitates adaptively constructive social decisions. This thesis investigates the encoding and decoding of identity information in the vocal communication system of the bonobo, Pan paniscus. We firstly investigated the stability of vocal signatures across the five most common call types in the bonobo vocal repertoire. 
Results showed that while all call types have the potential to code identity information, loud calls used during times of high arousal and for distance communication have the strongest individual vocal signatures. Following the first study, we investigated whether social familiarity and relatedness affect the acoustic features that code individual information in the bark call type. Overall, we found strong evidence for vocal convergence; specifically, individuals who are related and familiar, independently from one another, are more vocally similar to one another than unrelated and unfamiliar individuals. In a final study we tested whether bonobos are capable of using the encoded identity information to recognise past group members that they no longer live with. Through a series of playback experiments we demonstrated that bonobos are capable of recognising familiar individuals from vocalisations alone, even after years of separation. Collectively, the results of this thesis show that the encoding and decoding of identity information in bonobo vocalisations is a dynamic system, subject to modification through social processes but robust enough to allow for individual recognition over time. In conclusion, these studies contribute to a better understanding of the vocal communication system of a non-human primate species with a unique and complex social network.
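The vocal-convergence result can be illustrated with a toy distance comparison: if related, familiar individuals are more vocally similar, their acoustic feature vectors lie closer together. The individuals and feature values below are entirely hypothetical:

```python
import math

# Invented two-dimensional acoustic feature vectors (e.g. normalised
# pitch and duration measures) for four hypothetical bonobos:
# "a" and "b" are related and familiar, "c" and "d" are neither.
features = {
    "a": (1.20, 3.4), "b": (1.25, 3.5),   # related/familiar pair
    "c": (1.80, 2.1), "d": (0.70, 4.6),   # unrelated, unfamiliar pair
}

def dist(x, y):
    """Euclidean distance between two individuals' feature vectors."""
    return math.dist(features[x], features[y])

print("kin pair distance:      ", round(dist("a", "b"), 3))
print("unrelated pair distance:", round(dist("c", "d"), 3))
# Vocal convergence predicts the kin pair is acoustically closer.
```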
|
7 |
Pant-grunts in wild chimpanzees (Pan troglodytes schweinfurthii) : the vocal development of a social signal
Laporte, Marion N. C. January 2011 (has links)
While the gestural communication of apes is widely recognised as intentional and flexible, their vocal communication is still considered mostly genetically determined and emotionally bound. Trying to limit the direct projection of linguistic concepts, which are far from offering a unified view of what constitutes human language, this thesis presents a detailed description of the usage and development of the pant-grunt vocalisation in the chimpanzees (Pan troglodytes schweinfurthii) of the Budongo forest, Uganda. Pant-grunts are one of the most social vocalisations of the chimpanzee vocal repertoire and are always given by a subordinate individual to a dominant one. The question of how such a signal is used and develops is critical for our understanding of chimpanzee social and vocal complexity from an ontogenetic and phylogenetic perspective. Results suggest that pant-grunt vocalisations can be used in a flexible way, both in their form and in their usage within a social group. More specifically, chimpanzees seemed to take into account the number and identity of surrounding individuals before producing these vocalisations. At the acoustic level, pant-grunts appear to be highly variable vocalisations corresponding to different social situations commonly encountered. Grunts are among the first vocalisations produced by infants, but they are not initially produced in social contexts. Although some modifications of the form and usage of social grunts could not be attributed entirely to maturation, the role of the mother seemed restricted; her direct influence was perhaps more visible in the rhythmic patterns of chorusing events. Taken together, this thesis suggests that chimpanzee vocalisations are more flexible in their usage, production and acquisition than previously thought and might therefore be more similar to gestural communication.
|
8 |
De l'intimité à la complicité : la chanson-action comme organisateur de l'attention chez le bébé de trois à six mois / From intimacy to complicity : how action-songs organize attention in 3- to 6-month-old infants
Delavenne, Anne 17 February 2011 (has links)
L’objectif de cette recherche est d’étudier la coordination temporelle entre l’organisation de la performance maternelle (chant et gestes des mains), les variations de l’attention visuelle et la vocalisation du bébé à 3 mois et à 6 mois au cours de la ‘chanson-action’ française ‘les marionnettes’. Le terme ‘chanson-action’ désigne des routines interactives associant une ‘mise en scène’ des mains (ici de la mère) coordonnée à un chant pour bébé. Le ‘chanter-bébé’ fait partie des premières stimulations musicales du bébé. Des recherches expérimentales ont mis en évidence qu’il possède plusieurs fonctions et qu’en particulier il maintient l’attention du bébé au cours des échanges. Les ‘chansons-actions’ apparaissent dans le cours des interactions précoces un peu avant le milieu de la première année du bébé. Or des expériences ont montré qu’à cet âge précoce le bébé semble déjà capable de partager son attention entre la mère et un autre centre d’intérêt dans certaines conditions, en particulier lorsque la mère manipule un objet familier. Les ‘chansons-actions’ apparaissent alors comme une situation naturelle permettant d’étudier l’aptitude du bébé à partager son attention. Nous avons donc cherché à tester l’hypothèse centrale selon laquelle l’organisation hiérarchisée de la performance maternelle et en particulier du chant devait fournir un cadre organisant les variations dynamiques de l’attention visuelle du bébé. Nous pensions de plus que l’organisation temporelle de ses vocalisations devait refléter sa sensibilité à cette organisation musicale et qu’il devait vocaliser à des moments saillants de la chanson-action. Nous avons choisi d’étudier les échanges de mêmes dyades à 3 mois et à 6 mois car ces âges sont périphériques à l’émergence spontanée des ‘chansons-actions’ dans le répertoire des routines interactives de la mère et du bébé. 
Nous voulions ainsi explorer l’évolution de la coordination entre l’organisation temporelle de la performance maternelle et les variations de l’attention du bébé. Nous avons étudié les échanges de 20 dyades à 3 mois (12 dyades ‘mère-garçon’ et 8 dyades ‘mère-fille’) et de 18 de ces mêmes dyades à 6 mois (10 dyades ‘mère-garçon’ et 8 dyades ‘mère-fille’). L’originalité de notre recherche est d’explorer l’évolution dynamique des variations de l’attention du bébé au cours de la chanson-action... / The aim of this study was to analyze the temporal coordination between the organization of the maternal performance (singing and hand gesturing) and the visual attention and vocalization of the infant at 3 and 6 months during sequences of the French ‘action-song’ ‘les marionnettes’. The word ‘action-song’ refers to interactive routines that combine hand gestures coordinated with a baby song. Infant-directed singing has been shown to be among the first musical stimulations addressed to infants. Experimental studies have demonstrated that it possesses several functions, one of which is maintaining the infant’s attention. Action-songs emerge in interactions at about 4 months. Experimental studies have shown that the infant exhibits the ability to share her attention between the mother and another object of interest, in particular when it is a familiar object that the mother holds in her hands. Action-songs thus appear to be a natural situation that allows us to study this ability. We tested the central hypothesis that the hierarchical levels of the maternal singing performance would provide a frame organizing the dynamic variations of the infant’s visual attention. Moreover, the infant would vocalize at specific moments of the musical structure of the maternal singing. 
We studied the exchanges of the same dyads at 3 and 6 months to explore the developmental trajectory of the coordination between the temporal organization of the maternal performance and the infant’s visual attention. We studied the interactions of 20 dyads at 3 months (12 boys and 8 girls) and of 18 of those dyads at 6 months (10 boys and 8 girls). We performed both video and acoustic microanalyses to study the dynamic variations of the infant's attention during the action-song. Thus, each gaze orientation of the infant was associated with a specific element of the maternal singing. Our results showed that maternal singing was articulated at three hierarchical temporal levels (verse, line, pulse) at both 3 and 6 months. At 3 months the infant’s attention was oriented mainly towards the mother’s face. The variations of the infant’s attention were coordinated with the phrasing of the maternal singing, and the infant reoriented her attention towards the hands just before the end of the verse. Infants’ vocalizations also occurred at the end of the verse. At 6 months, infants were more attentive to the mother’s hands. Six-month-olds reoriented their attention towards the mother’s face at the end of the verse, and their vocalizations were synchronized with the pulse of the maternal singing. Furthermore, our results revealed gender differences at 3 and 6 months: the performance of mothers of boys was more regular than the performance of mothers of girls. We suggest that action-songs provide a frame that scaffolds the ability of the infant to share her attention between the mother’s face and hands.
|
9 |
Using other minds : transparency as a fundamental design consideration for artificial intelligent systems
Wortham, Robert H. January 2018 (has links)
The human cognitive biases that result in anthropomorphism, the moral confusion surrounding the status of robots, and wider societal concerns related to the deployment of artificial intelligence at scale all motivate the study of robot transparency --- the design of robots such that they may be fully understood by humans. Based on the hypothesis that robot transparency leads to better (in the sense of more accurate) mental models of robots, I investigate how humans perceive and understand a robot when they encounter it, both in online video and in direct physical encounters. I also use Amazon Mechanical Turk as a platform to facilitate online experiments with larger population samples. To improve transparency I use a visual real-time transparency tool providing a graphical representation of the internal processing and state of a robot. I also describe and deploy a vocalisation algorithm for transparency. Finally, I modify the form of the robot with a simple bee-like cover, to investigate the effect of appearance on transparency. I find that the addition of a visual or vocalised representation of the internal processing and state of a robot significantly improves the ability of a naive observer to form an accurate model of a robot's capabilities, intentions and purpose. This finding holds across a diverse, international population sample, and is therefore robust for humans in general rather than for one geographic, ethnic or socio-economic group in particular. However, none of the experiments achieved a Mental Model Accuracy (MMA) of more than 59%, indicating that despite improved transparency of the internal state and processing, naive observers' models remain inaccurate, and there is scope for further work. A vocalising, or 'talking', robot greatly increases the confidence of naive observers to report that they understand a robot's behaviour when observed on video. Perhaps we might be more easily deceived by talking robots than silent ones. 
A zoomorphic robot is perceived as more intelligent and more likeable than a very similar mechanomorphic robot, even when the robots exhibit almost identical behaviour. A zoomorphic form may attract closer visual attention, and whilst this results in an improved MMA, it also diverts attention away from transparency measures, reducing their efficacy to further increase MMA. The trivial embellishment of a robot to alter its form has significant effects on our understanding of, and attitude towards, it. Based on the concerns that motivate this work, together with the results of the robot transparency experiments, I argue that we have a moral responsibility to make robots transparent, so as to reveal their true machine nature. I recommend the inclusion of transparency as a fundamental design consideration for intelligent systems, particularly for autonomous robots. This research also includes the design and development of the 'Instinct' reactive planner, developed as a controller for a mobile robot of my own design. Instinct provides facilities to generate a 'transparency feed' --- a real-time trace of internal processing and state. Instinct also controls agents within a simulation environment, the 'Instinct Robot World'. Finally, I show how two instances of Instinct can be used to achieve a second-order control architecture.
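The 'transparency feed' concept can be sketched as a toy reactive controller that logs which behaviour it selects, and from which sensory state, on every cycle. This is an illustration of the idea only; the class, rules and sensor names below are hypothetical, not the actual Instinct planner API:

```python
import time

class ToyReactivePlanner:
    """A priority-ordered rule set that emits a trace of its decisions."""

    def __init__(self):
        # priority-ordered (condition, behaviour) pairs
        self.rules = [
            (lambda s: s["obstacle"], "avoid"),
            (lambda s: s["battery"] < 20, "seek_charger"),
            (lambda s: True, "explore"),          # default behaviour
        ]
        self.feed = []                            # the transparency trace

    def tick(self, sensors):
        # select the first rule whose condition holds, and log why
        for cond, behaviour in self.rules:
            if cond(sensors):
                self.feed.append((time.time(), behaviour, dict(sensors)))
                return behaviour

planner = ToyReactivePlanner()
print(planner.tick({"obstacle": False, "battery": 80}))  # explore
print(planner.tick({"obstacle": True, "battery": 80}))   # avoid
print(planner.tick({"obstacle": False, "battery": 10}))  # seek_charger
```

The point of the feed is that an observer (or a visualisation tool) can replay it to see not just what the robot did, but which sensed condition triggered each behaviour.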
|
10 |
Recognizing emotions in spoken dialogue with acoustic and lexical cues. Tian, Leimin, January 2018 (has links)
Automatic emotion recognition has long been a focus of Affective Computing. It has become increasingly apparent that awareness of human emotions in Human-Computer Interaction (HCI) is crucial for advancing related technologies, such as dialogue systems. However, the performance of current automatic emotion recognition is disappointing compared to human performance. Current research on emotion recognition in spoken dialogue focuses on identifying better feature representations and recognition models from a data-driven point of view. The goal of this thesis is to explore how incorporating prior knowledge of human emotion recognition into the automatic model can improve the state-of-the-art performance of automatic emotion recognition in spoken dialogue. Specifically, we study this by proposing knowledge-inspired features representing occurrences of disfluency and non-verbal vocalisation in speech, and by building a multimodal recognition model that combines acoustic and lexical features in a knowledge-inspired hierarchical structure. In our study, emotions are represented along the Arousal, Expectancy, Power, and Valence emotion dimensions. We build unimodal and multimodal emotion recognition models to study the proposed features and modelling approach, and perform emotion recognition on both spontaneous and acted dialogue. Psycholinguistic studies have suggested that DISfluency and Non-verbal Vocalisation (DIS-NV) in dialogue are related to emotions. However, these affective cues in spoken dialogue are overlooked by current automatic emotion recognition research. Thus, we propose features for recognizing emotions in spoken dialogue which describe five types of DIS-NV in utterances, namely filled pause, filler, stutter, laughter, and audible breath. Our experiments show that this small set of features is predictive of emotions.
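A minimal sketch of how per-utterance features over the five DIS-NV types might be computed from annotated transcripts. The token markup below (e.g. "<laughter>") and the normalisation by utterance length are illustrative assumptions, not the corpus's actual annotation scheme:

```python
from collections import Counter

# The five DIS-NV types named in the abstract.
DIS_NV_TYPES = ("filled_pause", "filler", "stutter", "laughter", "breath")

# Hypothetical mapping from annotated tokens to DIS-NV types.
MARKERS = {
    "uh": "filled_pause", "um": "filled_pause",
    "like": "filler", "you_know": "filler",
    "<stutter>": "stutter", "<laughter>": "laughter", "<breath>": "breath",
}

def dis_nv_features(tokens):
    """Per-utterance DIS-NV occurrence rates, normalised by utterance length."""
    counts = Counter(MARKERS[t] for t in tokens if t in MARKERS)
    n = max(len(tokens), 1)
    return {k: counts.get(k, 0) / n for k in DIS_NV_TYPES}

# One filled pause, one stutter, one laughter in a 6-token utterance.
feats = dis_nv_features(["um", "i", "<stutter>", "th-think", "so", "<laughter>"])
```

Each utterance is thus reduced to a five-dimensional vector, which is the "small set of features" fed into the recognition models.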
Our DIS-NV features achieve better performance than benchmark acoustic and lexical features for recognizing all emotion dimensions in spontaneous dialogue. Consistent with psycholinguistic studies, the DIS-NV features are especially predictive of the Expectancy dimension of emotion, which relates to speaker uncertainty. Our study illustrates the relationship between DIS-NVs and emotions in dialogue, which contributes to the psycholinguistic understanding of these cues as well. Note that our DIS-NV features are based on manual annotations, yet our long-term goal is to apply our emotion recognition model to HCI systems. Thus, we conduct preliminary experiments on automatic detection of DIS-NVs, and on using automatically detected DIS-NV features for emotion recognition. Our results show that DIS-NVs can be automatically detected from speech with stable accuracy, and that auto-detected DIS-NV features remain predictive of emotions in spontaneous dialogue. This suggests that our emotion recognition model can be applied to a fully automatic system in the future, and holds the potential to improve the quality of emotional interaction in current HCI systems. To study the robustness of the DIS-NV features, we conduct cross-corpora experiments on both spontaneous and acted dialogue, and identify how dialogue type influences the performance of DIS-NV features and emotion recognition models. DIS-NVs contain additional information beyond acoustic characteristics or lexical content. Thus, we study the gain of modality fusion for emotion recognition with the DIS-NV features. Previous work combines different feature sets by fusing modalities at the same level, using two types of fusion strategies: Feature-Level (FL) fusion, which concatenates feature sets before recognition, and Decision-Level (DL) fusion, which makes the final decision based on the outputs of all unimodal models. However, features from different modalities may describe data at different time scales or levels of abstraction.
Moreover, Cognitive Science research indicates that when perceiving emotions, humans make use of information from different modalities at different cognitive levels and time steps. Therefore, we propose a HierarchicaL (HL) fusion strategy for multimodal emotion recognition, which places features that describe data over longer time intervals, or that are more abstract, at higher levels of its knowledge-inspired hierarchy. Compared to FL and DL fusion, HL fusion incorporates both inter- and intra-modality differences. Our experiments show that HL fusion consistently outperforms FL and DL fusion on multimodal emotion recognition in both spontaneous and acted dialogue. The HL model combining our DIS-NV features with benchmark acoustic and lexical features improves the current performance of multimodal emotion recognition in spoken dialogue. To study how other emotion-related tasks on spoken dialogue can benefit from the proposed approaches, we apply the DIS-NV features and the HL fusion strategy to recognize movie-induced emotions. Our experiments show that, although designed for recognizing emotions in spoken dialogue, the DIS-NV features and HL fusion remain effective for recognizing movie-induced emotions. This suggests that other emotion-related tasks can also benefit from the proposed features and model structure.
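The contrast between the three fusion strategies can be sketched with stub models. The feature values, dimensions, and the tanh stub below are placeholders for trained regressors, chosen only to make the three data flows concrete:

```python
import math

# Toy per-utterance feature vectors (values and dimensions are illustrative).
acoustic = [0.2, 0.9, 0.4]   # short-time acoustic statistics
lexical  = [0.7, 0.1]        # word-level features
dis_nv   = [0.05, 0.0]       # DIS-NV rates: more abstract, longer time span

def stub_model(x):
    """Stand-in for a trained regressor; maps a feature vector to a score in (-1, 1)."""
    return math.tanh(sum(x))

# Feature-Level (FL) fusion: concatenate all feature sets, train one model.
fl_score = stub_model(acoustic + lexical + dis_nv)

# Decision-Level (DL) fusion: one model per modality, then combine the outputs.
unimodal = [stub_model(acoustic), stub_model(lexical), stub_model(dis_nv)]
dl_score = sum(unimodal) / len(unimodal)

# Hierarchical (HL) fusion: fuse the lower-level modalities first, then feed
# their combined output upward together with the more abstract DIS-NV features.
low_level = stub_model(acoustic + lexical)
hl_score = stub_model([low_level] + dis_nv)
```

The key structural difference is that HL fusion, unlike FL and DL, lets the more abstract DIS-NV features enter only at the top of the hierarchy, after the lower-level acoustic and lexical evidence has been combined.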
|