481

Applications of Crossmodal Relationships in Interfaces for Complex Systems: A Study of Temporal Synchrony

Giang, Wayne Chi Wei January 2011 (has links)
Current multimodal interfaces for complex systems, such as those designed using the Ecological Interface Design (EID) methodology, have largely treated each sensory modality either as an independent channel of information or as a way to provide redundant information. However, operationally related information is often presented in different sensory modalities, and very little research has examined how such information can be linked at a perceptual level. When related information is presented through multiple sensory modalities, interface designers need perceptual methods for linking the relevant information together across modalities. This thesis examines one candidate crossmodal perceptual relationship, temporal synchrony, and evaluates whether it is useful in the design of multimodal interfaces for complex systems. Two metrics for evaluating crossmodal perceptual relationships were proposed: resistance to changes in workload, and stream monitoring awareness. Two experiments evaluated these metrics. The first experiment showed that temporal rate synchrony was not resistant to changes in workload, which was manipulated through a secondary visual task. The second experiment showed that participants who used crossmodal temporal rate synchrony to link information in a multimodal interface did not monitor the two presented streams of information any better than participants using equivalent unimodal interfaces. Taken together, these findings suggest that temporal rate synchrony may not be an effective method for linking information across modalities, and that crossmodal perceptual relationships may differ substantially from intra-modal ones.
However, methods for linking information across sensory modalities remain an important goal for interface designers and a key feature of future multimodal interface design for complex systems.
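To make the notion of temporal rate synchrony concrete, here is a minimal sketch (not from the thesis; the function names and the tolerance threshold are hypothetical) of how an interface might decide that a visual and an auditory stream belong together by matching their pulse rates:

```python
def mean_rate(timestamps):
    """Estimate events per second from a sorted list of event times."""
    if len(timestamps) < 2:
        raise ValueError("need at least two events to estimate a rate")
    span = timestamps[-1] - timestamps[0]
    return (len(timestamps) - 1) / span

def rate_synchronous(visual_times, audio_times, tolerance=0.1):
    """Treat two streams as linked when their pulse rates agree within a
    relative tolerance -- a crude stand-in for perceived rate synchrony."""
    rv = mean_rate(visual_times)
    ra = mean_rate(audio_times)
    return abs(rv - ra) / max(rv, ra) <= tolerance

# Two streams pulsing at ~2 Hz are flagged as a crossmodal pair.
visual = [0.0, 0.5, 1.0, 1.5, 2.0]
audio = [0.1, 0.6, 1.1, 1.6, 2.1]
print(rate_synchronous(visual, audio))  # True
```

In practice, perceived synchrony also depends on phase and on the absolute rates involved, so a fixed rate tolerance is only the simplest possible model.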
482

Semantic Video Modeling And Retrieval With Visual, Auditory, Textual Sources

Durak, Nurcan 01 September 2004 (has links) (PDF)
Studies on content-based video indexing and retrieval aim at accessing video content from different aspects more efficiently and effectively. Most studies have concentrated on the visual component of video content in modeling and retrieval. Besides the visual component, much valuable information is carried in the other media components that accompany the pictorial component, such as superimposed text, closed captions, audio, and speech. In this study, the semantic content of video is modeled using visual, auditory, and textual components. In the visual domain, visual events, visual objects, and the spatial characteristics of visual objects are extracted. In the auditory domain, auditory events and auditory objects are extracted. In the textual domain, speech transcripts and visible texts are considered. With the proposed model, users can access video content from different aspects and retrieve desired information more quickly. Besides multimodality, the model is built on semantic hierarchies that enable querying the video content at different semantic levels: sequence-scene hierarchies in the visual domain, background-foreground hierarchies in the auditory domain, and subject hierarchies in the speech domain. The model has been implemented, and multimodal content queries, hierarchical queries, fuzzy spatial queries, fuzzy regional queries, fuzzy spatio-temporal queries, and temporal queries have been applied to video content successfully.
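As an illustration of the kind of temporal query such a model supports, here is a minimal sketch (hypothetical, not the thesis's implementation; the event model and labels are invented for illustration) of asking which auditory events end before a given visual event begins:

```python
from dataclasses import dataclass

@dataclass
class Event:
    label: str       # e.g. "gunshot" (auditory) or "car-chase" (visual)
    modality: str    # "visual", "auditory", or "textual"
    start: float     # seconds from the start of the video
    end: float

def before(events, first_label, second_label):
    """Temporal query: all pairs where an event labeled first_label
    ends before an event labeled second_label begins."""
    firsts = [e for e in events if e.label == first_label]
    seconds = [e for e in events if e.label == second_label]
    return [(a, b) for a in firsts for b in seconds if a.end < b.start]

events = [
    Event("gunshot", "auditory", 12.0, 12.5),
    Event("car-chase", "visual", 20.0, 45.0),
]
print(len(before(events, "gunshot", "car-chase")))  # 1
```

Fuzzy spatial or spatio-temporal queries would layer graded membership functions on top of exact comparisons like the one above, returning a relevance score rather than a yes/no match.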
483

Recipient response behaviour during Japanese storytelling: a combined quantitative/multimodal approach

Walker, Neill Lindsey 11 1900 (has links)
This study explores the role of speaker and listener gaze in the production of recipient responses, often called backchannels or, in Japanese, aizuchi. Using elicited narrative audio/video data, speaker gaze and recipient response behaviours were first analyzed quantitatively. The results showed that the majority of recipient responses are made while the speaker is gazing at the recipient. Next, a qualitative multimodal analysis was performed on a specific type of recipient response that occurred both during and without speaker gaze. The results showed that recipients make good use of the state of the speaker's gaze to regulate the speaker's talk and to negotiate for a pause, a repair, or a turn at talk. These findings suggest that what are currently known as backchannels are only a small part of a much larger sequential multimodal system that is inseparable from the ongoing talk. / Japanese Language and Linguistics
484

Factors influencing academics' development of interactive multimodal technology-mediated distance higher education courses

Birch, Dawn P. January 2008 (has links)
Advances in technology and the continued emergence of the Web as a major source of global information have encouraged tertiary educators to take advantage of this growing array of resources and to move beyond traditional face-to-face and distance education correspondence modes toward rich technology-mediated learning environments. Moreover, ready access to multimedia at the desktop has provided an opportunity for educators to develop flexible, engaging, and interactive learning resources incorporating multimedia and hypermedia. This study investigates the pedagogical, individual, and institutional factors influencing the adoption and integration of educational technology by academics at a regional Australian university for the purpose of developing interactive multimodal technology-mediated distance education courses. These courses include a range of multimodal learning objects and multiple representations of content in order to cater for different learning styles and modal preferences. The findings revealed that a range of pedagogical, individual, and institutional factors influence academics' development of interactive multimodal technology-mediated distance education courses. Implications for distance education providers and individual academics arising from these factors, and subsequent recommendations, are presented.
485

Human coordination of robot teams: an empirical study of multimodal interface design

Cross, E. Vincent. Gilbert, Juan E., January 2009 (has links)
Thesis (Ph. D.)--Auburn University. / Abstract. Includes bibliographical references (p. 86-89).
486

Practical Architectures for Fused Visual and Inertial Mobile Sensing

Jain, Puneet January 2015 (has links)
Crowdsourced live video streaming from users is on the rise. Several factors, such as social networks, streaming applications, smartphones with high-quality cameras, and ubiquitous wireless connectivity, are contributing to this phenomenon. Unlike isolated professional videos, live streams emerge at an unprecedented scale, poorly captured, unorganized, and lacking user context. To utilize the full potential of this medium and enable new services on top of it, the open challenges must be addressed immediately. Smartphones are resource constrained: battery power is limited, bandwidth is scarce, and on-board computing power and storage are insufficient to meet real-time demand. Therefore, mobile cloud computing is cited as an obvious alternative, where the cloud does the heavy lifting for the smartphone. But cloud resources are not cheap, and real-time processing demands more than what the cloud can deliver.

This dissertation argues that throwing cloud resources at these problems and blindly offloading computation, while seemingly necessary, may not be sufficient. Opportunities need to be identified to streamline big-scale problems by leveraging on-device capabilities, thereby making them amenable to a given cloud infrastructure. One of the key opportunities, we find, is the cross-correlation between different streams of information available in the cloud. We observe that inferences on a single information stream may often be difficult, but when viewed in conjunction with other information dimensions, the same problem often becomes tractable. / Dissertation
487

Multisemiotic resources in student assessment : a case study of one module at Stellenbosch University

Du Toit, Tamzin 04 1900 (has links)
Thesis (MA)--Stellenbosch University, 2014. / ENGLISH ABSTRACT: This study investigates multimodal assessment in the South African higher education context. The communication landscape of students is becoming increasingly multimodal, resulting in a shift away from higher education institutions' preferred mode (that is, the written mode). This is partly a result of the digital era in which we live, as the verbal, visual, and audio modes co-exist to make meaning, thereby creating new forms of text (Iedema 2003: 33). Although it is commonly accepted that the communication landscape has changed, higher education institutions still seem to consider written text and written communication the dominant form of meaning-making (Lea 2004: 743). Thus, there is a disparity between the types of literacies with which students arrive at university and the types of literacies they are expected to use there. I argue that this disparity is problematic for education, and maintain that pedagogies must be transformed in order to resolve it. In this way, students will be able to "benefit from learning in ways that allow them to participate fully in public, community, and economic life" (Cazden et al. 1996: 60). Data for this research includes assignments produced by second-year students of Applied English Language Studies, a subject offered by the Department of General Linguistics at Stellenbosch University. These assignments include a multimodal component as well as a formal, written component. Analysis of the assignments revealed that students show great dexterity in their creation of multimodal texts. Apart from their design skills, it was revealed that students have knowledge of a wide variety of social discourses, which is currently mostly ignored in the education context.
Thus, I propose that this knowledge, along with the digital and visual design skills with which students arrive at university, be valorised and utilised as an entry point for the teaching of linguistic literacy. This proposal is partly supported by schema theory, a cognitive theory of learning, which entails that existing knowledge is used as a platform on which to build new knowledge.
488

Metrics to evaluate human teaching engagement from a robot's point of view

Novanda, Ori January 2017 (has links)
This thesis was motivated by a study of how robots can be taught by humans, with an emphasis on allowing persons without programming skills to teach robots. Its focus was to investigate what criteria could or should be used by a robot to evaluate whether a human teacher is (or potentially could be) a good teacher in robot learning by demonstration; in effect, choosing the teacher who can maximize the benefit to the robot in learning by imitation/demonstration. The study approached this topic by taking a technology snapshot in time, to see whether a representative example of research laboratory robot technology is capable of assessing teaching quality. With this snapshot, the study evaluated how humans observe teaching quality, in order to establish measurement metrics that can be transferred as rules or algorithms that are beneficial from a robot's point of view. To evaluate teaching quality, the study looked at the teacher-student relationship from a human-human interaction perspective. Two factors were considered important in defining a good teacher: engagement and immediacy. The literature on the detailed elements of engagement and immediacy was then reviewed, and physical effort was examined as a possible metric for measuring the level of engagement of teachers. An investigatory experiment evaluated which modality participants prefer when teaching a robot that can be taught using voice, gesture demonstration, or physical manipulation. The findings from this experiment suggested that participants had no preference in terms of human effort for completing the task. However, there was a significant difference in enjoyment across input modalities and a marginal difference in the robot's perceived ability to imitate.
A main experiment then studied the detailed elements a robot might use to identify a 'good' teacher. It was conducted in two sub-experiments: the first recorded the teachers' activities, and the second analysed how humans evaluate perceived engagement when assessing another human teaching a robot. The results suggested that in human teaching of a robot (human-robot interaction), the human evaluators also look for immediacy cues familiar from human-human interaction when evaluating engagement.
489

Using multimodal analysis to investigate the role of the interpreter

Bao-Rozee, Jie January 2016 (has links)
Recent research in Interpreting Studies has favoured the argument that, in practice, the interpreter plays an active role, rather than the prescribed role stipulated in professional codes of conduct. Cutting-edge studies utilising multimodal research methods have taken a more comprehensive approach to investigating this argument, searching for evidence of the interpreter's active involvement not only through textual analysis, but also by examining a range of non-verbal communicative means. Studies using multimodal analysis, such as those by Pasquandrea (2011) and Davitti (2012), have succeeded in offering new insights into the interpreter's role in interaction. This research investigates the interpreter's role further through multimodal analysis, focusing on the use of gesture, gaze, and body orientation in interpreter-mediated communication; it also examines the impact of knowledge asymmetry on the interpreter's role. The thesis presents findings from six simulated face-to-face dialogue interpreting cases featuring three different groups of participants and interpreters, representing different interpreting settings (e.g. parent-teacher meeting, business meeting, doctor-patient meeting). By adopting a multimodal approach, the findings of this study (a) contribute to our understanding of the active role of the interpreter in Interpreting Studies by exploring new insights from a multimodal perspective, and (b) offer new empirical findings from interpreter-mediated interactions to the technical analysis of multimodal communication.
490

Active and deep learning for multimedia

Budnik, Mateusz 24 February 2017 (has links)
The main topics of this thesis are the use of active learning methods and deep learning in the context of multimodal document retrieval; the contributions address both. First, an active learning framework was introduced that allows for more efficient annotation of broadcast TV videos thanks to label propagation, the use of multimodal data, and effective selection strategies. Several scenarios and experiments were considered in the context of person identification in videos, including the use of different modalities (such as faces, speech segments, and overlaid text) and different selection strategies. The whole system was additionally validated in a dry run involving real human annotators.

A second major contribution was the investigation and use of deep learning (in particular convolutional neural networks) for video retrieval. A comprehensive study was made using different neural network architectures and training techniques, such as fine-tuning, as well as more classical classifiers such as SVMs. A comparison was made between learned features (the output of neural networks) and engineered features; despite the lower performance of the engineered features, fusing the two types increases overall performance.

Finally, the use of a convolutional neural network for speaker identification from spectrograms was explored. The results were compared to those of other state-of-the-art speaker identification systems, and different fusion approaches were tested. The proposed approach obtained results comparable to some of the other tested systems and offered an increase in performance when fused with the output of the best system.
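As an illustration of the score-level fusion the abstract mentions, here is a minimal sketch (hypothetical, not the thesis's actual systems; the speaker names, scores, and weight are invented) of weighted late fusion of two systems' per-speaker scores:

```python
def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted-sum (late) fusion of two systems' per-speaker scores.
    Speakers missing from one system contribute a score of 0 there."""
    speakers = set(scores_a) | set(scores_b)
    return {s: weight * scores_a.get(s, 0.0) + (1 - weight) * scores_b.get(s, 0.0)
            for s in speakers}

def identify(scores):
    """Pick the speaker with the highest fused score."""
    return max(scores, key=scores.get)

cnn_scores = {"alice": 0.7, "bob": 0.2, "carol": 0.1}   # e.g. a spectrogram CNN
ivec_scores = {"alice": 0.4, "bob": 0.5, "carol": 0.1}  # e.g. a classical baseline
print(identify(fuse_scores(cnn_scores, ivec_scores, weight=0.6)))  # alice
```

The fusion weight would normally be tuned on held-out data; when one system is clearly stronger, as the abstract reports, shifting the weight toward it is the usual choice.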
