741 |
Gestural musical interfaces using real time machine learning
Dasari, Sai Sandeep January 1900 (has links)
Master of Science / Department of Computer Science / William H. Hsu / We present gestural music instruments and interfaces that help musicians and audio engineers express themselves efficiently. While we have mastered building a wide variety of physical instruments, the quest for virtual instruments and sound synthesis is on the rise. Virtual instruments are essentially software that enables musicians to interact with a sound module in the computer. Since the invention of MIDI (Musical Instrument Digital Interface), devices and interfaces for interacting with sound modules, such as keyboards, drum machines, joysticks, and mixing and mastering systems, have flooded the music industry.
Research in the past decade has gone a step further, allowing interaction through simple musical gestures to create, shape and arrange music in real time. Machine learning is a powerful tool that can be used to teach such gestures to the interface. The ability to teach innovative gestures and shape the way a sound module behaves unleashes the untapped creativity of an artist. Music and multimedia programming environments such as Max/MSP/Jitter, combined with machine learning techniques, open gateways to embodied musical experiences without physical touch. This master's report presents my research and observations, and discusses how this interdisciplinary field could be used to study wider neuroscience problems such as embodied music cognition and human-computer interaction.
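As a rough illustration of the gesture-training idea described above (a sketch, not code from the report), the snippet below maps gesture feature vectors to synthesizer presets with a k-nearest-neighbour classifier, in the spirit of tools such as Wekinator; the feature layout, preset names, and training data are all hypothetical.

```python
# Hypothetical sketch: train a k-NN model to map gesture features to sound presets.
# Feature vectors (e.g., normalized hand x/y position and speed) and labels are made up.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Each row: [hand_x, hand_y, speed]; each label names a synth preset to trigger.
gesture_features = np.array([
    [0.1, 0.8, 0.2],   # slow gesture, upper-left  -> "pad"
    [0.2, 0.7, 0.1],
    [0.9, 0.2, 0.9],   # fast gesture, lower-right -> "percussive"
    [0.8, 0.1, 0.8],
])
gesture_labels = ["pad", "pad", "percussive", "percussive"]

model = KNeighborsClassifier(n_neighbors=1)
model.fit(gesture_features, gesture_labels)

def classify_gesture(features):
    """Return the preset name for a new gesture; in a real-time patch this would run once per frame."""
    return model.predict(np.asarray(features).reshape(1, -1))[0]

print(classify_gesture([0.85, 0.15, 0.7]))  # -> "percussive"
```

In practice the predicted preset name would be sent to the sound engine (for example, a Max/MSP patch) over OSC or MIDI rather than printed.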
|
742 |
Videogame como linguagem audiovisual: compreensão e aplicação em um estudo de caso - super street fighter
Corrêa, Francisco Tupy Gomes 27 September 2013 (links)
The relationship between the cinematographic and ludic aspects of games goes well beyond the word videogame. The term is very recent in human history, only a few decades old, yet it converges symbols, techniques and practices, shaping these contemporary artifacts and impacting society significantly. It therefore presents itself as an object of study capable of prompting diverse reflections. In this research, we consider that in this expression of the act of playing there is much more game than video. Approaches that focus on narratives and mechanics often leave a gap concerning audiovisual processes. Based on this observation, we carried out a hypothetical-deductive and metaphoric study advocating an integrative view of three distinct methods: Ludology, Narratology and Audiovisual Language. The title chosen for the case study, Super Street Fighter IV, was motivated by its being a popular game, within a genre defined by its gameplay, that has reinvented itself for more than two decades to please its fans. It also dialogues directly with cultural themes (Asian customs and martial arts) and cinematographic themes (fight films and the icon represented by Bruce Lee). The study focused its results on contributing to an understanding of the videogame, so that the research could offer parameters both for deepening the elements present in its language and for developing questions related to its production.
|
743 |
Testing Tradition
Daniel, Boner, East Tennessee State University Bluegrass Band 01 January 2012 (links)
Smithhaven--Still Making Excuses--If Seeing Is Believing--Myth--Your Last Ride--If I Had a Dollar--Tack and Jibe--Stewie Took My Nose--March Home to Me--I Recall--God's Work Is Never Done--Farther Down the Track. / https://dc.etsu.edu/etsu_books/1101/thumbnail.jpg
|
744 |
The Johnson City Sessions 1928-1929: Can You Sing Or Play Old-Time Music?
Olson, Ted 01 January 2013 (links)
The Johnson City Sessions were held in Johnson City, Tennessee in October 1928 and October 1929. This work "...marks the first time these recordings have been assembled in any format. Collectively, these 100 songs and tunes are regarded by scholars and record collectors as a strong and distinctive cross-section of old-time Appalachian music just before the Great Depression. The four CDs gather every surviving recording from the sessions, while the accompanying 136-page LP-sized hardcover book contains newly researched essays on the background to the sessions and on the individual artists, with many rare and hitherto unpublished photographs, as well as complete song lyrics and a detailed discography." -- Back cover.
Ted Olson (East Tennessee State University) and Tony Russell are the re-issue producers. / https://dc.etsu.edu/etsu_books/1112/thumbnail.jpg
|
745 |
Blind Alfred Reed: Appalachian Visionary
Olson, Ted 01 January 2015 (links)
Liner notes by Ted Olson, song lyrics, and discography; produced by Ted Olson.
"In this collection, all of Reed’s songs, both faith-based and secular, recorded for the Victor Talking Machine Company over two sessions in 1927 in Bristol TN and Camden, NJ and two sessions in 1929 in New York City, are on one 22 track CD, complemented by well researched essays by Producer Ted Olson and LOTS of archival photos. Reed played fiddle and sang and on some sessions he was accompanied on guitar by his son Orville. ... Olson has included the younger Reed’s solo recordings." --Steve Ramm Review on Amazon / https://dc.etsu.edu/etsu_books/1117/thumbnail.jpg
|
746 |
The orchestration of modes and EFL audio-visual comprehension: A multimodal discourse analysis of vodcasts
Norte Fernández-Pacheco, Natalia 27 January 2016 (links)
This thesis explores the role of multimodality in language learners’ comprehension and, more specifically, the effects on students’ audio-visual comprehension when different orchestrations of modes appear during the viewing of vodcasts. Firstly, I describe the state of the art of its three main areas of concern, namely the evolution of meaning-making, Information and Communication Technology (ICT), and audio-visual comprehension. One of the most important contributions of the theoretical overview is the suggested integrative model of audio-visual comprehension, which attempts to explain how students process information received from different inputs. Secondly, I present a study based on the following research questions: ‘Which modes are orchestrated throughout the vodcasts?’, ‘Are there any multimodal ensembles that are more beneficial for students’ audio-visual comprehension?’, and ‘What are the students’ attitudes towards audio-visual (e.g., vodcasts) compared to traditional audio (e.g., audio tracks) comprehension activities?’. Along with these research questions, I formulated two hypotheses: audio-visual comprehension improves when there is a greater number of orchestrated modes, and students have a more positive attitude towards vodcasts than towards traditional audio when carrying out comprehension activities. The study includes a multimodal discourse analysis, audio-visual comprehension tests, and student questionnaires. The multimodal discourse analysis of two British Council language-learning vodcasts, entitled English is GREAT and Camden Fashion, using ELAN as the multimodal annotation tool, shows that there is a variety of multimodal ensembles of two, three and four modes. The audio-visual comprehension tests were given to 40 Spanish students of English as a foreign language after they viewed the vodcasts. These comprehension tests contain questions related to specific orchestrations of modes appearing in the vodcasts. The statistical analysis of the test results, using repeated-measures ANOVA, reveals that students obtain better audio-visual comprehension results when the multimodal ensembles are constituted by a greater number of orchestrated modes. Finally, the data compiled from the questionnaires show that students have a more positive attitude towards vodcasts than towards traditional audio listening activities. Results from the audio-visual comprehension tests and questionnaires support the two hypotheses of this study.
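For readers unfamiliar with the analysis named above, the following is a minimal, self-contained sketch of a repeated-measures ANOVA in Python using statsmodels; the scores, subject count, and condition names are invented and are not the thesis data.

```python
# Hypothetical sketch of a repeated-measures ANOVA on comprehension scores.
# Fabricated data: 10 subjects, each tested under three multimodal-ensemble conditions.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
rows = []
for subject in range(10):
    for condition, base in [("two_modes", 5.0), ("three_modes", 6.0), ("four_modes", 7.0)]:
        # Invented scores: a condition effect, a small per-subject offset, and noise.
        rows.append({"subject": subject,
                     "condition": condition,
                     "score": base + 0.1 * subject + rng.normal(scale=0.5)})

data = pd.DataFrame(rows)

# Within-subject factor: the number of orchestrated modes in the ensemble.
result = AnovaRM(data, depvar="score", subject="subject", within=["condition"]).fit()
print(result)
```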
|
747 |
Reflections on the use of a smartphone to facilitate qualitative research in South Africa
Matlala, Sogo France, Matlala, Makoko Neo 10 January 2018 (links)
Journal article published in The Qualitative Report 2018, Volume 23, Number 10, How To Article 2, 2264-2275 / This paper describes the conditions that led to the use of a smartphone to collect qualitative data instead of a digital voice recorder, the standard device for recording interviews. Through a review of technical documents, the paper defines a smartphone and describes its applications that are useful in the research process. It further points out possible uses of other smartphone applications in the research process. The paper concludes that a smartphone is a valuable device for researchers.
|
748 |
No Foolin? Fake News and A.I. Manipulation of Audio, Video, and Images
Tolley, Rebecca 09 February 2019 (links)
No description available.
|
749 |
Audio-tactile displays to improve learnability and perceived urgency of alarming stimuli
Momenipour, Amirmasoud 01 August 2019 (links)
Based on cross-modal learning and multiple resource theory, human performance can be improved by receiving and processing additional streams of information from the environment. In alarm situations, alarm meanings need to be distinguishable from each other and learnable for users. In audible alarms, different signals can be generated by manipulating the temporal characteristics of sounds. However, in some cases, such as discrete medical alarms, when there are too many audible signals to manage, changes in temporal characteristics may not generate discriminable signals that are easy for listeners to learn. Multimodal displays can be developed to generate additional auditory, visual, and tactile stimuli that help humans benefit from cross-modal learning and multiple attentional resources for a better understanding of alarm situations. In work domains where alarms are predominantly auditory and where visual displays cannot be monitored at all times, tactile displays can enhance the effectiveness of alarms by providing additional streams of information for understanding them. However, because of the low information density of tactile presentation, the use of tactile alarms has been limited. In this thesis, the learnability of auditory and tactile alarms, separately and together in an audio-tactile display, was studied with human subjects. The objective of the study was to test cross-modal learning when the messages of an alarm (i.e., meaning and urgency level) were conveyed simultaneously in audible, tactile, and audio-tactile alarm displays. The alarm signals were designed using the spatial characteristics of tactile signals and the temporal characteristics of audible signals, separately in the audible and tactile displays and together in the audio-tactile display. The study explored whether using multimodal (tactile and audible) alarms would help in learning unimodal (audible or tactile) alarm meanings and urgency levels. The findings can inform the design of more efficient discrete audio-tactile alarms that promote the learnability of alarm meanings and urgency levels.
|
750 |
Diarization, Localization and Indexing of Meeting Archives
Vajaria, Himanshu 21 February 2008 (links)
This dissertation documents the research performed on the topics of localization, diarization and indexing in meeting archives. It surveys existing work in these areas, identifies opportunities for improvements and proposes novel solutions for each of these problems. The framework resulting from this dissertation enables various kinds of queries such as identifying the participants of a meeting, finding all meetings for a particular participant, locating a particular individual in the video and finding all instances of speech from a particular individual. Also, since the proposed solutions are computationally efficient, require no training and use little domain knowledge, they can be easily ported to other domains of multimedia analysis.
Speaker diarization involves determining the number of distinct speakers in an audio recording and identifying the durations when each of them spoke. We propose novel solutions for the segmentation and clustering sub-tasks, based on graph spectral clustering. The resulting system yields a diarization error rate of around 20%, a relative improvement of 16% over the currently popular diarization technique, which is based on hierarchical clustering.
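The dissertation's own pipeline is not reproduced here, but a minimal sketch of the general idea, clustering speech segments into speakers from a pairwise affinity matrix with spectral clustering, might look like the following; the segment features, similarity kernel, and speaker count are hypothetical.

```python
# Hypothetical sketch: cluster speech segments into speakers via spectral clustering.
# Segment feature vectors (e.g., averaged MFCCs) are fabricated for illustration.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
speaker_a = rng.normal(loc=0.0, scale=0.3, size=(6, 13))   # 6 segments near one centroid
speaker_b = rng.normal(loc=2.0, scale=0.3, size=(6, 13))   # 6 segments near another
segments = np.vstack([speaker_a, speaker_b])

# Build a pairwise affinity matrix with an RBF kernel over segment distances.
dists = np.linalg.norm(segments[:, None, :] - segments[None, :, :], axis=-1)
affinity = np.exp(-dists ** 2 / (2 * dists.std() ** 2))

clustering = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0)
labels = clustering.fit_predict(affinity)
print(labels)   # segment-to-speaker assignment, e.g. [0 0 0 0 0 0 1 1 1 1 1 1]
```

A full diarization system would additionally estimate the number of speakers and map segment labels back to time spans in the recording.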
The most significant contribution of this work lies in performing speaker localization using only a single camera and a single microphone, by exploiting long-term audio-visual co-occurrence. Our novel computational model allows identifying regions in the image belonging to the speaker even when the speaker's face is non-frontal and even when the speaker is only partially visible. This approach results in a hit ratio of 73.8%, compared to 52.6% for a mutual-information (MI) based approach, which illustrates its suitability for the meeting domain.
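The dissertation's model is more sophisticated than this, but as a simplified illustration of the audio-visual co-occurrence idea, the sketch below correlates an audio energy envelope with per-pixel intensity change over time and flags the most correlated pixels as a candidate speaker region; the frame data, audio envelope, and threshold are synthetic.

```python
# Simplified, hypothetical sketch of audio-visual co-occurrence for speaker localization:
# pixels whose temporal activity correlates with the audio energy envelope are flagged.
import numpy as np

rng = np.random.default_rng(0)
n_frames, height, width = 200, 24, 32

# Synthetic audio energy envelope (e.g., per-frame RMS of the microphone signal).
audio_energy = np.abs(rng.normal(size=n_frames))

# Synthetic video activity: background noise everywhere, plus a "mouth" patch whose
# motion follows the audio envelope.
frames_diff = rng.normal(scale=0.1, size=(n_frames, height, width))
frames_diff[:, 10:14, 20:24] += audio_energy[:, None, None]

# Correlate each pixel's activity with the audio envelope over time.
a = (audio_energy - audio_energy.mean()) / audio_energy.std()
v = frames_diff - frames_diff.mean(axis=0)
v /= v.std(axis=0) + 1e-8
correlation = (a[:, None, None] * v).mean(axis=0)           # shape (height, width)

speaker_mask = correlation > 0.5                             # arbitrary threshold
ys, xs = np.nonzero(speaker_mask)
print("candidate speaker region rows", ys.min(), "-", ys.max(),
      "cols", xs.min(), "-", xs.max())
```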
The third problem addresses indexing meeting archives to enable retrieving, in a query-by-example framework, all segments from the archive during which a particular individual speaks. By performing audio-visual association and clustering, a target cluster is generated for each individual; it contains multiple multimodal samples for that individual, to which a query sample is matched. The use of multiple samples results in a retrieval precision of 92.6% at 90% recall, compared to a precision of 71% at the same recall achieved by a unimodal, single-sample system.
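As a toy illustration of the multiple-samples-per-cluster idea (not the dissertation's system), the sketch below scores a query against each person's cluster by its distance to the nearest sample in that cluster; all feature vectors and person names are fabricated.

```python
# Hypothetical query-by-example sketch: match a query sample against per-person clusters,
# using the distance to the nearest sample in each cluster as the matching score.
import numpy as np

rng = np.random.default_rng(1)
clusters = {
    "person_A": rng.normal(loc=0.0, scale=0.4, size=(5, 16)),  # 5 samples per person
    "person_B": rng.normal(loc=3.0, scale=0.4, size=(5, 16)),
}

def nearest_sample_distance(query, samples):
    """Distance from the query to the closest sample in a cluster."""
    return np.min(np.linalg.norm(samples - query, axis=1))

query = rng.normal(loc=3.0, scale=0.4, size=16)                 # unlabeled query sample
scores = {name: nearest_sample_distance(query, s) for name, s in clusters.items()}
best_match = min(scores, key=scores.get)
print(best_match, scores)                                       # expected: "person_B"
```

Keeping several samples per cluster makes the match robust to within-person variation, which is one reason the multi-sample index outperforms a single-sample one.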
|