Global ETD Search

621	Une approche du patching audio collaboratif : enjeux et développement du collecticiel Kiwi. / An approach of collaborative audio patching : challenges and development of the Kiwi groupware Paris, Eliott 05 December 2018 (has links) Les logiciels de patching audio traditionnels, tels que Max ou Pure Data, sont des environnements qui permettent de concevoir et d’exécuter des traitements sonores en temps réel. Ces logiciels sont mono-utilisateurs, or, dans bien des cas, les utilisateurs ont besoin de travailler en étroite collaboration à l’élaboration ou à l’exécution d’un même traitement. C’est notamment le cas dans un contexte pédagogique ainsi que pour la création musicale collective. Des solutions existent, mais ne conviennent pas forcément à tous les usages. Aussi avons-nous cherché à nous confronter de manière concrète à cette problématique en développant une nouvelle solution de patching audio collaborative, baptisée Kiwi, qui permet l’élaboration d’un même traitement sonore à plusieurs mains de manière distribuée. À travers une étude critique des solutions logicielles existantes nous donnons des clefs de compréhension pour appréhender la conception d’un système multi-utilisateur de ce type. Nous énonçons les principaux verrous que nous avons eu à lever pour rendre cette pratique viable et présentons la solution logicielle. Nous exposons les possibilités offertes par l’application et les choix de mise en œuvre techniques et ergonomiques que nous avons faits pour permettre à plusieurs personnes de coordonner leurs activités au sein d’un espace de travail mis en commun. Nous revenons ensuite sur différents cas d’utilisation de ce collecticiel dans un contexte pédagogique et de création musicale afin d’évaluer la solution proposée. Nous exposons enfin les développements plus récents et ouvrons sur les perspectives futures que cette application nous permet d’envisager. 
/ Traditional audio patching software, such as Max or Pure Data, provides environments for designing and executing sound processing in real time. These programs are single-user, yet in many cases users need to collaborate closely to build or perform the same sound process. This is particularly true in teaching and in collective musical creation. Solutions exist, but they do not necessarily suit every use. We therefore tackled this problem concretely by developing a new collaborative audio patching solution, named Kiwi, which lets several people build the same sound process together in a distributed manner. Through a critical study of existing software solutions, we provide keys to understanding the design of a multi-user system of this type. We set out the main obstacles we had to overcome to make this practice viable and present the software solution. We describe the possibilities offered by the application and the technical and ergonomic implementation choices we made to allow several people to coordinate their activities within a shared workspace. We then review several uses of this groupware in teaching and musical creation in order to evaluate the proposed solution. Finally, we present more recent developments and open up future perspectives for the application. Informatique musicale Patching audio Collecticiel Interfaces multi-utilisateurs Computer music Audio patching Groupware Multi-user interfaces
622	Project Think: Transforming history into new knowledge Young, Susan Heather 01 January 2007 (has links) Project THINK was a classroom project that used instructional multimedia technology tied to the California History/Social Science standards and engaged gifted middle school students in designing standards-based video materials. Instructional systems Design Audio-visual education Media programs (Education) Instructional Media Design
623	Modèles de Mélanges Conjugués pour la Modélisation de la Perception Visuelle et Auditive Khalidov, Vasil 18 October 2010 (has links) (PDF) This thesis addresses the modelling of audio-visual perception with a robotic head. The associated problems are studied, namely audio-visual calibration and the detection, localization and tracking of audio-visual objects. A spatio-temporal approach to calibrating a robotic head is proposed, based on probabilistic multimodal matching of trajectories. The formalism of conjugate mixture models is introduced, together with a family of efficient optimization algorithms for multimodal clustering. One particular member of this family, the conjugate EM algorithm, is improved to obtain attractive theoretical properties. Methods for detecting multimodal objects and for estimating the number of objects are developed, and their theoretical properties are studied. Finally, the proposed multimodal clustering method is combined with strategies for detection and for estimating the number of objects, and with tracking techniques, to perform multimodal tracking of several objects. The performance of the methods is demonstrated on simulated data and on real data from a database of realistic audio-visual scenarios (the CAVA database). conjugate mixture models audio-visual scene analysis audio-visual calibration multimodal object detection multimodal object tracking
624	Transformées redondantes pour la représentation de signaux audio : application au codage et à l'indexation Ravelli, Emmanuel 27 October 2008 (has links) (PDF) This thesis studies new signal representation techniques for audio coding. Existing audio coders are based on a transform (transform coding), on a parametric model (parametric coding), or on a combination of the two (hybrid coding). On the one hand, transform coding achieves transparent quality at high bitrates (e.g. AAC at 64 kbps/channel) but performs poorly at low bitrates. On the other hand, parametric coding and hybrid coding outperform transform coding at low bitrates but do not achieve transparent quality at high bitrates. The new signal representation we propose achieves transparent quality at high bitrates and better performance than transform coding at low bitrates. This representation is based on a redundant set of time-frequency functions formed by a union of several MDCT bases at different scales. The first major contribution of this thesis is a fast and effective algorithm that decomposes a signal over this redundant set of functions. The second major contribution is a set of techniques that code these representations both efficiently and progressively. Finally, the thesis studies the application to audio indexing. We show that using a union of several MDCTs overcomes the limitations of the representations used in transform coders (in particular frequency resolution), which makes effective indexing in the transform domain possible.
signal processing signal representation sparse representations time-frequency transforms audio coding quantization audio indexing classification
625	Feature selection for multimodal acoustic event detection Butko, Taras 08 July 2011 (has links) Acoustic Event Detection / The detection of the Acoustic Events (AEs) naturally produced in a meeting room may help to describe human and social activity. The automatic description of interactions between humans and their environment can be useful for providing implicit assistance to the people inside the room, context-aware and content-aware information requiring a minimum of human attention or interruptions, support for high-level analysis of the underlying acoustic scene, etc. On the other hand, the recent fast growth of available audio or audiovisual content strongly demands tools for analyzing, indexing, searching and retrieving the available documents. Given an audio document, the first processing step is usually audio segmentation (AS), i.e. the partitioning of the input audio stream into acoustically homogeneous regions which are labelled according to a predefined broad set of classes such as speech, music, noise, etc. Acoustic event detection (AED) is the objective of this thesis work. A variety of features coming not only from the audio but also from the video modality is proposed to deal with that detection problem in the meeting-room and broadcast news domains. Two basic detection approaches are investigated in this work: joint segmentation and classification using Hidden Markov Models (HMMs) with Gaussian mixture densities (GMMs), and a detection-by-classification approach using discriminative Support Vector Machines (SVMs). For the first case, a fast one-pass-training feature selection algorithm is developed in this thesis to select, for each AE class, the subset of multimodal features that shows the best detection rate. AED in meeting-room environments aims at processing the signals collected by distant microphones and video cameras in order to obtain the temporal sequence of (possibly overlapped) AEs that have been produced in the room. 
When applied to interactive seminars with a certain degree of spontaneity, the detection of acoustic events from the audio modality alone shows a large number of errors, mostly due to temporal overlaps of sounds. This thesis includes several novelties regarding the task of multimodal AED. Firstly, the use of video features. Since in the video modality the acoustic sources do not overlap (except for occlusions), the proposed features improve AED in recordings of such rather spontaneous scenarios. Secondly, the inclusion of acoustic localization features, which, in combination with the usual spectro-temporal audio features, yields a further improvement in recognition rate. Thirdly, the comparison of feature-level and decision-level fusion strategies for the combination of audio and video modalities. In the latter case, the system output scores are combined using two statistical approaches: weighted arithmetical mean and fuzzy integral. On the other hand, due to the scarcity of annotated multimodal data, and, in particular, of data with temporal sound overlaps, a new multimodal database with a rich variety of meeting-room AEs has been recorded and manually annotated, and it has been made publicly available for research purposes. Acoustic event Audio classification Audio segmentation Feature selection Multimodal Feature extraction Fuzzy integral Online systems Support vector machines Hidden Markov models 531/534
626	Video Segmentation Based On Audio Feature Extraction Atar, Neriman 01 February 2009 (has links) (PDF) This study presents an automatic video segmentation and classification system based on audio features. Video sequences are classified into "speech", "music", "crowd" and "silence" segments. Segments that do not belong to any of these classes are left "unclassified". Silence segments are detected by a simple threshold comparison on the short-time energy of the embedded audio sequence. For "speech", "music" and "crowd" segment detection, a multiclass classification scheme is applied. For this purpose, three audio feature sets were formed: the first consists purely of MPEG-7 audio features, the second of the audio features used in [31], and the third is the combination of these two sets. A histogram comparison method is used to choose the best features. The audio segmentation system was trained and tested with these feature sets. The evaluation results show that Feature Set 3, the combination of the other two feature sets, gives better performance for the audio classification system. The output of the classification system is an XML file containing MPEG-7 audio segment descriptors for the video sequence. An application scenario is given in which the audio segmentation results are combined with visual analysis results to obtain audio-visual video segments.
627	L'utilisation du film dans l'enseignement du français langue étrangère au niveau débutant à l'Université du KwaZulu-Natal, Pietermaritzburg : une étude de cas. Dye, Marie Françoise Ghyslaine. January 2009 (has links) No abstract available. / Thesis (M.A.)-University of KwaZulu-Natal, Pietermaritzburg, 2009. French language--Study and teaching. Audio-lingual method (Language teaching) Theses--French.
628	Structural analysis and segmentation of music signals Ong, Bee Suan 21 February 2007 (has links) Con la reciente explosión cuantitativa de bibliotecas y colecciones de música en formato digital, la descripción del contenido desempeña un papel fundamental para una gestión y búsqueda eficientes de archivos de audio. La presente tesis doctoral pretende hacer un análisis automático de la estructura de piezas musicales a partir del análisis de una grabación, es decir, extraer una descripción estructural a partir de señales musicales polifónicas. En la medida en que la repetición y transformación de la estructura de la música genera una identificación única de una obra musical, extraer automáticamente esta información puede vincular entre sí descripciones de bajo y alto nivel de una señal musical y puede proporcionar al usuario una manera más efectiva de interactuar con un contenido de audio. Para algunas aplicaciones basadas en contenido, encontrar los límites de determinados segmentos de una grabación resulta indispensable. Así pues, también se investiga la segmentación temporal de audio a nivel semántico, al igual que la identificación de extractos representativos de una señal musical que pueda servir como resumen de la misma. Para ello se emplea una técnica de análisis a un nivel de abstracción más elevado que permite obtener una mejor división en segmentos. Tanto desde el punto de vista teórico como práctico, esta investigación no sólo ayuda a incrementar nuestro conocimiento respecto a la estructura musical, sino que también proporciona una ayuda al examen y a la valoración musical. / With the recent explosion in the quantity of digital audio libraries and databases, content descriptions play an important role in efficiently managing and retrieving audio files. This doctoral research aims to discover and extract structural descriptions from polyphonic music signals. 
As the repetition and transformation of musical structure create a unique identity for the music itself, extracting such information can link low-level and higher-level descriptions of a music signal and provide better-quality access and a powerful way of interacting with audio content. Finding appropriate segment boundaries is indispensable in certain content-based applications. Thus, temporal audio segmentation at the semantic level and the identification of representative excerpts from music audio signals are also investigated. We use a higher-level analysis technique to obtain better segment boundaries. From both theoretical and practical points of view, this research not only increases our knowledge of music structure but also facilitates time-saving browsing and assessment of music. audio segmentation music structural analysis music content description identificación de fragmentos musicales segmentación de audio análisis estructural de la música descripción de contenido musical 004 531/534 78
629	Caractérisation de la voix de l'enfant sourd appareillé et implanté cochléaire : approches acoustique et perceptuelle et proposition de modélisation / Characterizing the voice of the fitted and cochlear implanted deaf children : acoustic and perceptive approaches with a view to modelling Guerrero Lopez, Harold Andrés 19 March 2010 (has links) Cette thèse propose une analyse comparative acoustique et perceptive de la voix d’un effectif statistiquement fiable d’enfants sourds appareillés et implantés cochléaires. Peu de paramètres diffèrent de manière significative entre le groupe d’enfants sourds ayant été appareillés et implantés avant l’âge de trois ans, et le groupe d’enfants entendants. L’ensemble des résultats indique que la voix des enfants de notre étude ne présente pas les caractéristiques traditionnellement retenues pour déterminer la voix pathologique. Par ailleurs, les caractéristiques de la voix des enfants implantés cochléaires sont sensiblement comparables à celles des enfants entendants. Fort de ces résultats expérimentaux, nous avons proposé un modèle « vibro-acoustique » de la régulation de la voix des enfants sourds « oralisés », et développé un simulateur numérique de la boucle audio-phonatoire. / This dissertation presents a comparative acoustic and perceptual analysis of the voice in a statistically reliable group of deaf children fitted with hearing aids and cochlear implants. Few parameters differ significantly between the group of deaf children fitted and implanted before the age of three and the group of hearing children. Overall, the results indicate that the voices of the children in our study do not show the characteristics traditionally used to identify a pathological voice. Furthermore, the voice characteristics of children with cochlear implants are broadly comparable to those of hearing children. Building on these experimental results, we propose a "vibro-acoustic" model of voice regulation in "oralized" deaf children and develop a numerical simulator of the audio-phonatory loop. 
Voix Implant cochléaire Enfants Paramètres acoustiques Surdité Modèle Prothèse auditive Boucle audio-phonatoire Voice Children Deafness Hearing aid Cochlear implant Acoustic parameters Model Audio-phonatory loop
630	L'audiovision dans le cinéma d'animation : contribution à la sémiotique / Audio-Vision in Animation : A semiotic approach Conde Aldana, Juan Alberto 16 December 2016 (has links) Le sujet de ce travail est le rôle du son dans le cinéma d’animation, à partir du concept d’audio-vision. Ce concept, provenant du théoricien Michel Chion, exprime une activité ou une modalité de la perception qui s’active – et qui est la face subjective – des produits audiovisuels qui dans le cas de l’animation donnent lieu à des expériences singulières. Le lieu de rencontre entre ces instances est la scène sémiotique d’une expérience de sens où une forme de textualité (résultat d’une écriture sonore et visuelle) inscrite dans un support matériel s’active et rencontre un corps vivant, qui la transforme en un récit audiovisuel, ou bien aussi en une pure expérience corporelle, rythmique ou vibratoire, qui est aussi une forme de sens. Cette scène pratique est comprise d’après le concept de dispositif, provenant de la théorie filmique, mais ici liée à la théorie de l’énonciation. Pour aborder cette scène si complexe, j’ai choisi de mettre l’accent sur un aspect particulier : le son, l’écoute, et les approches sémiotiques qui se sont occupées de cette dimension de l’expérience perceptive, notamment la sémiotique du son. Pour tester cette approche, dans cette thèse on fait des analyses pratiques d’un type de productions animées que, selon le critère, on peut appeler « indépendantes », « expérimentales », ou même « abstraites » : les courts métrages animés de Norman McLaren et de Mirai Mizue. Néanmoins, pour éloignées que semblent ces productions de l’animation traditionnelle on va proposer qu’à la base de toutes ces productions il y a un même dispositif et un même type de pratique (dans le sens sémiotique du terme), un « faire technique » sur lequel on projettera des contenus divers (narratifs, dynamiques, musicaux, énergétiques). 
/ The subject of this work is the role of sound in animation, approached through the concept of audio-vision. This concept, from theorist Michel Chion, expresses an activity or mode of perception that is activated by, and is the subjective side of, audiovisual productions. In the case of animation this activity gives rise to unique experiences. The meeting point between these aspects is the semiotic scene of a meaningful experience, where a form of textuality (the result of aural and visual writing) inscribed in a physical medium is activated by a living body. This experience is interpreted as an audio-visual narrative, or as a purely physical experience of rhythm and vibration, which is also a form of meaning. This practical scene is understood through the concept of "apparatus", coined in film theory but connected here with enunciation theory. To address this complex scene, I chose to focus on a particular aspect: sound and listening, from a semiotic perspective. To test this approach, I carry out practical analyses of a type of animated production that, depending on the criterion, can be called "independent", "experimental", or even "abstract": the animated short films of Norman McLaren and Mirai Mizue. However remote these productions may seem from traditional animation, I propose that they all rest on the same apparatus and the same type of practice (in the semiotic sense of the term), a "technical doing" onto which diverse contents (narrative, dynamic, musical, energetic) are projected. Mirai Mizue Norman McLaren Synchrèse figural Dispositif animatique Audio-vision Mirai Mizue Norman McLaren Figural Synchresis Animatic apparatus Audio-vision 410
