Global ETD Search

61	Contributions à la sonification d’image et à la classification de sons [Contributions to image sonification and sound classification] Toffa, Ohini Kafui 11 1900
This thesis has two objectives. The first is to study the problem of image sonification and to solve it through new models of mapping between the visual and sound domains. The second is to study the problem of sound classification and to solve it with methods that have a proven track record in image recognition. Image sonification is the translation of image data (shape, color, texture, objects) into sounds. It is used in visual assistance and image accessibility for visually impaired people. Because of its complexity, an image sonification system that conveys image data as sound in an intuitive way is not easy to design. Our first contribution is a new low-level image sonification system that uses a hierarchical approach based on visual features. It translates most of the properties of an image (color, gradient, edge, texture, region) into the audio domain using musical notes, in a very predictable way that is then easily decoded by a human listener. Our second contribution is a high-level sonification Android application that complements the first contribution because it implements the translation of the objects and semantic content of an image into the audio domain. It also proposes a dataset for image sonification. Finally, in the audio domain, our third contribution generalizes the Local Binary Pattern (LBP) to 1D and combines it with audio features for environmental sound classification. The proposed method outperforms methods that use handcrafted features with classical machine learning algorithms and is faster than any convolutional neural network method. It is a better choice when data are scarce or computing power is minimal.
Visually impaired; sound synthesis; auditory feedback; touch screen; image accessibility; environmental sound classification (ESC); Local Binary Pattern; Local Phase Quantization; machine learning; audio signal spectrogram
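The abstract of entry 61 does not spell out how the Local Binary Pattern is carried over to 1D. As a minimal sketch, assuming one commonly used 1D formulation (each sample is compared with its neighbours on both sides, and the comparison bits form a code), the idea can be illustrated as follows. The function names `lbp_1d` and `lbp_histogram` and the `half_window` parameter are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def lbp_1d(signal, half_window=4):
    """Return one LBP code per sample that has `half_window` neighbours on each side.

    Assumed formulation (not taken from the thesis): a neighbour contributes
    a 1-bit when its value is greater than or equal to the centre sample.
    """
    offsets = [k for k in range(-half_window, half_window + 1) if k != 0]
    codes = []
    for i in range(half_window, len(signal) - half_window):
        center = signal[i]
        code = 0
        for bit, k in enumerate(offsets):
            if signal[i + k] >= center:
                code |= 1 << bit
        codes.append(code)
    return codes


def lbp_histogram(codes, half_window=4):
    """Normalised histogram of LBP codes, usable as a fixed-length feature vector."""
    n_bins = 2 ** (2 * half_window)
    counts = Counter(codes)
    total = len(codes) or 1  # avoid division by zero for very short signals
    return [counts.get(b, 0) / total for b in range(n_bins)]


if __name__ == "__main__":
    toy_signal = [0, 1, 0, 1, 0]
    toy_codes = lbp_1d(toy_signal, half_window=1)
    print(toy_codes)
    print(lbp_histogram(toy_codes, half_window=1))
```

In a sketch like this, the histogram would typically be concatenated with other audio descriptors before being passed to a classical classifier; which descriptors and classifier the thesis uses is not stated in the abstract.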
62	Multimedia Forensics Using Metadata Ziyue Xiang (17989381) 21 February 2024
The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly undetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from video, temporal signals from audio) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques using the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer neural network based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files, with higher localization accuracy than other methods. Third, we present an H.264-based video device matching method. This method can determine whether two video sequences were captured by the same device even if the method has never encountered the device. Our proposed method achieved good performance in a three-fold cross validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. The proposed method is trained using Self-Supervised Learning (SSL), which increases its robustness and makes it capable of handling missing or unseen data. Fifth, we present an efficient approach to computing spectrogram features from MP3 compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of MP3 decoding time. The resulting spectrogram features lead to higher synthetic speech detection performance.
Audio processing; Computer vision; Image and video coding; Image processing; Pattern recognition; Video processing; Digital forensics; Deep learning; Deepfake detection; Video forensics; Audio forensics; Video metadata; Audio metadata; H.264; MP3; MP4; Video manipulation detection; Video compression; Audio compression; Decision tree; Dimensionality reduction; Spectrogram; Graph neural networks; Neural networks; Transformer neural networks
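To make the notion of an MP4/MOV "metadata tree" in entry 62 concrete, the following is a minimal sketch of how the nested box (atom) structure of such a container can be read. It is not the method of the thesis: it only walks the box headers (a 32-bit big-endian size followed by a four-character type) and recurses into a small, assumed set of container types. The names `parse_boxes`, `make_box`, and `CONTAINER_TYPES` are hypothetical.

```python
import struct

# Assumption: a small subset of box types that simply contain other boxes.
CONTAINER_TYPES = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"edts", b"dinf", b"udta"}


def parse_boxes(data, start=0, end=None):
    """Return a list of (type, size, children) tuples for the boxes in data[start:end]."""
    end = len(data) if end is None else end
    boxes = []
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the enclosing data.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop rather than guess
        children = []
        if box_type in CONTAINER_TYPES:
            children = parse_boxes(data, pos + header, pos + size)
        boxes.append((box_type.decode("latin-1"), size, children))
        pos += size
    return boxes


def make_box(box_type, payload):
    """Build a box with a 32-bit size header (helper for the toy example only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    toy = make_box(b"ftyp", b"isom\x00\x00\x00\x00") + make_box(
        b"moov", make_box(b"mvhd", b"\x00" * 4)
    )
    print(parse_boxes(toy))
```

A tree obtained this way (box types, sizes, and nesting) is the kind of structure that could be turned into features or graph nodes; how the thesis encodes it is not described in the abstract.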
63	Aspekte des „Samplings“ [Aspects of "sampling"] Braun, Stefan K. 16 July 2014
Mash-ups (also called bootlegging, bastard pop, or collage) have been growing in popularity for years. Whereas in the early 1990s they usually involved only two different pop songs whose vocal and instrumental tracks were blended together in remix form, today there are multi-mash-ups with several dozen mixed and sampled songs, artists, video sequences, and effects. One challenge is combining widely differing styles and mixing them into new danceable titles from the charts. The mash-up project Pop Danthology, for example, contains 68 different artists in a current music clip of just under six minutes, including Bruno Mars, Britney Spears, Rihanna, and Lady Gaga. Using and sampling other people's music and video titles can constitute copyright infringement. The composers of the song „Nur mir“, performed by singer Sabrina Setlur, lost a legal dispute that went all the way to the BGH (the German Federal Court of Justice). According to the BGH, by sampling a sound recording they infringed the phonogram producer's right of the plaintiffs (the music group Kraftwerk), in that they took two bars of a rhythm sequence from the track „Metall auf Metall“ by way of sampling and used them as backing in their own piece. Rapid technical progress now makes it possible to edit and alter music, film, and image recordings ever more easily, quickly, and well. Computers with editing software have replaced keyboards, synthesizers, and analog multitrack technology. Sampling methods differ from classic piracy in that the adoption of a sample is accompanied by extensive reworking and editing, whereas a pirated copy is characterized by taking over the original unchanged. Unlawful sampling affects the copyrights and neighbouring rights of performing artists as well as the neighbouring rights of phonogram producers. In some circumstances, violations of general personality rights and competition law are also the subject of disputes.
audio authentication; Bastard Pop; Bootlegging; copyright; cryptography; forensics; frequency analysis; interference; Mash-Up; melody; mix production; multi-sampling; neighbouring rights; phase inversion; quotation; real-time frequency analysis; Remix; Sample-Medley; Sampling; single sound sampling; sound collage; sound sampling; Sound Separation; sound sequence sampling; spectrogram; spectrometer measurement; watermarking; ddc:780; rvk:PE 745; rvk:LS 48660
64	Aspekte des „Samplings“: Eine Frage des Sounds? [Aspects of "sampling": a question of sound?] Braun, Stefan K. January 2014
Mash-ups (also called bootlegging, bastard pop, or collage) have been growing in popularity for years. Whereas in the early 1990s they usually involved only two different pop songs whose vocal and instrumental tracks were blended together in remix form, today there are multi-mash-ups with several dozen mixed and sampled songs, artists, video sequences, and effects. One challenge is combining widely differing styles and mixing them into new danceable titles from the charts. The mash-up project Pop Danthology, for example, contains 68 different artists in a current music clip of just under six minutes, including Bruno Mars, Britney Spears, Rihanna, and Lady Gaga. Using and sampling other people's music and video titles can constitute copyright infringement. The composers of the song „Nur mir“, performed by singer Sabrina Setlur, lost a legal dispute that went all the way to the BGH (the German Federal Court of Justice). According to the BGH, by sampling a sound recording they infringed the phonogram producer's right of the plaintiffs (the music group Kraftwerk), in that they took two bars of a rhythm sequence from the track „Metall auf Metall“ by way of sampling and used them as backing in their own piece. Rapid technical progress now makes it possible to edit and alter music, film, and image recordings ever more easily, quickly, and well. Computers with editing software have replaced keyboards, synthesizers, and analog multitrack technology. Sampling methods differ from classic piracy in that the adoption of a sample is accompanied by extensive reworking and editing, whereas a pirated copy is characterized by taking over the original unchanged. Unlawful sampling affects the copyrights and neighbouring rights of performing artists as well as the neighbouring rights of phonogram producers. In some circumstances, violations of general personality rights and competition law are also the subject of disputes.
info:eu-repo/classification/ddc/780; ddc:780
