1

De l'usage des métadonnées dans l'objet sonore / The use of sound objects metadata

Debaecker, Jean. 09 October 2012.
Emotion recognition in music is both an industrial and an academic challenge. In the age of exploding multimedia content, we aim to design structured sets of terms, concepts, and metadata that facilitate the organization of and access to knowledge. Our research question is the following: can we have a priori knowledge of an emotion with a view to its elicitation? In other words, to what extent can the emotions felt while listening to a musical work be recorded as metadata, and can we build a formal algorithmic structure that isolates the mechanism triggering those emotions? Can we know, before listening to a song, the emotion it will produce, and can that emotion be elicited afterwards? Can emotions be formalized so that they can be saved and shared? We give an overview of existing research and of the applicative context, together with a reflection on the epistemological issues intrinsic to the indexing of emotion itself: through a psychological, physiological, and philosophical approach, we set out a conceptual framework of five demonstrations establishing that emotion cannot be measured with a view to its elicitation. Having argued within this theoretical framework that indexing emotions is formally impossible, we then examine the indexing mechanics nonetheless proposed by industry and academia. Through the analysis of quantitative and qualitative surveys, we propose the production of an algorithm that recommends musical works on the basis of emotion.
2

Multimedia Forensics Using Metadata

Ziyue Xiang. 21 February 2024.
<p dir="ltr">The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly indetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from videos, temporal signals from audios) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques using the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer neural network based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files. The localization accuracy of our proposed method is higher compared to other methods. Third, we present an H.264-based video device matching method. This method can determine if the two video sequences are captured by the same device even if the method has never encountered the device. Our proposed method achieved good performance in a three-fold cross validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. The proposed method is trained using Self-Supervised Learning (SSL), which increased the robustness of the proposed method and makes it capable of handling missing/unseen data. Fifth, we present an efficient approach to compute the spectrogram feature with MP3 compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of MP3 decoding time. The resulting spectrogram features lead to higher synthetic speech detection performance.</p>
