1 |
A Geometric Approach to Pattern Matching in Polyphonic Music. Tanur, Luke. January 2005.
The music pattern matching problem involves finding matches of a small fragment of music called the "pattern" into a larger body of music called the "score". We represent music as a series of horizontal line segments in the plane, and reformulate the problem as finding the best translation of a small set of horizontal line segments into a larger set of horizontal line segments. We present an efficient algorithm that can handle general weight models that measure the musical quality of a match of the pattern into the score, allowing for approximate pattern matching.
We give an algorithm with running time <em>O</em>(<em>nm</em>(<em>d</em> + log <em>m</em>)), where <em>n</em> is the size of the score, <em>m</em> is the size of the pattern, and <em>d</em> is the size of the discrete set of musical pitches used. Our algorithm compares favourably to previous approaches to the music pattern matching problem. We also demonstrate that this geometric formulation of the music pattern matching problem is unlikely to have a significantly faster algorithm since it is at least as hard as 3SUM, a basic problem that is conjectured to have no subquadratic algorithm. Lastly, we present experiments to show how our algorithm can find musically sensible variations of a theme, as well as polyphonic musical patterns in a polyphonic score.
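The geometric formulation can be illustrated with a brute-force sketch (this is O(n²m²), nothing like the thesis's O(nm(d + log m)) algorithm, and purely illustrative): each note is a horizontal segment (onset, offset, pitch), every pairing of a pattern note with a score note proposes a candidate translation, and a translation's weight is the total time overlap between translated pattern segments and score segments at matching pitches.

```python
def overlap(a, b):
    """Length of the overlap between time intervals a=(start, end) and b."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def best_translation(pattern, score):
    """Brute-force search for the best (time, pitch) shift of the pattern.

    Notes are (onset, offset, pitch) triples.  Every pairing of a pattern
    note with a score note proposes a shift; the shift whose translated
    pattern overlaps the score the most (at matching pitches) wins.
    """
    best_weight, best_shift = 0, None
    for ps, pe, pp in pattern:
        for ss, se, sp in score:
            dt, dp = ss - ps, sp - pp  # candidate time and pitch shift
            weight = sum(
                overlap((qs + dt, qe + dt), (ts, te))
                for qs, qe, qp in pattern
                for ts, te, tp in score
                if qp + dp == tp
            )
            if weight > best_weight:
                best_weight, best_shift = weight, (dt, dp)
    return best_weight, best_shift
```

For example, matching the two-note pattern [(0, 1, 60), (1, 2, 62)] against a score containing the same motif shifted ten beats later and five semitones higher recovers the shift (10, 5) with total overlap 2.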
|
2 |
Music metadata capture in the studio from audio and symbolic data. Hargreaves, Steven. January 2014.
Music Information Retrieval (MIR) tasks, in the main, are concerned with the accurate generation of one of a number of different types of music metadata – beat onsets, or melody extraction, for example. Almost always, they operate on fully mixed digital audio recordings. Commonly, this means that a large amount of signal processing effort is directed towards the isolation, and then identification, of certain highly relevant aspects of the audio mix. In some cases, results of one MIR algorithm are useful, if not essential, to the operation of another – a chord detection algorithm, for example, is highly dependent upon accurate pitch detection. Although not clearly defined in all cases, certain rules exist which we may take from music theory in order to assist the task – the particular note intervals which make up a specific chord, for example. On the question of generating accurate, low-level music metadata (e.g. chromatic pitch and score onset time), a potentially huge advantage lies in the use of multitrack, rather than mixed, audio recordings, in which the separate instrument recordings may be analysed in isolation. Additionally, in MIR, as in many other research areas currently, there is an increasing push towards the use of the Semantic Web for publishing metadata using the Resource Description Framework (RDF). Semantic Web technologies, though, also facilitate the querying of data via the SPARQL query language, as well as logical inferencing via the careful creation and use of Web Ontology Language (OWL) ontologies. This, in turn, opens up the intriguing possibility of deferring our decision regarding which particular type of MIR query to ask of our low-level music metadata until some point later down the line, long after all the heavy signal processing has been carried out.
In this thesis, we describe an over-arching vision for an alternative MIR paradigm, built around the principles of early, studio-based metadata capture, and exploitation of open, machine-readable Semantic Web data. Using the specific example of structural segmentation, we demonstrate that by analysing multitrack rather than mixed audio, we are able to achieve a significant and quantifiable increase in the accuracy of our segmentation algorithm. We also provide details of a new multitrack audio dataset with structural segmentation annotations, created as part of this research, and available for public use. Furthermore, we show that it is possible to fully implement a pair of pattern discovery algorithms (the SIA and SIATEC algorithms – highly applicable to, but not restricted to, symbolic music data analysis) using only Semantic Web technologies – the SPARQL query language, acting on RDF data, in tandem with a small OWL ontology. We describe the challenges encountered by taking this approach, the particular solution we've arrived at, and we evaluate the implementation both in terms of its execution time, and also within the wider context of our vision for a new MIR paradigm.
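For readers unfamiliar with SIA: given a set of note points, it finds, for every translation vector, the maximal set of points translatable by that vector within the piece. A plain-Python sketch of that core idea (the thesis implements it in SPARQL over RDF, which this does not attempt to reproduce):

```python
from collections import defaultdict

def sia(points):
    """Map each translation vector to its maximal translatable pattern.

    points: a set of (onset, pitch) pairs.  For every lexicographically
    ordered pair (p, q) the difference vector q - p is recorded; the set
    of origin points sharing a vector is the maximal pattern translatable
    by that vector (the heart of Meredith et al.'s SIA).
    """
    pts = sorted(points)
    mtps = defaultdict(set)
    for i, p in enumerate(pts):
        for q in pts[i + 1:]:
            vec = (q[0] - p[0], q[1] - p[1])
            mtps[vec].add(p)
    return dict(mtps)
```

On the four notes {(0, 60), (1, 62), (2, 60), (3, 62)} – a two-note motif repeated two beats later – the vector (2, 0) maps to exactly that motif, {(0, 60), (1, 62)}.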
|
4 |
Compression models and complexity criteria for the description and inference of music structure. Guichaoua, Corentin. 19 September 2017.
A very broad definition of music structure is to consider everything that distinguishes music from random noise as part of its structure. In this thesis, we take interest in the macroscopic aspects of music structure, especially the decomposition of musical pieces into autonomous segments (typically, sections) and their characterisation as groupings of jointly compressible elementary units. An important assumption of this work is to establish a link between the inference of music structure and information-theoretic concepts such as complexity and entropy. We thus build upon the hypothesis that structural segments can be inferred through data-compression schemes. In the first part of this work, we study Straight-Line Grammars (SLGs), a family of formal grammars originally used for the discovery of repetitive structure in biological sequences (Gallé, 2011), and we explore their use for modelling musical sequences.
The SLG approach compresses sequences based on their occurrence statistics, modelling their hierarchical organisation as a tree. We develop several adaptations of this method to handle approximate repetitions, and we study several criteria aimed at regularising the solutions obtained. The second part of this thesis develops and explores a novel approach to the inference of music structure, based on the optimisation of a tensorial compression criterion. This approach aims to compress musical information on several time-scales simultaneously, by exploiting the similarity relations, logical progressions and systems of analogy embedded in musical segments. The proposed method is first introduced from a formal point of view, then presented as a compression scheme rooted in a multi-scale extension of the System & Contrast model (Bimbot et al., 2012) to hypercubic tensorial patterns. Furthermore, we generalise the approach to other, irregular, tensorial patterns, in order to account for the great variety of structural organisations observed in musical segments. The methods studied in this thesis are tested on a structural segmentation task over symbolic data: chord sequences from pop songs (RWC-Pop). The methods are evaluated and compared on several types of chord sequences, and the results establish the attractiveness of complexity-criterion approaches for structure analysis and music information retrieval, with the best variants reaching F-measures of around 70%. To conclude, we recapitulate the main contributions of this work and discuss possible extensions of the studied paradigms: application to other musical dimensions, inclusion of musicological knowledge, and possible use on audio data.
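The flavour of straight-line-grammar compression can be conveyed with a greedy digram-substitution sketch in the spirit of Re-Pair (the SLG inference discussed above uses far more sophisticated search; this toy version is only illustrative):

```python
from collections import Counter

def digram_grammar(seq):
    """Greedy straight-line-grammar construction over a symbol sequence.

    Repeatedly replaces the most frequent adjacent pair with a fresh
    non-terminal until no pair occurs twice, returning the compressed
    top-level sequence and the grammar rules.
    """
    seq, rules = list(seq), {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)  # first-seen pair wins ties
        if pairs[pair] < 2:
            break
        nt = f"R{len(rules)}"
        rules[nt] = pair
        out, i = [], 0
        while i < len(seq):  # left-to-right, non-overlapping rewrite
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules
```

A repeated four-chord loop C–G–Am–F compresses to a single non-terminal used twice, with the hierarchy of the phrase recorded in the rules.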
|
5 |
Bridging the gap: cognitive approaches to musical preference using large datasets. Barone, Michael D. 11 1900.
Using a large dataset of digital music downloads, this thesis examines the extent to which cognitive-psychology research can generate and predict user behaviours relevant to the distinct fields of computer science and music perception. Three distinct topics are explored. Topic one describes the current difficulties with using large digital music resources for cognitive research, and provides a solution by linking metadata through a complex validation process. Topic two uses this enriched information to explore the extent to which extracted acoustic features influence genre preferences, in light of personality and mood research; analysis suggests that acoustic features which are pronounced in an individual's preferred genre influence choice when selecting less-preferred genres. Topic three examines whether metrics of music listening behaviour can be derived and validated from social-psychological research; results support the notion that user behaviours can be derived and validated using an informed psychological background, and may be more useful than acoustic features for a variety of computational music tasks. A primary motivation for this thesis was to approach interdisciplinary music research in two ways: (1) utilize a shared understanding of statistical learning as a theoretical framework underpinning prediction and interpretation; and (2) provide resources and approaches to the analysis of "big data" which are experimentally valid and psychologically useful. The unique strengths of this interdisciplinary approach, and the weaknesses that remain, are then addressed by discussing refined analyses and future directions. / Thesis / Master of Science (MSc) / This thesis examines whether research from cognitive psychology can be used to inform and predict behaviours germane to computational music analysis, including genre choice, music feature preference, and consumption patterns, from data provided by digital-music platforms.
Specific topics of focus include: information integrity and consistency of large datasets, whether signal processing algorithms can be used to assess music preference across multiple genres, and the degree to which consumption behaviours can be derived and validated using more traditional experimental paradigms. Results suggest that psychologically motivated research can provide useful insights and metrics in the computationally focused area of global music consumption behaviour and digital music analysis. Limitations that remain within this interdisciplinary approach are addressed by providing refined analysis techniques for future work.
|
6 |
Visualization of similarities in song data sets. Ono, Jorge Henrique Piazentin. 30 June 2015.
Music collections are widely available on the internet and, thanks to growing storage capacity and transmission speeds, users can access an almost unlimited number of compositions. This has led to a growing demand for automated methods of organising, retrieving and processing music data. Information visualization is a research area that enables the visual analysis of large data sets, making it a valuable tool for the exploration of music libraries. In this dissertation, methodologies for the development of two music visualization techniques are proposed. The first, the Similarity Graph, enables the exploration of a data set in terms of hierarchical similarities. The second, Concentric RadViz, represents the data in terms of classification tasks and lets the user alter the visualization according to their interests. Both techniques are able to reveal interesting structures in the data, aiding its understanding and exploration.
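The classic RadViz layout that Concentric RadViz builds on can be sketched in a few lines: one anchor per feature dimension is spaced evenly on the unit circle, and each record is placed at the average of the anchors weighted by its (non-negative) feature values. This is the standard technique only, not the thesis's concentric extension:

```python
import math

def radviz(rows, n_dims):
    """Classic RadViz projection of non-negative feature vectors to 2D.

    Anchors sit evenly on the unit circle; each row lands at the
    anchor average weighted by its feature values, so a record
    dominated by one feature is pulled towards that feature's anchor.
    """
    anchors = [
        (math.cos(2 * math.pi * j / n_dims), math.sin(2 * math.pi * j / n_dims))
        for j in range(n_dims)
    ]
    out = []
    for row in rows:
        total = sum(row) or 1.0  # avoid division by zero for all-zero rows
        x = sum(w * ax for w, (ax, _) in zip(row, anchors)) / total
        y = sum(w * ay for w, (_, ay) in zip(row, anchors)) / total
        out.append((x, y))
    return out
```

A record with all its weight on the first feature lands exactly on that feature's anchor, while a record with equal weights lands at the centre – which is also RadViz's well-known ambiguity, since very different balanced records collapse to the same point.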
|
7 |
Automated analysis and transcription of rhythm data and their use for composition. Boenn, Georg. January 2011.
No description available.
|
8 |
Explaining listener differences in the perception of musical structure. Smith, Jordan. January 2014.
State-of-the-art models for the perception of grouping structure in music do not attempt to account for disagreements among listeners. But understanding these disagreements, sometimes regarded as noise in psychological studies, may be essential to fully understanding how listeners perceive grouping structure. Over the course of four studies in different disciplines, this thesis develops and presents evidence to support the hypothesis that attention is a key factor in accounting for listeners' perceptions of boundaries and groupings, and hence a key to explaining their disagreements. First, we conduct a case study of the disagreements between two listeners. By studying the justifications each listener gave for their analyses, we argue that the disagreements arose directly from differences in attention, and indirectly from differences in information, expectation, and ontological commitments made in the opening moments. Second, in a large-scale corpus study, we study the extent to which acoustic novelty can account for the boundary perceptions of listeners. The results indicate that novelty is correlated with boundary salience, but that novelty is a necessary but not sufficient condition for being perceived as a boundary. Third, we develop an algorithm that optimally reconstructs a listener's analysis in terms of the patterns of similarity within a piece of music. We demonstrate how the output can identify good justifications for an analysis and account for disagreements between two analyses. Finally, having introduced and developed the hypothesis that disagreements between listeners may be attributable to differences in attention, we test the hypothesis in a sequence of experiments. We find that by manipulating the attention of participants, we are able to influence the groupings and boundaries they find most salient.
From the sum of this research, we conclude that a listener's attention is a crucial factor affecting how listeners perceive the grouping structure of music.
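One standard way to operationalise acoustic novelty is Foote's method: slide a checkerboard kernel along the diagonal of a self-similarity matrix and read boundary candidates off the peaks of the resulting curve. The thesis's corpus study is not tied to this exact implementation; the sketch below uses a toy similarity measure (feature equality) purely to show the mechanism:

```python
def novelty_curve(features, k=2):
    """Foote-style novelty over a sequence of frame features.

    Builds a binary self-similarity matrix (1.0 where features match),
    then correlates a 2k-by-2k checkerboard kernel along the diagonal:
    same-side cells count positively, cross-boundary cells negatively,
    so the curve peaks where two internally similar blocks meet.
    """
    n = len(features)
    sim = [[1.0 if features[i] == features[j] else 0.0 for j in range(n)]
           for i in range(n)]
    nov = [0.0] * n
    for t in range(k, n - k):
        s = 0.0
        for u in range(-k, k):
            for v in range(-k, k):
                sign = 1.0 if (u < 0) == (v < 0) else -1.0
                s += sign * sim[t + u][t + v]
        nov[t] = s
    return nov
```

On a sequence of four "A" frames followed by four "B" frames, the curve peaks exactly at the section boundary (index 4).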
|
9 |
Language of music: a computational model of music interpretation. McLeod, Andrew Philip. January 2018.
Automatic music transcription (AMT) is commonly defined as the process of converting an acoustic musical signal into some form of musical notation, and can be split into two separate phases: (1) multi-pitch detection, the conversion of an audio signal into a time-frequency representation similar to a MIDI file; and (2) converting from this time-frequency representation into a musical score. A substantial amount of AMT research in recent years has concentrated on multi-pitch detection, and yet, in the case of the transcription of polyphonic music, there has been little progress. There are many potential reasons for this slow progress, but this thesis concentrates on the (lack of) use of music language models during the transcription process. In particular, a music language model would impart to a transcription system the background knowledge of music theory upon which a human transcriber relies. In the related field of automatic speech recognition, it has been shown that the use of a language model drawn from the field of natural language processing (NLP) is an essential component of a system for transcribing spoken word into text, and there is no reason to believe that music should be any different. This thesis will show that a music language model inspired by NLP techniques can be used successfully for transcription. In fact, this thesis will create the blueprint for such a music language model. We begin with a brief overview of existing multi-pitch detection systems, in particular noting four key properties which any music language model should have to be useful for integration into a joint system for AMT: it should (1) be probabilistic, (2) not use any data a priori, (3) be able to run on live performance data, and (4) be incremental. We then investigate voice separation, creating a model which achieves state-of-the-art performance on the task, and show that, used as a simple music language model, it improves multi-pitch detection performance significantly. 
This is followed by an investigation of metrical detection and alignment, where we introduce a grammar crafted for the task which, combined with a beat-tracking model, achieves state-of-the-art results on metrical alignment. This system's success adds more evidence to the long-existing hypothesis that music and language consist of extremely similar structures. We end by investigating the joint analysis of music, in particular showing that a combination of our two models running jointly outperforms each running independently. We also introduce a new joint, automatic, quantitative metric for the complete transcription of an audio recording into an annotated musical score, something which the field currently lacks.
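A common baseline for the voice-separation task mentioned above assigns each incoming note, in onset order, to the free voice whose last pitch is closest (a greedy, incremental sketch of the pitch-proximity principle; the thesis's probabilistic model is substantially richer):

```python
def separate_voices(notes, max_voices=4):
    """Greedy online voice separation by pitch proximity.

    notes: list of (onset, pitch) pairs sorted by onset.  Each note
    joins the voice with the nearest last pitch among voices that are
    free (their last note began strictly earlier); otherwise a new
    voice is opened, up to max_voices.
    """
    voices = []  # each voice is a list of (onset, pitch)
    for onset, pitch in notes:
        free = [
            (abs(v[-1][1] - pitch), i)
            for i, v in enumerate(voices)
            if v[-1][0] < onset  # voice has no note at this onset yet
        ]
        if free:
            voices[min(free)[1]].append((onset, pitch))
        elif len(voices) < max_voices:
            voices.append([(onset, pitch)])
        else:
            # all voices occupied at this onset: fall back to the nearest
            nearest = min((abs(v[-1][1] - pitch), i)
                          for i, v in enumerate(voices))
            voices[nearest[1]].append((onset, pitch))
    return voices
```

On two interleaved lines (an ascending melody against a descending one), the greedy rule keeps each line in its own voice.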
|
10 |
Emotions and GRACE: towards a unified computational model of emotions, applied to the music-listening activity of a dancing robot. Dang, Thi-Hai-Ha. 13 January 2012.
Emotion, as psychologists such as A. Damasio, K. R. Scherer and P. Ekman have shown, is an essential factor for human beings in decision-making, learning, creativity, and social interaction. Based on this observation, researchers in human-machine interaction have been interested in adding emotional abilities to their applications.
With the same goal, we first propose GRACE, a generic model of emotions for computational applications. We based our model in particular on the psychological theory of Klaus R. Scherer, whose work aims at a theory of the emotional process that can be modelled and computed. We demonstrate the relevance of our model by comparing it to existing computational models of emotions in informatics and robotics. While GRACE is generic, we also show that it can be instantiated in a particular context, namely human-robot interaction through the musical modality, by developing two of its components: Cognitive Interpretation and Expression. For Cognitive Interpretation, we worked on extracting the emotional content of musical excerpts; our contribution consists in proposing a reduced set of musical features for this task, and in validating the analysis module via a learning system on a large database annotated by an expert in musicology. For Expression, we worked on the design of emotionally expressive movements for a mobile robot. We showed experimentally that, through very limited expressive means (movements in space, camera movements) and dance-inspired motions, the robot could nevertheless satisfactorily convey a small set of basic emotions (happiness, sadness, anger, serenity) to people.
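The four emotions the robot expresses correspond to the four quadrants of Russell's valence–arousal circumplex, a representation widely used in music-emotion work. A toy mapping (purely illustrative, with hypothetical thresholds; this is not the GRACE model itself):

```python
def quadrant_emotion(valence, arousal):
    """Map a (valence, arousal) pair in [-1, 1]^2 to one of the four
    basic emotions named above, using the circumplex quadrants:
    positive arousal splits into happiness/anger by valence,
    negative arousal into serenity/sadness.
    """
    if arousal >= 0:
        return "happiness" if valence >= 0 else "anger"
    return "serenity" if valence >= 0 else "sadness"
```

A fast, bright excerpt (high valence, high arousal) would thus be tagged "happiness", while a slow, consonant one (positive valence, low arousal) would be tagged "serenity".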
|