41 |
Computational modeling of improvisation in Turkish folk music using Variable-Length Markov Models
Senturk, Sertan 31 August 2011
The thesis describes a new database of uzun havas, a non-metered structured improvisation form in Turkish folk music, and a system that uses Variable-Length Markov Models (VLMMs) to predict the melody in the uzun hava form. The database consists of 77 songs, encompassing 10849 notes, and it is used to train multiple viewpoints, where each event in a musical sequence is represented by parallel descriptors such as Durations and Notes. The thesis also introduces pitch-related viewpoints that are specifically designed to model the unique melodic properties of makam music. The predictive performance of the system is quantitatively evaluated by an entropy-based scheme. In the experiments, the results from pitch-related viewpoints based on the 12-tone scale of Western classical theory and the 17-tone scale of Turkish folk music are compared. It is shown that VLMMs are highly predictive of the note progressions in the transcriptions of uzun havas. This suggests that VLMMs may be applied to makam-based and non-metered musical forms, in addition to Western musical styles. To the best of our knowledge, the work presents the first symbolic, machine-readable database and the first application of computational modeling in Turkish folk music.
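As a rough illustration of the kind of model and entropy-based evaluation the abstract describes, the following Python sketch trains a variable-order Markov predictor on a single symbol sequence and reports the average cross-entropy in bits per symbol. It is a simplified sketch, not the thesis's system: the class name `VariableOrderMarkov`, the longest-matching-context backoff, and the add-one smoothing are all assumptions made for illustration, and only one viewpoint (a plain note sequence) is modeled.

```python
import math
from collections import Counter, defaultdict


class VariableOrderMarkov:
    """Hypothetical single-viewpoint predictor with longest-context backoff."""

    def __init__(self, max_order, alphabet):
        self.max_order = max_order
        self.alphabet = sorted(set(alphabet))
        # Maps a context tuple (length 0..max_order) to counts of the next symbol.
        self.counts = defaultdict(Counter)

    def train(self, sequence):
        for i, symbol in enumerate(sequence):
            for order in range(self.max_order + 1):
                if i - order < 0:
                    break
                context = tuple(sequence[i - order:i])
                self.counts[context][symbol] += 1

    def prob(self, history, symbol):
        # Use the longest suffix of the history that was seen during training.
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = tuple(history[len(history) - order:])
            if context in self.counts:
                seen = self.counts[context]
                total = sum(seen.values())
                # Add-one smoothing (an assumption) keeps probabilities non-zero.
                return (seen[symbol] + 1) / (total + len(self.alphabet))
        return 1 / len(self.alphabet)


def cross_entropy(model, sequence):
    """Average negative log2 probability of each symbol given its history."""
    bits = 0.0
    for i, symbol in enumerate(sequence):
        bits -= math.log2(model.prob(sequence[:i], symbol))
    return bits / len(sequence)


# Toy usage: a strictly alternating "melody" over a two-symbol alphabet.
model = VariableOrderMarkov(max_order=2, alphabet="AB")
model.train(list("ABABABAB"))
print(cross_entropy(model, list("ABAB")))
```

A lower cross-entropy means the model finds the sequence more predictable; a uniform guess over two symbols would score exactly 1 bit per symbol.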
|
42 |
Music complexity: a multi-faceted description of audio content
Streich, Sebastian 21 February 2007
This thesis proposes a set of algorithms that can be used to compute estimates of music complexity facets from musical audio signals. They focus on aspects of acoustics, rhythm, timbre, and tonality. Music complexity is thereby considered on the coarse level of common agreement among human listeners. The target is to obtain complexity judgments through automatic computation that resemble a naive listener's point of view. The motivation for the presented research lies in the enhancement of human interaction with digital music collections. As discussed in the thesis, there is a variety of tasks to be considered, such as collection visualization, playlist generation, or the automatic recommendation of music. Through the music complexity estimates provided by the described algorithms we can obtain access to a level of semantic music description, which allows for novel and interesting solutions to these tasks.
|
43 |
Semantic annotation of music collections: A computational approach
Sordo, Mohamed 27 February 2012
Music consumption has changed drastically in the last few years. With the arrival of digital music, the cost of production has substantially dropped. The expansion of the World Wide Web has helped to promote the exploration of much more music content. Online stores, such as iTunes or Amazon, own music collections in the order of millions of songs. Accessing these large collections in an effective manner is still a big challenge.
In this dissertation we focus on the problem of annotating music collections with semantic words, also called tags. The methods used in this dissertation are based on techniques from the fields of information retrieval, machine learning, and signal processing. We propose an automatic music annotation algorithm that uses content-based audio similarity to propagate tags among songs. The algorithm is evaluated extensively using multiple music collections of varying size and data quality, including a large music collection of more than half a million songs, annotated with social tags derived from a music community. We assess the quality of our proposed algorithm by comparing it with several state-of-the-art approaches. We also discuss the importance of using evaluation measures that cover different dimensions: per-song and per-tag evaluation. Our proposal achieves state-of-the-art results, and has ranked high in the MIREX 2011 evaluation campaign. The obtained results also show some limitations of automatic tagging, related to data inconsistencies, correlation of concepts and the difficulty of capturing some personal tags with content information. This is more evident in music communities, where users can annotate songs with any free-text word. In order to tackle these issues, we present an in-depth study of the nature of music folksonomies. Specifically, we study whether tag annotations made by a large community (i.e. a folksonomy) correspond with a more controlled, structured vocabulary defined by experts in the music and psychology fields. Results reveal that some tags are clearly defined and understood both by the experts and the wisdom of crowds, while it is difficult to reach a consensus on the meaning of other tags. Finally, we extend our previous work to a wide range of semantic concepts. We present a novel way to uncover facets implicit in social tagging, and classify the tags with respect to these semantic facets. The latter findings can help to understand the nature of social tags, and thus be beneficial for further improvement of semantic tagging of music.
Our findings have significant implications for music information retrieval systems that help users explore large music collections, digging for content they might like.
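The idea of propagating tags between songs through content-based audio similarity can be sketched as a weighted nearest-neighbour vote. The Python example below is a minimal illustration under stated assumptions, not the dissertation's algorithm: the feature vectors are made-up two-dimensional stand-ins for real audio descriptors, and the function names, cosine similarity, `k`, and `min_score` threshold are all hypothetical choices.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def propagate_tags(query_features, annotated_songs, k=3, min_score=0.5):
    """Score tags for an untagged song from its k most similar tagged songs.

    annotated_songs is a list of (feature_vector, set_of_tags) pairs.
    Each neighbour votes for its tags with a weight equal to its similarity;
    scores are normalised by the total weight and thresholded.
    """
    ranked = sorted(
        annotated_songs,
        key=lambda song: cosine_similarity(query_features, song[0]),
        reverse=True,
    )
    scores = defaultdict(float)
    total_weight = 0.0
    for features, tags in ranked[:k]:
        weight = cosine_similarity(query_features, features)
        total_weight += weight
        for tag in tags:
            scores[tag] += weight
    if total_weight <= 0:
        return {}
    return {
        tag: score / total_weight
        for tag, score in scores.items()
        if score / total_weight >= min_score
    }


# Toy collection with invented features and tags.
library = [
    ([1.0, 0.0], {"rock", "guitar"}),
    ([0.9, 0.1], {"rock"}),
    ([0.0, 1.0], {"jazz"}),
]
print(propagate_tags([1.0, 0.0], library, k=2, min_score=0.6))
```

With `k=2`, only the two rock songs vote, so "rock" is kept while "guitar", supported by just one of the two neighbours, falls below the 0.6 threshold.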
|
44 |
In Search of Computer Music Analysis: Music Information Retrieval, Optimization, and Machine Learning from 2000-2016
Persaud, Felicia Nafeeza 21 August 2018
My thesis aims to critically examine three methods in the current state of Computer Music Analysis. I will concentrate on Music Information Retrieval, Optimization, and Machine Learning. My goal is to describe and critically analyze each method, then examine the intersection of all three. I will start by looking at David Temperley’s The Cognition of Basic Musical Structures (2001), which offers an outline of major accomplishments before the turn of the 21st century. This outline will provide a method of organization for a large portion of the thesis. I will conclude by explaining the most recent developments in the three methods cited. By following trends in these developments, I can hypothesize about the direction of the field.
|
45 |
Generative, Discriminative, and Hybrid Approaches to Audio-to-Score Automatic Singing Transcription / 自動歌声採譜のための生成的・識別的・混成アプローチ
Nishikimi, Ryo 23 March 2021
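Kyoto University / Doctoral degree by coursework (new system) / Doctor of Informatics / Degree No. 甲第23311号 / 情博第747号 / Call number 新制||情||128 (Main Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / Chief examiner: Associate Professor Kazuyoshi Yoshii; Professor Tatsuya Kawahara, Professor Ko Nishino, Professor Hisashi Kashima / Conferred under Article 4, Paragraph 1 of the Degree Regulations / DFAM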
|
46 |
Positive unlabeled learning applications in music and healthcare
Arjannikov, Tom 10 September 2021
The supervised and semi-supervised machine learning paradigms hinge on the idea that the training data is labeled. The label quality is often brought into question, and problems related to noisy, inaccurate, or missing labels are studied. One of these is an interesting and prevalent problem in semi-supervised classification where only some positive labels are known, while the remaining data, often the majority of what is available, is unlabeled, i.e., there are no negative examples. Known as Positive-Unlabeled (PU) learning, this problem has been identified with increasing frequency across many disciplines, including but not limited to health science, biology, bioinformatics, geoscience, physics, business, and politics. There are also several closely related machine learning problems, such as cost-sensitive learning and mixture proportion estimation.
This dissertation explores the PU learning problem from the perspective of density estimation and proposes a new modular method compatible with the relabeling framework that is common in PU learning literature. This approach is compared with two existing algorithms throughout the manuscript, one from a seminal work by Elkan and Noto and a current state-of-the-art algorithm by Ivanov. Furthermore, this thesis identifies two machine learning application domains that can benefit from PU learning approaches, which were not previously seen that way: predicting length of stay in hospitals and automatic music tagging. Experimental results with multiple synthetic and real-world datasets from different application domains validate the proposed approach.
Accurately predicting the in-hospital length of stay (LOS) at the time of admission can positively impact healthcare metrics, particularly in novel response scenarios such as the Covid-19 pandemic. During the regular steady-state operation, traditional classification algorithms can be used for this purpose to inform planning and resource management. However, when there are sudden changes to the admission and patient statistics, such as during the onset of a pandemic, these approaches break down because reliable training data becomes available only gradually over time. This thesis demonstrates the effectiveness of PU learning approaches in such situations through experiments by simulating the positive-unlabeled scenario using two fully-labeled publicly available LOS datasets.
Music auto-tagging systems are typically trained using tag labels provided by human listeners. In many cases, this labeling is weak, which means that the provided tags are valid for the associated tracks, but there can be tracks for which a tag would be valid but not present. This situation is analogous to PU learning with the additional complication of being a multi-label scenario. Experimental results on publicly available music datasets with tags representing three different labeling paradigms demonstrate the effectiveness of PU learning techniques in recovering the missing labels and improving auto-tagger performance. / Graduate
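For orientation, the calibration step from the Elkan and Noto baseline mentioned above can be sketched in a few lines of Python. This illustrates that baseline only, not the density-estimation method the dissertation proposes. It assumes a classifier has already been trained to separate labeled from unlabeled examples, that positives are labeled completely at random, and that its scores on held-out labeled positives are available; the function names are hypothetical.

```python
def estimate_label_frequency(scores_on_labeled_positives):
    """Estimate c = P(labeled | positive) as the mean classifier score
    over held-out examples that are known (labeled) positives."""
    return sum(scores_on_labeled_positives) / len(scores_on_labeled_positives)


def positive_posterior(score, label_frequency):
    """Rescale a labeled-vs-unlabeled score g(x) into an estimate of
    P(positive | x) = g(x) / c, clipped to at most 1."""
    return min(1.0, score / label_frequency)


# Toy scores from a hypothetical labeled-vs-unlabeled classifier.
c = estimate_label_frequency([0.4, 0.5, 0.6])
print(positive_posterior(0.25, c))
```

In the music tagging setting described above, a track with a missing tag would be an unlabeled example whose rescaled score suggests the tag is nevertheless likely valid.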
|
47 |
The synthesizer programming problem: improving the usability of sound synthesizers
Shier, Jordie 15 December 2021
The sound synthesizer is an electronic musical instrument that has become commonplace in audio production for music, film, television and video games. Despite its widespread use, creating new sounds on a synthesizer - referred to as synthesizer programming - is a complex task that can impede the creative process. The primary aim of this thesis is to support the development of techniques to assist synthesizer users to more easily achieve their creative goals. One of the main focuses is the development and evaluation of algorithms for inverse synthesis, a technique that involves the prediction of synthesizer parameters to match a target sound. Deep learning and evolutionary programming techniques are compared on a baseline FM synthesis problem and a novel hybrid approach is presented that produces high quality results in less than half the computation time of a state-of-the-art genetic algorithm. Another focus is the development of intuitive user interfaces that encourage novice users to engage with synthesizers and learn the relationship between synthesizer parameters and the associated auditory result. To this end, a novel interface (Synth Explorer) is introduced that uses a visual representation of synthesizer sounds on a two-dimensional layout. An additional focus of this thesis is to support further research in automatic synthesizer programming. An open-source library (SpiegeLib) has been developed to support reproducibility, sharing, and evaluation of techniques for inverse synthesis. Additionally, a large-scale dataset of one billion sounds paired with synthesizer parameters (synth1B1) and a GPU-enabled modular synthesizer (torchsynth) are also introduced to support further exploration of the complex relationship between synthesizer parameters and auditory results. / Graduate
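Inverse synthesis, as described above, searches for synthesizer parameters that reproduce a target sound. The toy Python sketch below shows the shape of an evolutionary search on a two-parameter FM synthesizer. It is an assumption-laden illustration and does not use the SpiegeLib or torchsynth APIs: the synthesizer, the time-domain mean squared error (real systems typically compare spectral features), the mutation sizes, and all function names are invented for this example.

```python
import math
import random


def fm_synth(carrier_hz, mod_index, mod_hz=110.0, n=256, sample_rate=8000):
    """Render n samples of a simple two-operator FM tone."""
    return [
        math.sin(
            2 * math.pi * carrier_hz * t / sample_rate
            + mod_index * math.sin(2 * math.pi * mod_hz * t / sample_rate)
        )
        for t in range(n)
    ]


def error(params, target):
    """Mean squared error between the rendered candidate and the target."""
    candidate = fm_synth(*params)
    return sum((c - t) ** 2 for c, t in zip(candidate, target)) / len(target)


def evolve(target, generations=50, offspring=20, seed=0):
    """Elitist search: keep the best parameters, mutate them, accept improvements."""
    rng = random.Random(seed)
    best = (rng.uniform(100.0, 1000.0), rng.uniform(0.0, 5.0))
    best_error = error(best, target)
    for _ in range(generations):
        for _ in range(offspring):
            child = (best[0] + rng.gauss(0, 20.0), best[1] + rng.gauss(0, 0.3))
            child_error = error(child, target)
            if child_error < best_error:
                best, best_error = child, child_error
    return best, best_error


# Try to recover the parameters of a known target sound.
target = fm_synth(440.0, 2.0)
params, final_error = evolve(target)
print(params, final_error)
```

Because only improvements are accepted, the error never increases between generations, but such a search can still stall in a local minimum, which is one reason the thesis compares several approaches.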
|
48 |
A cross-cultural listener-based study on perceptual features in K-pop / En korskulturell lyssnarbaserad studie på perceptuella särdrag i K-pop
Schön, Ragnar January 2015
Recent research within the Music Information Retrieval (MIR) field has shown the relevance of perceptual features for musical signals. The idea is to identify a small set of features that are natural descriptions from a perceptual perspective. The notion of perceptual features is based on the ecological approach to music, that is, focussing on sound events rather than spectral information. Furthermore, MIR research has overemphasized Western music and listeners. This leads to the question of whether the concept of perceptual features is culturally independent or not. This was investigated by having listeners of two distinct cultural backgrounds (Swedish and Chinese) rate a set of eight perceptual features: dissonance, speed, rhythmic complexity, rhythmic clarity, articulation, harmonic complexity, modality and pitch. A culturally specific dataset consisting of Korean pop songs was used to provide the stimuli. This was a subset of a larger set of songs from a previous study, selected based on genre and mood annotations to create a diverse dataset. The listener ratings were evaluated by a variety of statistical measures, including cross-correlation and ANOVA. It was found that there was a small but significant difference in the ratings of the perceptual features speed and rhythmic complexity between the two cultural groups.
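As a reminder of what the ANOVA comparison between listener groups computes, the Python sketch below evaluates the one-way ANOVA F statistic for a set of rating groups. The ratings are invented numbers, not data from the study, and the function name is hypothetical; judging significance would additionally require comparing F against the F distribution with the corresponding degrees of freedom.

```python
def one_way_anova_f(groups):
    """F statistic: between-group variance divided by within-group variance."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Invented "speed" ratings for one song from two listener groups.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
print(one_way_anova_f([group_a, group_b]))  # 1.5
```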
|
49 |
Digital Humanities in der Musikwissenschaft – Computergestützte Erschließungsstrategien und Analyseansätze für handschriftliche Liedblätter
Burghardt, Manuel 03 December 2019
This article presents an ongoing project for the computer-based transcription and analysis of handwritten music scores from a large collection of folk tunes from the German-speaking world. Based on this practical project, I will discuss the challenges and opportunities that arise when using Digital Humanities methods in musicology.
|
50 |
A convolutive model for polyphonic instrument identification and pitch detection using combined classification
Weese, Joshua L. January 1900
Master of Science / Department of Computing and Information Sciences / William H. Hsu
Pitch detection and instrument identification can be achieved with relatively high accuracy when considering monophonic signals in music; however, accurately classifying polyphonic signals in music remains an unsolved research problem. Pitch and instrument classification is a subset of Music Information Retrieval (MIR) and automatic music transcription, both having numerous research and real-world applications. Several areas of research are covered in this thesis, including the fast Fourier transform, onset detection, convolution, and filtering. Basic music theory and terms are also presented in order to explain the context and structure of the data used. The focus of this thesis is on the representation of musical signals in the frequency domain. Polyphonic signals with many different voices and frequencies can be exceptionally complex. This thesis presents a new model for representing the spectral structure of polyphonic signals: Uniform MAx Gaussian Envelope (UMAGE). The new spectral envelope closely approximates the distribution of frequency components in the spectrum while remaining resilient to rapid oscillation (noise), and it generalizes well without losing the representation of the original spectrum. When subjectively compared to other spectral envelope methods, such as the linear predictive coding envelope method and the cepstrum envelope method, UMAGE is able to model high-order polyphonic signals without dropping partials (frequencies present in the signal). In other words, UMAGE is able to model a signal independently of the signal’s periodicity. The performance of UMAGE is evaluated both objectively and subjectively. It is shown that UMAGE is robust at modeling the distribution of frequencies in simple and complex polyphonic signals.
Combined classification (combiners), a methodology for learning large concepts, is used to simplify the learning process and boost classification results. The output of each learner is then averaged to get the final result. UMAGE is less accurate when identifying pitches; however, it identifies instrument groups on order-10 polyphonic signals (ten voices) with an accuracy that is competitive with the current state of the field.
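The averaging step of combined classification can be illustrated with a short Python sketch. It assumes each learner emits a score per class label (here, instrument groups) over the same label set; the function names and the example scores are hypothetical and not taken from the thesis.

```python
def average_combiner(learner_outputs):
    """Average the per-label scores produced by several learners.

    learner_outputs is a list of dicts that all share the same labels.
    """
    labels = learner_outputs[0].keys()
    count = len(learner_outputs)
    return {
        label: sum(output[label] for output in learner_outputs) / count
        for label in labels
    }


def predict(learner_outputs):
    """Return the label with the highest averaged score."""
    combined = average_combiner(learner_outputs)
    return max(combined, key=combined.get)


# Two hypothetical learners scoring the same audio frame.
outputs = [
    {"strings": 0.6, "brass": 0.4},
    {"strings": 0.2, "brass": 0.8},
]
print(predict(outputs))  # brass
```

Averaging lets a confident learner outweigh an uncertain one, which is how the second learner's strong vote for "brass" decides the result here.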
|