
Score-informed musical source separation and reconstruction

<p>We introduce a systematic approach to retrieving individual parts from a monaural music recording, given its score. We are interested in isolating the accompaniment by removing the solo part from a recording of concerto music, in which a solo instrument is accompanied by an orchestra. We require the music audio, the score, and optionally a sample library of individual notes played in isolation. Our approach is based on explicit knowledge of the musical audio at the semantic level (notes or chords), obtained from an audio-score alignment. Such knowledge allows the spectrogram energy to be decomposed into note-based models that can be trained with the sample library. Our approach is divided into two stages: (1) "masking," which estimates a solo mask used to remove the solo, and (2) "reconstruction," which imputes the harmonics of the orchestra notes that are inevitably damaged in masking. </p><p> In "masking," we estimate a 2-dimensional binary mask that classifies each time-frequency cell of the short-time Fourier transform (STFT) spectrogram as either solo or accompaniment. We mainly employ an expectation-maximization (EM) algorithm to decompose the spectrogram magnitude into note-based models. In this process of "erasing" the soloist&rsquo;s contribution to the mixture by applying the mask, the remaining orchestra is degraded. In "reconstruction," we propose a novel technique to repair such degradation. We use a state-space model for each note partial, which is represented by a slowly changing amplitude envelope and an "unwrapped" phase sequence. This amplitude-phase representation can be computed by Kalman smoothing. It allows us to "transpose" intact partials of the orchestra part onto the degraded time-frequency region. Objective metrics and subjective listening tests on real and synthesized musical audio data are used for evaluation and parameter optimization. </p>
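The "masking" stage described in the abstract can be illustrated with a minimal sketch. It assumes that magnitude estimates for the solo and the accompaniment are already available on the same time-frequency grid as the mixture STFT, and it uses a simple "larger magnitude wins" rule to build the binary mask. The function names `binary_solo_mask` and `remove_solo`, the toy arrays, and the decision rule are illustrative assumptions; the thesis derives its mask from EM-fitted note-based models, which are not reproduced here.

```python
import numpy as np


def binary_solo_mask(solo_model, accomp_model):
    """Return a boolean mask that is True where the solo estimate dominates.

    Both inputs are magnitude estimates with shape (frequency bins, frames).
    The "larger magnitude wins" rule is an assumption made for illustration.
    """
    solo_model = np.asarray(solo_model, dtype=float)
    accomp_model = np.asarray(accomp_model, dtype=float)
    if solo_model.shape != accomp_model.shape:
        raise ValueError("model spectrograms must have the same shape")
    return solo_model > accomp_model


def remove_solo(mixture_stft, solo_mask):
    """Zero every time-frequency cell of the mixture that is classified as solo."""
    mixture_stft = np.asarray(mixture_stft)
    return np.where(solo_mask, 0.0, mixture_stft)


# Toy example: 2 frequency bins by 2 frames (hypothetical values).
solo = np.array([[0.9, 0.1], [0.2, 0.8]])
accomp = np.array([[0.1, 0.7], [0.6, 0.3]])
mixture = np.array([[1 + 1j, 2 + 0j], [0 + 3j, 4 - 1j]])

mask = binary_solo_mask(solo, accomp)
accompaniment_estimate = remove_solo(mixture, mask)
```

Cells zeroed by the mask may also have held orchestra energy, which is the degradation the "reconstruction" stage is meant to repair.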

Identifier: oai:union.ndltd.org:PROQUEST/oai:pqdtoai.proquest.com:3609061
Date: 26 February 2014
Creators: Han, Yushen
Publisher: Indiana University
Source Sets: ProQuest.com
Language: English
Detected Language: English
Type: thesis
