This thesis addresses the enhancement of singing voice separated from harmonic (pitched) and percussive musical instruments in songs recorded with a single microphone. Singing voice separation has applications in music information retrieval systems. Various methods have been used to separate singing voice from harmonic and percussive instruments. Most of them use two stages of separation, one for harmonic instruments and the other for percussive instruments. One of these algorithms uses non-negative matrix factorization in each stage. Traditionally, in each stage, the components' bases or gains are clustered based on discontinuity measures. The first contribution of this thesis was the use of local discontinuity of significant parts of these bases and gains, followed by splitting (rather than classifying) each component's basis or gain. This significantly refined the separated voice and music sources. Median filtering has also been used in two stages to separate singing voice. Typically, horizontal and vertical filters are used in each stage. The second contribution of this thesis was to enhance the separation quality using a combination of six additional diagonal median filters to accommodate the frequency modulations of the singing voice. In addition, filter parameters that are suitable for all songs, regardless of their sampling frequencies, were sought. The third contribution of this research was the novel use of the Hough transform to detect traces of pitched instruments in the magnitude spectrogram of the separated voice. These traces are then removed completely using median filtering once their frequency bands have been calculated. The new Hough-transform-based approach was applied to a number of separation algorithms as a post-processing step, and it significantly improved the quality of the separated voice and music in all of them.
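As a rough illustration of the baseline that the abstract mentions (horizontal and vertical median filters applied to a magnitude spectrogram), the sketch below may help. It is not the thesis's method: it omits the six diagonal filters, the two-stage pipeline, and the Hough transform step. The function names, the kernel lengths, and the hard-mask assignment rule are all assumptions made for this example, and the spectrogram is represented as a plain list of rows (one row per frequency bin) to keep the code dependency-free.

```python
from statistics import median


def median_filter_rows(spec, k):
    """Median-filter every row with an odd window k (truncated at the edges)."""
    half = k // 2
    out = []
    for row in spec:
        n = len(row)
        out.append([median(row[max(0, t - half):min(n, t + half + 1)])
                    for t in range(n)])
    return out


def transpose(matrix):
    """Swap the frequency and time axes of a list-of-rows matrix."""
    return [list(col) for col in zip(*matrix)]


def separate(spec, k_time=3, k_freq=3):
    """Split a magnitude spectrogram into harmonic and percussive parts.

    spec[f][t] is the magnitude at frequency bin f and time frame t.
    Hypothetical helper: kernel sizes and the hard mask are assumptions.
    """
    # Horizontal filter: smoothing along time keeps sustained (pitched) ridges.
    harm_est = median_filter_rows(spec, k_time)
    # Vertical filter: smoothing along frequency keeps broadband onsets.
    perc_est = transpose(median_filter_rows(transpose(spec), k_freq))

    harmonic, percussive = [], []
    for row, h_row, p_row in zip(spec, harm_est, perc_est):
        # Hard mask: each bin goes to whichever estimate is larger
        # (ties are assigned to the harmonic part).
        harmonic.append([v if h >= p else 0.0
                         for v, h, p in zip(row, h_row, p_row)])
        percussive.append([v if h < p else 0.0
                           for v, h, p in zip(row, h_row, p_row)])
    return harmonic, percussive


# Toy input: one sustained tone (row 2) crossed by one click (column 2).
toy = [[1.0 if (f == 2 or t == 2) else 0.0 for t in range(5)] for f in range(5)]
toy_harmonic, toy_percussive = separate(toy)
```

On the toy input, the sustained tone ends up in the harmonic output and the click in the percussive output. A diagonal median filter of the kind the abstract describes would, presumably, take its window along a slanted line in the time-frequency plane instead of along a single row or column; that variant is not shown here.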
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:764831 |
Date | January 2017 |
Creators | Deif, Hatem |
Contributors | Gan, L. ; Alhashmi, S. |
Publisher | Brunel University |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://bura.brunel.ac.uk/handle/2438/15587 |