This thesis is an investigation into how the signature quadratic form distance can be used to search in music. Using the method developed for images by Beecks, Uysal and Seidl as a starting point, I create feature signatures from sound clips by clustering features from their frequency representations. I compare three feature types, based on Fourier coefficients, mel frequency cepstrum coefficients (MFCCs), and the chromatic scale. Two search applications are considered. First, an audio fingerprinting system, where a music file is located from a short recorded clip of the song. I run experiments to see how the system's parameters affect search quality, and show that it achieves some robustness to noise in the queries, though less than comparable state-of-the-art methods. Second, a query-by-humming system, where humming or singing by one user is used to search in humming/singing by other users. Here, none of the tested feature types achieves satisfactory search performance. I identify and discuss some possible limitations of the selected feature types for this task. I believe that this thesis demonstrates the versatility of the feature clustering approach, and may serve as a starting point for further research.
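To make the pipeline in the abstract concrete, the sketch below illustrates one plausible reading of it: cluster a clip's MFCC frames into a feature signature of (centroid, weight) pairs and compare two signatures with the signature quadratic form distance (SQFD). This is not code from the thesis; the use of librosa and scikit-learn, the Gaussian similarity kernel, and the parameter names (n_mfcc, n_clusters, alpha) are assumptions made for this example.

```python
# Illustrative sketch (not from the thesis): build a feature signature by
# clustering MFCC frames, then compare two signatures with the SQFD.
# librosa/scikit-learn and the Gaussian kernel parameter `alpha` are
# assumptions made for this example.
import numpy as np
import librosa
from sklearn.cluster import KMeans


def feature_signature(path, n_mfcc=13, n_clusters=8):
    """Cluster a clip's MFCC frames into (centroids, weights)."""
    y, sr = librosa.load(path, sr=None, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # frames x coeffs
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(mfcc)
    # Weight of each cluster = fraction of frames assigned to it.
    weights = np.bincount(km.labels_, minlength=n_clusters) / len(km.labels_)
    return km.cluster_centers_, weights


def sqfd(sig1, sig2, alpha=1.0):
    """Signature quadratic form distance with a Gaussian similarity kernel."""
    c = np.vstack([sig1[0], sig2[0]])        # all centroids from both signatures
    w = np.concatenate([sig1[1], -sig2[1]])  # weights, second signature negated
    d2 = np.sum((c[:, None, :] - c[None, :, :]) ** 2, axis=-1)
    a = np.exp(-alpha * d2)                  # pairwise similarity matrix
    return float(np.sqrt(max(w @ a @ w, 0.0)))


# Hypothetical usage: compare a stored song against a recorded query clip.
# sig_db = feature_signature("song.wav")
# sig_q = feature_signature("recorded_clip.wav")
# print(sqfd(sig_db, sig_q))
```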
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ntnu-14472 |
Date | January 2011 |
Creators | Hitland, Håkon Haugdal |
Publisher | Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |