Return to search

Audio editing in the time-frequency domain using the Gabor Wavelet Transform

Visualization, processing and editing of audio, directly on a time-frequency surface, is the scope of this thesis. More precisely the scalogram produced by a Gabor Wavelet transform is used, which is a powerful alternative to traditional techinques where the wave form is the main visual aid and editting is performed by parametric filters. Reconstruction properties, scalogram design and enhancements as well audio manipulation algorithms are investigated for this audio representation.The scalogram is designed to allow a flexible choice of time-frequency ratio, while maintaining high quality reconstruction. For this mean, the Loglet is used, which is observed to be the most suitable filter choice.  Re-assignmentare tested, and a novel weighting function using partial derivatives of phase is proposed.  An audio interpolation procedure is developed and shown to perform well in listening tests.The feasibility to use the transform coefficients directly for various purposes is investigated. It is concluded that Pitch shifts are hard to describe in the framework while noise thresh holding works well. A downsampling scheme is suggested that saves on operations and memory consumption as well as it speeds up real world implementations significantly. Finally, a Scalogram 'compression' procedure is developed, allowing the caching of an approximate scalogram.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-153634
Date January 2011
CreatorsHammarqvist, Ulf
PublisherUppsala universitet, Centrum för bildanalys
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationUPTEC F, 1401-5757 ; F 11 022

Page generated in 0.0025 seconds