This thesis presents a spatial sound rendering system for use in immersive virtual environments. Spatial sound rendering aims to artificially reproduce the acoustics of a space. It has many applications, such as music production, movies, electronic gaming, and teleconferencing. Conventionally, spatial sound rendering is implemented by digital signal processing algorithms derived from perceptual models or simplified physical models. While flexible and/or efficient, these models cannot faithfully capture the acoustical impression of a space. On the other hand, convolving the sound sources with properly measured impulse responses produces the highest possible fidelity, but it is impractical for many applications: each impulse response corresponds to a single source/listener configuration, so neither the sources nor the listeners can be relocated.
In this thesis, techniques for measuring multichannel room impulse responses (MMRIR) are reviewed. Then, methods for analyzing measured MMRIR and for rendering virtual acoustical environments based on that analysis are presented and evaluated. The analysis can be performed off-line. During this stage, a set of filters that represent the characteristics of the air and walls inside the acoustic space is obtained. Based on the assumption that the MMRIR acquired at one "good" position in the target space can be used to simulate the late reverb at other positions in the same space, appropriate segments that can serve as reverb tails are extracted from the measured MMRIR. The rendering system first constructs an early reflection model from the positions of the listener-source pair and the derived filters, then combines it with the late reverb segments to form a complete listener-source-room acoustical model that can be used to synthesize high-quality multichannel audio for arbitrary listener-source positions. Another merit of the proposed framework is its scalability: at the expense of slightly degraded rendering quality, the computational complexity can be greatly reduced. This makes the framework suitable for a wide range of applications with different quality and complexity requirements.
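The rendering pipeline described above (early reflections computed from positions, plus a reusable late reverb tail, followed by convolution) can be illustrated with a minimal single-channel sketch. It is not the thesis implementation. It assumes a rectangular "shoebox" room, first-order image sources only, and a single scalar `wall_gain` standing in for the measured air and wall filters. The function names, the `tail_start` parameter, and the speed-of-sound constant are all hypothetical choices made for illustration, and the sketch does not crossfade between the early and late parts.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the thesis


def early_reflections(source, listener, room, sample_rate, wall_gain=0.7):
    """Return (delay_in_samples, amplitude) taps for the direct path and the
    six first-order reflections of a shoebox room (illustrative assumption)."""
    images = [(tuple(source), 1.0)]  # direct path, no wall attenuation
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # mirror source across wall
            images.append((tuple(image), wall_gain))
    taps = []
    for position, gain in images:
        distance = math.dist(position, listener)
        delay = round(distance / SPEED_OF_SOUND * sample_rate)
        taps.append((delay, gain / max(distance, 1e-3)))  # 1/r spreading loss
    return taps


def build_impulse_response(taps, late_tail, tail_start):
    """Place the early taps, then add the late reverb tail at tail_start."""
    length = max(max(d for d, _ in taps) + 1, tail_start + len(late_tail))
    ir = [0.0] * length
    for delay, amplitude in taps:
        ir[delay] += amplitude
    for i, value in enumerate(late_tail):
        ir[tail_start + i] += value
    return ir


def convolve(signal, ir):
    """Direct-form convolution of a dry signal with the impulse response."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for n, x in enumerate(signal):
        if x == 0.0:
            continue
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out


# Hypothetical usage: a 4 m x 3 m x 2.5 m room and a toy two-sample tail.
taps = early_reflections((1.0, 1.0, 1.0), (2.0, 1.0, 1.0),
                         (4.0, 3.0, 2.5), sample_rate=1000)
ir = build_impulse_response(taps, late_tail=[0.1, 0.05], tail_start=50)
wet = convolve([1.0, 0.0, 0.5], ir)
```

Moving the listener or source only requires recomputing the early taps; the late tail is reused unchanged, which mirrors the assumption stated above that one measured position can supply the late reverb for the whole space. A practical renderer would replace the scalar gain with per-wall filters and use FFT-based convolution per output channel.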
The proposed framework has been evaluated by formal listening tests. These tests demonstrate its effectiveness in preserving spatial quality while positioning the listener-source pair accurately, and they support the key assumptions made by the proposed system.
Identifier | oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/2961 |
Date | 24 August 2010 |
Creators | Li, Yan |
Contributors | Driessen, Peter F., Tzanetakis, George |
Source Sets | University of Victoria |
Language | English |
Detected Language | English |
Type | Thesis |
Rights | Available to the World Wide Web |