Combined speech and audio coding with bit rate and bandwidth scalability

The past two decades have witnessed a rapid expansion within the telecommunications industry. This growth has been primarily motivated by the proliferation of digital communication systems and services, which have become easily available through wired and wireless systems. Current research trends involve the integration of speech, audio, video and data channels into true multimedia communications over fixed and mobile networks. However, while the available bandwidth in wired terrestrial networks is relatively cheap and expandable, it becomes a limited resource in satellite and cellular radio systems. In order to accommodate an ever-growing number of users while maintaining high quality and low operational costs, it is necessary to maximise spectral efficiency. This has given rise to the development of high rate compression techniques with the ability to adapt to a broad class of input signals and to varying network resources. The research carried out in this thesis has mainly focused on the design of a single algorithm for compressing speech and audio signals sampled at different rates. The algorithms are based on the analysis-by-synthesis linear prediction coding (AbS-LPC) scheme, which has been widely employed in various speech coding standards. However, this bit rate reduction technique is based on the speech production mechanism and as such provides a rigid structure, which presents a major limitation for audio coding. In order to improve the audio quality at low rates and to compensate for the errors incurred by the linear prediction during highly transitional segments, the algorithms employ an efficient pulse excitation structure which represents the short innovation sequences with sparse unit-magnitude pulses. The scheme proposed for the compression of telephone-bandwidth speech and audio signals at 12 kb/s achieves similar quality to the G.728 coder at 16 kb/s and higher audio quality than the GSM-EFR standard at 12.2 kb/s.
Wideband speech and audio coding schemes have been designed using both the full-band approach at bit rates of 17 and 19 kb/s and the split-band technique at a bit rate of 20 kb/s. Their perceptual quality is comparable to that of the G.722 coder operating at 48 kb/s. The subband decomposition technique is also adapted to code speech and audio signals sampled at 32 kHz. The quality of this coder at 28 kb/s is similar to that achieved by the MP3 coder at 32 kb/s. The algorithm also provides bandwidth and bit rate scalability ranging from 12 to 64 kb/s, making it well suited to deployment in rate-adaptive communication systems.
Date January 2001
Creators Farrugia, Maria
Publisher University of Surrey
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
