Factors affecting the quality of linear predictive coding of speech at low bit-rates

This thesis examines the factors that affect the quality and performance of low bit-rate speech coding algorithms based on linear prediction and operating between 4 and 16 kb/s. While coding algorithms at 64 kb/s and 32 kb/s are now accepted CCITT standards, and a similar standard will shortly be adopted at 16 kb/s, speech coding systems operating below these rates are not yet in widespread use, except for one or two specific systems such as GSM. Yet low bit-rate digital speech systems will become an essential part of many of the proposed mobile networks, based on both cellular and satellite technology. Of the several possible candidates for low bit-rate applications, linear predictive coders appear to offer the best combination of quality and efficiency, and many developments based on linear prediction have been reported in the literature over the past twenty years. What is less clear is whether linear predictive coders can be developed further to give better quality at even lower rates. This thesis sets out to examine some of those issues.
The first part of the thesis develops a general theory of speech coding in terms of a hierarchical model of speech communication and identifies a dual function in the redundancies that exist at each layer of the hierarchical structure. The operation of linear predictive coding is described in terms of this model, and it is shown that the limits to performance are determined by the ability of the encoder to transfer communication efficiently from a lower to a higher level in the hierarchy. The thesis then turns to the specific performance of linear prediction analysis on speech signals. It is shown that the performance obtainable with conventional linear prediction analysis is limited by the assumptions on which the theory of linear prediction is based.
A range of sub-classes of linear predictive coder are then compared in terms of the general model, and the analysis procedures in the encoder stage are identified as the key to coder performance. The central part of the thesis examines a range of pitch determination algorithms that may be employed to extract pitch correlations accurately from the speech signal. A number of candidates are identified and compared. An investigation into the robustness of these algorithms to noisy speech is presented and a new, highly robust algorithm is described. Finally, an investigation into robust linear prediction is reported. This falls into two parts: the performance of linear prediction on noisy speech and the performance of linear prediction during voiced speech. A range of methods for improving linear prediction during voiced speech is compared and the recently proposed method of Lee is examined in depth. Results of applying Lee's method to speech coding are given and an improved version of the algorithm is described.

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:280346
Date January 1990
Creators Chilton, Edward
Publisher University of Surrey
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
Source http://epubs.surrey.ac.uk/843568/
