Global ETD Search

11	Advanced linear predictive speech compression at 3.0 kbits/sec and below Atkinson, Ian Andrew January 1997 (has links) No description available. 621.3822
12	Multi-band excitation based vocoders and their real-time implementation Ma, Wei January 1994 (has links) No description available. 621.3822 Speech coding; Digital signal processing
13	16Kb/s APC and 9.6Kb/s RELP for satellite mobile systems Zarkadis, D. J. January 1987 (has links) No description available. 621.3822 Digital speech coding; Radio systems
14	Novel Pitch Detection Algorithm With Application to Speech Coding Kura, Vijay 19 December 2003 (has links) This thesis introduces a novel method for accurate pitch detection and speech segmentation, named Multi-feature, Autocorrelation (ACR) and Wavelet Technique (MAWT). MAWT uses feature extraction, and ACR applied on Linear Predictive Coding (LPC) residuals, with a wavelet-based refinement step. MAWT opens the way for a unique approach to modeling: although speech is divided into segments, the success of voicing decisions is not crucial. Experiments demonstrate the superiority of MAWT in pitch period detection accuracy over existing methods, and illustrate its advantages for speech segmentation. These advantages are more pronounced for gain-varying and transitional speech, and under noisy conditions. fundamental frequency speech coding wavelets and linear predictive coding
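The autocorrelation step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the MAWT method: it applies plain autocorrelation to the raw signal, with no LPC residual, feature extraction, or wavelet refinement. The function name, the 50 to 400 Hz search range, and the synthetic 100 Hz test tone are all assumptions chosen for illustration.

```python
import math


def autocorrelation_pitch(samples, sample_rate, min_hz=50.0, max_hz=400.0):
    """Estimate pitch (Hz) as the lag with the largest autocorrelation value."""
    min_lag = int(sample_rate / max_hz)  # shortest period considered
    max_lag = int(sample_rate / min_hz)  # longest period considered

    def acr(lag):
        # Unnormalised autocorrelation of the frame at the given lag.
        return sum(samples[n] * samples[n + lag]
                   for n in range(len(samples) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=acr)
    return sample_rate / best_lag


# Hypothetical test signal: a pure 100 Hz tone sampled at 8 kHz (50 ms frame).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100.0 * n / sample_rate) for n in range(400)]
estimate = autocorrelation_pitch(tone, sample_rate)
```

On real speech the peak is far less clean than on a pure tone, which is one reason a method might work on the LPC residual and add a refinement stage, as the abstract describes.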
15	A framework for low bit-rate speech coding in noisy environment Krishnan, Venkatesh 21 April 2005 (has links) State-of-the-art model-based coders offer perceptually acceptable reconstructed speech quality at bit-rates as low as 2000 bits per second. However, the performance of these coders deteriorates rapidly below this rate, primarily because very few bits are available to encode the model parameters with high fidelity. This thesis aims to meet the challenge of designing speech coders that operate at lower bit-rates while reconstructing the speech at the receiver at the same or even better quality than state-of-the-art low bit-rate speech coders. In one of the contributions, we develop a range of techniques for efficient coding of the parameters obtained by the MELP algorithm, under the assumption that the classification of the frames of the MELP coder is available. In addition, a simple and elegant procedure called dynamic codebook reordering is presented for use in the encoders and decoders of a vector quantization system. It effectively exploits the correlation between vectors of parameters obtained from consecutive speech frames without introducing any delay, distortion or suboptimality. The potential of this technique for significantly reducing the bit-rates of speech coders is illustrated. The thesis also addresses the design of such very low bit-rate speech coders so that they are robust to environmental noise. To impart robustness, a speech enhancement framework employing Kalman filters is presented. Kalman filters designed for speech enhancement in the presence of noise assume an autoregressive model for the speech signal. We improve the performance of Kalman filters in speech enhancement by constraining the parameters of the autoregressive model to belong to a codebook trained on clean speech.
We then extend this formulation to the design of a novel framework, called the multiple input Kalman filter, that optimally combines the outputs from several speech enhancement systems. Since the low bit-rate speech coders compress the parameters significantly, it is very important to protect the transmitted information from errors in the communication channel. In this thesis, a novel channel-optimized multi-stage vector quantization codec is presented, in which the stage codebooks are jointly designed. Noise robustness Low bit-rate Digital signal processing Speech coding
16	[en] ANALYSIS OF WAVEFORM CODERS FOR SPEECH AND DATA SIGNALS / [pt] ANÁLISE DE CODIFICADORES DE FORMA DE ONDA PARA SINAIS DE VOZ E DADOS ANTONIO MARCOS DE LIMA ARAUJO 07 November 2006 (has links) [en] This thesis evaluates the performance of waveform coders at 32, 56 and 64 kbit/s for digital transmission of speech signals and of 4800 bit/s PSK-8 and 9600 bit/s QAM-16 voiceband data signals. A detailed analysis of the systems is carried out under both ideal and noisy channel conditions. The analysis shows that a scheme that accurately distinguishes the two classes of signals would allow more efficient encoding. A method for statistical identification of speech and data signals is proposed, and its use in waveform coders is then analysed. Incorporating this method into the 32 kbit/s ADPCM system recommended by CCITT improves performance for data signals without sacrificing its efficiency for speech signals. [en] SPEECH CODING [en] DIGITAL TRANSMISSION
17	[en] STUDY ON SPEECH CODING IN SUB-BANDS AT 16 KBITS/S / [pt] ESTUDO DE CODIFICAÇÃO DE VOZ EM SUB-BANDAS A 16 KBIT/S CARLOS FELIPE DE BRITO JACCOUD 09 November 2006 (has links) [en] This thesis presents a study of 16 kbit/s sub-band coding of speech signals. The encoding systems are examined in terms of the number of sub-bands, the schemes used to encode the sub-band signals, the parameters and techniques employed in adapting the quantizers, and the type of bit allocation. Spectral decomposition of the input signal is carried out by means of quadrature mirror filter (QMF) banks, which have the advantage of avoiding overlap of the spectra in the various sub-bands. The systems are evaluated through computer simulation, in both ideal and noisy channels, using the frequency-weighted signal-to-noise ratio as the performance criterion. The global and segmental signal-to-noise ratio performances are also given in all cases. After a detailed examination of the various systems, a coding structure is proposed in which the configuration of the quantizer and its adaptation technique depend on the sub-band to be coded. Furthermore, the proposed system uses an algorithm for bit allocation based on comparing the energies of blocks of samples in the several sub-bands. [en] SPEECH CODING [en] DIGITAL TRANSMISSION
18	Exploiting spatial and temporal redundancies for vector quantization of speech and images Meh Chu, Chu 07 January 2016 (has links) The objective of the proposed research is to compress data such as speech, audio, and images using a new re-ordering vector quantization approach that exploits the transition probability between consecutive code vectors in a signal. Vector quantization is the process of encoding blocks of samples from a data sequence by replacing every input vector with a vector from a dictionary of reproduction vectors. Shannon’s rate-distortion theory states that signals encoded as blocks of samples have better rate-distortion performance than signals encoded sample by sample. As such, vector quantization achieves a lower coding rate for a given distortion than scalar quantization for any given signal. Standard vector quantization, however, does not take advantage of the inter-vector correlation between successive input vectors in data sequences. It has been demonstrated that real signals have significant inter-vector correlation. This correlation has led to vector quantization approaches that encode input vectors based on previously encoded vectors. Some methods have been proposed in the literature to exploit the dependence between successive code vectors. Predictive vector quantization, dynamic codebook re-ordering, and finite-state vector quantization are examples of vector quantization schemes that use inter-vector correlation. Predictive vector quantization and finite-state vector quantization predict the reproduction vector for a given input vector by using past input vectors. Dynamic codebook re-ordering vector quantization has the same reproduction vectors as standard vector quantization. The dynamic codebook re-ordering algorithm is based on the concept of re-ordering indices, whereby existing reproduction vectors are assigned new channel indices according to a structure that arranges the reproduction vectors in order of increasing dissimilarity.
Hence, an input vector encoded with the standard vector quantization method is transmitted through a channel with new indices such that 0 is assigned to the reproduction vector closest to the past reproduction vector. Larger index values are assigned to reproduction vectors that have larger distances from the previous reproduction vector. Dynamic codebook re-ordering assumes that the reproduction vectors of two successive vectors of real signals are typically close to each other according to a distance metric. Sometimes, however, two successively encoded vectors may be relatively far from each other. Our likelihood codebook re-ordering vector quantization algorithm exploits the structure within a signal through the non-uniformity of the reproduction vector transition probability in a data sequence. Input vectors that have a higher probability of transition from prior reproduction vectors are assigned smaller indices. The code vectors that are more likely to follow a given vector are assigned indices closer to 0, while the less likely ones are assigned higher indices. This re-ordering gives the reproduction dictionary a structure suitable for entropy coding such as Huffman and arithmetic coding. Since such transitions are common in real signals, it is expected that our proposed algorithm, when combined with entropy coding algorithms such as binary arithmetic and Huffman coding, will result in lower bit rates for the same distortion than a standard vector quantization algorithm. The re-ordering vector quantization approach on quantized indices can be useful in speech, image, and audio transmission. By applying our re-ordering approach to these data types, we expect to achieve lower coding rates for a given distortion or perceptual quality. This reduced coding rate makes our proposed algorithm useful for transmission and storage of larger image and speech streams over their respective communication channels.
The use of truncation in the likelihood codebook re-ordering scheme results in much lower coding rates without significantly distorting the perceptual quality of the signals. Text and other multimedia signals may also benefit from this additional layer of likelihood re-ordering compression. Vector quantization Speech coding Image coding Source coding Codebook reordering Signal processing
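The distance-based dynamic codebook re-ordering described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the squared Euclidean distance, the tie-break by original index, the starting index of 0, and the four-entry codebook are all hypothetical choices. The likelihood-based variant would replace the distance ordering with an ordering by estimated transition probability.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(codebook, vector):
    """Standard VQ step: index of the closest reproduction vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], vector))


def reorder(codebook, previous_index):
    """Codebook indices sorted by increasing distance from the previous
    reproduction vector (ties broken by original index)."""
    prev = codebook[previous_index]
    return sorted(range(len(codebook)),
                  key=lambda i: (squared_distance(codebook[i], prev), i))


def encode(codebook, vectors, start_index=0):
    """Return channel indices: the rank of each chosen vector in the
    ordering induced by the previously chosen vector."""
    previous, channel = start_index, []
    for v in vectors:
        chosen = nearest_index(codebook, v)
        channel.append(reorder(codebook, previous).index(chosen))
        previous = chosen
    return channel


def decode(codebook, channel, start_index=0):
    """Invert encode(): recover the original codebook indices."""
    previous, indices = start_index, []
    for rank in channel:
        previous = reorder(codebook, previous)[rank]
        indices.append(previous)
    return indices


# Hypothetical 2-D codebook and a slowly varying input sequence.
codebook = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
inputs = [(0.1, 0.0), (0.9, 0.1), (5.2, 4.9), (5.9, 5.1)]
channel = encode(codebook, inputs)    # [0, 1, 2, 1]
restored = decode(codebook, channel)  # [0, 1, 2, 3]
```

The reproduction vectors are unchanged and the decoder recovers the same indices, so no distortion or delay is added. The only effect is that the channel indices tend to cluster near 0 for correlated inputs, which is what makes them attractive for a subsequent entropy coder.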
19	Voice Codec for Floating Point Processor Ross, Johan, Engström, Hans January 2008 (has links) As part of an ongoing project at the department of electrical engineering, ISY, at Linköping University, a voice decoder using floating point formats has been the focus of this master's thesis. Previous work developed an mp3 decoder using the floating point formats. Everything is expected to be implemented on a single DSP. The ever-present desire to make things smaller, more efficient and less power-consuming is the main reason for this master's thesis on using a floating point format instead of the traditional integer format in a GSM codec. The idea behind the low precision floating point format is to reduce the size of the memory. This in turn reduces the total chip area needed and also decreases the power consumption. One main question is whether this can be done with the floating point format without losing too much sound quality in the speech. With the integer format, one can represent every value in a range that depends on how many bits are used. With a floating point format, one can represent larger values using fewer bits than with the integer format, but some values can no longer be represented and have to be rounded. The tests made with the decoder during this thesis show that the audible difference between the two formats is very small and can hardly be heard, if at all. The rounding seems to have very little effect on the quality of the sound, and the implementation of the codec has succeeded in reproducing sound quality similar to that of the GSM standard decoder. Voice codec floating point GSM decoder low precision codec speech coding TECHNOLOGY TEKNIKVETENSKAP
20	Voice Codec for Floating Point Processor Ross, Johan, Engström, Hans January 2008 (has links) As part of an ongoing project at the department of electrical engineering, ISY, at Linköping University, a voice decoder using floating point formats has been the focus of this master's thesis. Previous work developed an mp3 decoder using the floating point formats. Everything is expected to be implemented on a single DSP. The ever-present desire to make things smaller, more efficient and less power-consuming is the main reason for this master's thesis on using a floating point format instead of the traditional integer format in a GSM codec. The idea behind the low precision floating point format is to reduce the size of the memory. This in turn reduces the total chip area needed and also decreases the power consumption. One main question is whether this can be done with the floating point format without losing too much sound quality in the speech. With the integer format, one can represent every value in a range that depends on how many bits are used. With a floating point format, one can represent larger values using fewer bits than with the integer format, but some values can no longer be represented and have to be rounded. The tests made with the decoder during this thesis show that the audible difference between the two formats is very small and can hardly be heard, if at all. The rounding seems to have very little effect on the quality of the sound, and the implementation of the codec has succeeded in reproducing sound quality similar to that of the GSM standard decoder. Voice codec floating point GSM decoder low precision codec speech coding TECHNOLOGY TEKNIKVETENSKAP
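The trade-off described in the abstract above (a wider range with fewer bits, at the cost of rounding) can be illustrated with a small sketch. The format here is hypothetical and is not the one used in the thesis: it keeps only a chosen number of significant bits and, for simplicity, does not limit the exponent range. The function name and the 4-bit setting are assumptions.

```python
import math


def round_to_significant_bits(value, bits):
    """Round value to the nearest number with `bits` significant binary digits.

    Exponent range is unlimited in this sketch; a real low-precision format
    would also clamp the exponent. Python's round() rounds ties to even.
    """
    if value == 0.0:
        return 0.0
    fraction, exponent = math.frexp(value)  # value == fraction * 2**exponent
    scale = 2 ** bits
    return math.ldexp(round(fraction * scale) / scale, exponent)


# With 4 significant bits, small integers survive but others are rounded.
exact = round_to_significant_bits(3.0, 4)         # 3.0
rounded = round_to_significant_bits(37.0, 4)      # 36.0
large = round_to_significant_bits(100000.0, 4)    # 98304.0
```

The last value lies outside the range of a 16-bit signed integer yet is still representable, with a rounding error of under 2 percent. Whether such errors are audible is the question the thesis answers experimentally.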
