  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A framework for low bit-rate speech coding in noisy environment

Krishnan, Venkatesh 21 April 2005
State-of-the-art model-based coders offer perceptually acceptable reconstructed speech quality at bit rates as low as 2000 bits per second. However, the performance of these coders deteriorates rapidly below this rate, primarily because very few bits are available to encode the model parameters with high fidelity. This thesis aims to meet the challenge of designing speech coders that operate at lower bit rates while reconstructing the speech at the receiver at the same or even better quality than state-of-the-art low bit-rate speech coders. In one of the contributions, we develop a range of techniques for efficient coding of the parameters obtained by the MELP algorithm, under the assumption that the classification of the frames of the MELP coder is available. In addition, a simple procedure called dynamic codebook reordering is presented for use in the encoders and decoders of a vector quantization system; it exploits the correlation between vectors of parameters obtained from consecutive speech frames without introducing any delay, distortion, or suboptimality. The potential of this technique to significantly reduce the bit rates of speech coders is illustrated. The thesis also addresses the design of such very low bit-rate speech coders so that they are robust to environmental noise. To impart robustness, a speech enhancement framework employing Kalman filters is presented. Kalman filters designed for speech enhancement in the presence of noise assume an autoregressive model for the speech signal. We improve the performance of Kalman filters in speech enhancement by constraining the parameters of the autoregressive model to belong to a codebook trained on clean speech. We then extend this formulation to the design of a novel framework, called the multiple input Kalman filter, that optimally combines the outputs from several speech enhancement systems.
Since low bit-rate speech coders compress the parameters significantly, it is very important to protect the transmitted information from errors in the communication channel. In this thesis, a novel channel-optimized multi-stage vector quantization codec is presented, in which the stage codebooks are jointly designed.
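The abstract does not spell out how dynamic codebook reordering works, so the following is only a minimal sketch of one way such a scheme could behave, under a loudly stated assumption: after each frame, encoder and decoder both re-sort the codebook by distance to the codevector just selected, so that correlated consecutive frames tend to produce small indices that an entropy coder could represent cheaply. The function names (`nearest_index`, `reorder`, `encode`, `decode`) and the toy codebook are hypothetical and are not taken from the thesis.

```python
import math


def nearest_index(codebook, vector):
    """Return the position of the codevector closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], vector))


def reorder(codebook, anchor):
    """Sort codevectors by distance to `anchor`; the stable sort keeps ties deterministic."""
    return sorted(codebook, key=lambda c: math.dist(c, anchor))


def encode(frames, codebook):
    """Quantize each frame, emitting its index in the *current* ordering of the codebook."""
    book = list(codebook)
    indices = []
    for frame in frames:
        i = nearest_index(book, frame)
        indices.append(i)
        # Assumed rule: reorder around the codevector that was just chosen.
        book = reorder(book, book[i])
    return indices


def decode(indices, codebook):
    """Mirror the encoder's reordering so each index maps to the same codevector."""
    book = list(codebook)
    decoded = []
    for i in indices:
        chosen = book[i]
        decoded.append(chosen)
        book = reorder(book, chosen)
    return decoded


# Toy data (hypothetical): three correlated frames near the same region of the codebook.
codebook = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
frames = [(5.1, 5.0), (5.0, 5.9), (4.9, 6.1)]
indices = encode(frames, codebook)
reconstructed = decode(indices, codebook)
```

Because the decoder applies the same deterministic reordering using only information it has already received, this sketch adds no delay and selects the same codevectors as a fixed-order quantizer; only the index statistics change.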
2

The Research of Very Low Bit-Rate and Scalable Video Compression Using Cubic-Spline Interpolation

Wang, Chih-Cheng 18 June 2001
This thesis applies the one-dimensional (1-D) and two-dimensional (2-D) cubic-spline interpolation (CSI) schemes to the MPEG standard for very low bit-rate video coding. In addition, the CSI scheme is used to implement a scalable video compression scheme. The CSI scheme is based on the least-squares method with a cubic convolution function. It has been shown that the CSI scheme yields a very accurate algorithm for smoothing and obtains a better quality of reconstructed image than linear interpolation, linear-spline interpolation, cubic convolution interpolation, and cubic B-spline interpolation. To obtain very low bit-rate video, the CSI scheme is used along with the MPEG-1 standard for video coding. Computer simulations show that this modified MPEG not only avoids the blocking effect caused by MPEG at high compression ratios but also yields a very low bit-rate video coding scheme that still maintains reasonable video quality. Finally, the CSI scheme is also used to achieve scalable video compression. This new scalable video compression scheme allows the data rate to be changed dynamically by the CSI scheme, which is very useful when operating over communication networks with different transmission capacities.
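As a rough illustration of the "cubic convolution function" mentioned above, the sketch below evaluates a 1-D cubic convolution interpolant. It assumes the widely used Keys kernel with parameter a = -0.5, which the abstract does not confirm, and it does not implement the least-squares fitting step that distinguishes the CSI scheme from plain cubic convolution interpolation. The names `keys_kernel` and `interpolate` are hypothetical.

```python
import math


def keys_kernel(x, a=-0.5):
    """Cubic convolution kernel (Keys form); a = -0.5 is an assumed, common choice."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def interpolate(samples, t, a=-0.5):
    """Evaluate the interpolant at real position t (in sample units).

    Uses the four nearest samples; indices are clamped at the signal edges.
    """
    base = math.floor(t)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(samples) - 1)
        total += samples[idx] * keys_kernel(t - k, a)
    return total


# A linear ramp: the interpolant should reproduce it away from the edges.
ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
at_sample = interpolate(ramp, 2.0)
between_samples = interpolate(ramp, 2.5)
```

In a decimate-then-interpolate pipeline of the kind the abstract describes, a kernel like this would be applied at the decoder to restore the original sampling grid; the 2-D case can be handled by applying the 1-D operation along rows and then columns.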
3

[en] CONTRIBUTIONS TO IMPROVING CELP CODING AT LOW BIT RATES / [pt] CONTRIBUIÇÕES PARA A MELHORIA DA CODIFICAÇÃO CELP A BAIXAS TAXAS DE BITS

LUCIO MARTINS DA SILVA 24 May 2006
This thesis proposes new improvements to CELP speech coding at low bit rates. First, a CELP algorithm is proposed in which the complexity of the adaptive codebook search is greatly reduced by means of a modification to the CELP synthesis model. Simulation results show that speech coded with the proposed algorithm has quality comparable to that obtained with the conventional CELP algorithm. The remaining contributions aim to improve the quality of CELP-coded speech at low bit rates. One of them is a more efficient scheme for coding the LPC spectral envelope of speech: it combines vector quantization with interblock interpolation of the LSF parameters and yields a coded spectral envelope of good quality at bit rates as low as 1 kb/s. Speech coded with CELP algorithms frequently exhibits distortions in its spectral envelope that are caused by deficiencies in the excitation signal. This thesis proposes a new postfilter that reduces these distortions and thereby significantly improves the subjective quality of the coded speech. At low bit rates, the conventional CELP structure cannot reproduce voiced onsets with good quality, although they are crucial to speech perception. The thesis describes a CELP algorithm that gives priority to these critical segments: each block of speech is classified into one of sixteen voicing patterns, and each pattern has its own coding configuration and bit allocation. Simulation results show that speech coded at 4 kb/s with the proposed CELP algorithm is significantly better in quality than speech coded with a conventional CELP coder also operating at 4 kb/s.
4

A hybrid scheme for low-bit rate stereo image compression

Jiang, Jianmin, Edirisinghe, E.A. 29 May 2009
We propose a hybrid scheme to implement an object-driven, block-based algorithm that achieves low bit-rate compression of stereo image pairs. The algorithm combines the simplicity and adaptability of existing block-based stereo image compression techniques with an edge/contour-based object extraction technique to determine an appropriate compression strategy for various areas of the right image. Unlike existing object-based coding such as MPEG-4, developed in the video compression community, the proposed scheme does not require any additional shape coding. Instead, each arbitrary shape in the right frame is reconstructed from the matching object in the left frame, which has been encoded by the standard JPEG algorithm and is therefore available at the decoding end. The shape reconstruction for right-frame objects incurs no distortion, owing to the unique correlation between left and right frames in stereo image pairs and to the nature of the proposed hybrid scheme. Extensive experiments show that the proposed algorithm improves compression ratios by up to 20% compared with the existing block-based technique, while the reconstructed image quality is maintained at a competitive level in terms of both PSNR values and visual inspection.
5

Système d'animation d'objets virtuels : De la modélisation à la normalisation MPEG-4

Preda, Marius 01 December 2002
In the context of the new multimedia and communicating information society, this thesis proposes methodological and technical contributions concerning the representation, animation, and transmission of virtual objects.

Existing methods are analyzed comparatively, and the performance of current multimedia standards is evaluated in terms of animation realism and transmission bit rate. To overcome the limitations identified, a new framework for modeling and animating virtual characters is proposed. The SMS (Skeleton, Muscle and Skin) model, based on the concept of a mesh deformation controller, is introduced and its mathematical formulation developed. The 3D scene graph and the compression stream associated with SMS are described. The SMS approach is evaluated in the context of a new television broadcast service featuring a virtual signer for the hearing impaired. The SMS model has been adopted into the MPEG-4 standard, version 5.
6

[en] SPEECH CODING AT AVERAGE RATES BELOW 2KB/S / [es] CODIFICACIÓN DE VOZ A TASAS MEDIAS ABAJO DE 2 KB/S / [pt] CODIFICAÇÃO DE VOZ A TAXAS MÉDIAS ABAIXO DE 2 KB/S

RODRIGO CAIADO DE LAMARE 21 August 2001
This dissertation proposes algorithms for speech coding at average bit rates of around 1.2 kb/s. A novel switched-predictive vector quantisation scheme that outperforms previously reported schemes is proposed and assessed over noise-free and noisy channels. Efficient detectors for the pitch period and for fricative and stop sounds are examined and adapted to the proposed coder. Low bit-rate excitation methods are investigated in order to reproduce good-quality decoded speech. A mixed multiband excitation model with three sub-bands is adopted to encode voiced frames. For unvoiced frames, modelling and synthesis techniques for fricatives and stops are used; they provide satisfactory speech quality whilst reducing the bit rate of these frames to only 0.4 kb/s. Post-filtering techniques for reducing coding noise and improving the reconstructed speech are analysed and compared on the same platform. Noise suppression methods for reducing background noise are also examined. Finally, the proposed coder is evaluated against the North American Mixed Excitation Linear Prediction (MELP) standard through A/B comparison tests. The tests indicate that the proposed system, operating at 1.2 kb/s, delivers slightly better speech quality than MELP operating at 2.4 kb/s. In tandem (transcoding) situations, the proposed coder also outperforms MELP.
