1 |
A framework for low bit-rate speech coding in noisy environment
Krishnan, Venkatesh (21 April 2005)
State-of-the-art model-based coders offer perceptually acceptable reconstructed speech quality at bit rates as low as 2000 bits per second. However, the performance of these coders deteriorates rapidly below this rate, primarily because very few bits are available to encode the model parameters with high fidelity. This thesis aims to meet the challenge of designing speech coders that operate at lower bit rates while reconstructing the speech at the receiver at the same or even better quality than state-of-the-art low bit-rate speech coders. In one of the contributions, we develop a set of techniques for efficient coding of the parameters obtained by the MELP algorithm, under the assumption that the classification of the frames of the MELP coder is available. Also, a simple procedure called dynamic codebook reordering is presented for use in the encoders and decoders of a vector quantization system; it exploits the correlation between vectors of parameters obtained from consecutive speech frames without introducing any delay, distortion or suboptimality. The potential of this technique for significantly reducing the bit rates of speech coders is illustrated. The thesis also addresses the design of such very low bit-rate speech coders so that they are robust to environmental noise. To impart robustness, a speech enhancement framework employing Kalman filters is presented. Kalman filters designed for speech enhancement in the presence of noise assume an autoregressive model for the speech signal. We improve the performance of Kalman filters in speech enhancement by constraining the parameters of the autoregressive model to belong to a codebook trained on clean speech. We then extend this formulation to the design of a novel framework, called the multiple input Kalman filter, that optimally combines the outputs from several speech enhancement systems.
Since the low bit-rate speech coders compress the parameters significantly, it is very important to protect the transmitted information from errors in the communication channel. In this thesis, a novel channel-optimized multi-stage vector quantization codec is presented, in which the stage codebooks are jointly designed.
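As an illustration of the dynamic codebook reordering idea named in the abstract above, the following minimal Python sketch shows one plausible reading of it, not the thesis's actual implementation. After each frame, encoder and decoder both re-sort the codebook by distance to the codevector just chosen, so that correlated consecutive frames tend to map to small indices. The function names, the squared-error distance and the toy codebook are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vector):
    """Index of the codevector closest to `vector` (ordinary VQ search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], vector))


def reorder(codebook, chosen):
    """Sort codevectors by distance to the one just chosen (it moves to index 0).

    sorted() is stable and deterministic, so encoder and decoder, which start
    from the same codebook, always end up with the same ordering.
    """
    return sorted(codebook, key=lambda c: sq_dist(c, chosen))


def encode(frames, codebook):
    """Quantise each frame and emit its index in the current ordering."""
    book = list(codebook)
    indices = []
    for vector in frames:
        i = nearest(book, vector)
        indices.append(i)
        book = reorder(book, book[i])
    return indices


def decode(indices, codebook):
    """Mirror the encoder: look up each index, then apply the same reordering."""
    book = list(codebook)
    decoded = []
    for i in indices:
        chosen = book[i]
        decoded.append(chosen)
        book = reorder(book, chosen)
    return decoded


if __name__ == "__main__":
    # Hypothetical 2-D codebook and slowly varying frames, for illustration only.
    toy_codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    toy_frames = [(2.1, 1.9), (2.9, 3.2), (3.1, 2.8), (1.9, 2.2)]
    sent = encode(toy_frames, toy_codebook)
    print(sent, decode(sent, toy_codebook))
```

Because both ends apply the same deterministic reordering, the decoded vectors are identical to those of plain nearest-neighbour quantisation with a fixed codebook, and no look-ahead is needed. Any bit-rate saving would come from entropy coding the skewed index stream, a step that is assumed here and not shown.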
|
2 |
The Research of Very Low Bit-Rate and Scalable Video Compression Using Cubic-Spline Interpolation
Wang, Chih-Cheng (18 June 2001)
This thesis applies the one-dimensional (1-D) and two-dimensional (2-D) cubic-spline interpolation (CSI) schemes to the MPEG standard for very low bit-rate video coding. In addition, the CSI scheme is used to implement a scalable video compression scheme.
The CSI scheme is based on the least-squares method with a cubic convolution function. It has been shown that the CSI scheme yields a very accurate smoothing algorithm and a better reconstructed image quality than linear interpolation, linear-spline interpolation, cubic convolution interpolation, and cubic B-spline interpolation.
To obtain very low bit-rate video, the CSI scheme is used along with the MPEG-1 standard for video coding. Computer simulations show that this modified MPEG not only avoids the blocking effect caused by MPEG at high compression ratios but also yields a very low bit-rate video coding scheme that still maintains reasonable video quality. Finally, the CSI scheme is also used to achieve scalable video compression. This new scalable scheme allows the data rate to be changed dynamically by the CSI scheme, which is very useful when operating over communication networks with different transmission capacities.
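The abstract above builds CSI on a cubic convolution function. As a hedged sketch of that building block only, the Python code below implements plain 1-D cubic convolution interpolation with the widely used Keys kernel. The parameter value a = -0.5, the edge-replication rule and the function names are assumptions, and the least-squares fitting step that distinguishes the thesis's CSI scheme is not shown.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is a common (assumed) choice."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
    if x < 2.0:
        return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
    return 0.0


def interpolate(samples, t):
    """Estimate the signal at fractional position t from its four neighbours.

    Positions outside the signal reuse the nearest edge sample.
    """
    last = len(samples) - 1
    base = math.floor(t)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), last)
        total += samples[idx] * cubic_kernel(t - k)
    return total


def upsample(samples, factor):
    """Insert factor - 1 interpolated values between consecutive samples."""
    count = (len(samples) - 1) * factor + 1
    return [interpolate(samples, i / factor) for i in range(count)]


if __name__ == "__main__":
    # Hypothetical decimated row of pixel intensities.
    row = [10.0, 20.0, 40.0, 30.0]
    print(upsample(row, 2))
```

The kernel equals 1 at zero and 0 at the other integers, so the original samples are reproduced exactly and only the in-between values are estimated. In a decimate-then-interpolate coder, this is the step that would restore the full resolution at the decoder.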
|
3 |
[en] CONTRIBUTIONS TO IMPROVING CELP CODING AT LOW BIT RATES / [pt] CONTRIBUIÇÕES PARA A MELHORIA DA CODIFICAÇÃO CELP A BAIXAS TAXAS DE BITS
LUCIO MARTINS DA SILVA (24 May 2006)
This work presents new improvements to CELP speech coding at low bit rates. First, a CELP algorithm is proposed in which the complexity of the adaptive codebook search is greatly reduced by means of a modification to the CELP synthesis model. Simulation results show that the proposed algorithm provides speech quality comparable to that obtained with the conventional CELP codec.
The remaining contributions aim to improve the quality of speech coded with the CELP algorithm at low bit rates. One of them is an efficient scheme for coding the LPC spectral envelope of speech. The proposed scheme combines vector quantization and interblock interpolation of the LSF parameters, and it provides a coded spectral envelope of good quality at a bit rate as low as 1 kb/s.
Speech coded with CELP codecs frequently displays distortions in its spectral envelope that are caused by deficiencies in the excitation signal. This thesis proposes a new postfilter that reduces these distortions and thereby significantly improves the subjective quality of the coded speech.
Voiced onsets are crucial for good speech perception but, at low bit rates, the conventional CELP structure is unable to reproduce them with good quality. This work describes a CELP algorithm that gives priority to these critical segments. Each block of speech is classified into one of sixteen voicing patterns, and a distinct coding configuration and bit allocation are applied to each pattern. Simulation results show that the quality of speech coded at 4 kb/s with the proposed CELP algorithm is significantly better than that obtained with a conventional CELP codec also operating at 4 kb/s.
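The abstract above mentions combining vector quantization with interblock interpolation of the LSF parameters. The Python sketch below illustrates only the interpolation half under a simple assumed arrangement: LSF vectors are transmitted for every second block and the skipped blocks are rebuilt by linear interpolation. The transmission pattern, the weight of 0.5 and the function names are hypothetical and not taken from the thesis.

```python
def interpolate_lsf(prev_lsf, next_lsf, weight=0.5):
    """Linear blend of two LSF vectors; weight 0 gives prev, 1 gives next."""
    return [(1.0 - weight) * p + weight * q for p, q in zip(prev_lsf, next_lsf)]


def reconstruct(anchors):
    """Rebuild the per-block LSF track from vectors sent every second block.

    `anchors` holds the transmitted (already dequantised) LSF vectors; each
    skipped block in between is filled with the midpoint of its neighbours.
    """
    track = []
    for prev_lsf, next_lsf in zip(anchors, anchors[1:]):
        track.append(prev_lsf)
        track.append(interpolate_lsf(prev_lsf, next_lsf))
    track.append(anchors[-1])
    return track


if __name__ == "__main__":
    # Hypothetical 2nd-order LSF vectors in radians, for illustration only.
    sent = [[0.2, 0.6], [0.4, 1.0], [0.3, 0.9]]
    for block in reconstruct(sent):
        print(block)
```

A blend of two ascending LSF vectors with a weight between 0 and 1 is itself ascending, so the interpolated blocks keep the ordering property that LSF-based synthesis filters rely on. In this assumed arrangement, roughly half of the spectral-envelope bits are saved at the cost of one block of look-ahead.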
|
4 |
A hybrid scheme for low-bit rate stereo image compression
Jiang, Jianmin; Edirisinghe, E.A. (29 May 2009)
We propose a hybrid scheme to implement an object-driven, block-based algorithm to achieve low bit-rate compression of stereo image pairs. The algorithm combines the simplicity and adaptability of existing block-based stereo image compression techniques with an edge/contour-based object extraction technique to determine an appropriate compression strategy for various areas of the right image. Unlike existing object-based coding such as MPEG-4, developed in the video compression community, the proposed scheme does not require any additional shape coding. Instead, the arbitrary shape is reconstructed from the matching object inside the left frame, which has been encoded by the standard JPEG algorithm and is hence available at the decoding end for those shapes in the right frame. Yet the shape reconstruction for right objects incurs no distortion, owing to the unique correlation between left and right frames inside stereo image pairs and the nature of the proposed hybrid scheme. Extensive experiments show that the proposed algorithm achieves improvements of up to 20% in compression ratio over the existing block-based technique, while the reconstructed image quality is maintained at a competitive level in terms of both PSNR values and visual inspection.
|
5 |
[en] Virtual object animation system: from modelling to MPEG-4 standardization / [fr] Système d'animation d'objets virtuels : De la modélisation à la normalisation MPEG-4
Preda, Marius (01 December 2002)
In the context of the new multimedia and communicating information society, this thesis proposes methodological and technical contributions concerning the representation, animation and transmission of virtual objects.
Existing methods are analysed comparatively, and the performance of current multimedia standards is evaluated in terms of animation realism and transmission bit rate. To overcome the limitations identified, a new framework for modelling and animating virtual characters is proposed. The SMS (Skeleton, Muscle and Skin) model, based on the concept of a mesh deformation controller, is introduced and its mathematical formulation developed. The 3D scene graph and the compressed stream associated with SMS are described. The SMS approach is evaluated in the context of a new service for the televised transmission of a virtual signer intended for the hearing impaired. The SMS model has been promoted into the MPEG-4 standard, version 5.
|
6 |
[en] SPEECH CODING AT AVERAGE RATES BELOW 2 KB/S / [es] CODIFICACIÓN DE VOZ A TASAS MEDIAS ABAJO DE 2 KB/S / [pt] CODIFICAÇÃO DE VOZ A TAXAS MÉDIAS ABAIXO DE 2 KB/S
RODRIGO CAIADO DE LAMARE (21 August 2001)
This dissertation presents algorithms for speech coding at an average bit rate of around 1.2 kb/s. A novel switched-predictive vector quantisation technique that outperforms previously reported schemes is proposed and assessed under noise-free and noisy channels. Efficient detectors for the pitch period and for fricative and stop sounds are examined and adapted to the proposed coder. Low bit-rate excitation methods are investigated in order to reproduce good-quality decoded speech. A mixed multiband excitation approach with three sub-bands is employed to encode voiced frames. For unvoiced frames, modelling and synthesis techniques for fricatives and stops are used. This approach has been shown to provide satisfactory speech quality whilst reducing the bit rate to only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve the decoded speech, post-filtering techniques are analysed and compared on the same platform. To reduce background noise, noise suppression methods are also examined. Finally, the proposed coder is evaluated against the North American standard Mixed Excitation Linear Prediction (MELP) coder through A/B comparison tests. The results show that the proposed system, operating at 1.2 kb/s, slightly outperforms the MELP coder operating at 2.4 kb/s. In tandem connection (transcoding) situations, the proposed algorithm also performs better than the MELP coder.
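The abstract above quotes an average rate of about 1.2 kb/s and a rate of 0.4 kb/s for unvoiced frames. The short Python sketch below shows how such an average arises as a weighted mean over frame classes. Only the 0.4 kb/s figure comes from the abstract; the voiced-frame rate of 1.6 kb/s and the two-thirds/one-third split between frame classes are hypothetical values chosen purely to make the arithmetic concrete.

```python
def average_rate(rates_kbps, fractions):
    """Weighted mean bit rate over frame classes.

    rates_kbps: bit rate used for each frame class, in kb/s.
    fractions:  share of frames falling in each class; must sum to 1.
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("frame-class fractions must sum to 1")
    return sum(rates_kbps[name] * share for name, share in fractions.items())


if __name__ == "__main__":
    # 0.4 kb/s is from the abstract; 1.6 kb/s and the split are hypothetical.
    rates = {"voiced": 1.6, "unvoiced": 0.4}
    shares = {"voiced": 2.0 / 3.0, "unvoiced": 1.0 / 3.0}
    print(average_rate(rates, shares))
```

The same function makes it easy to see how sensitive the average is to the mix of frames: the more unvoiced frames a signal contains, the further the average falls below the voiced-frame rate.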
|