  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Stereo coding for the ITU-T G.719 codec

Jansson, Tomas January 2011
This thesis presents a stereo coding architecture for the ITU-T G.719 fullband mono codec. G.719 is suited to teleconferencing applications and offers competitive audio quality for speech and audio signals encoded at 32, 48 and 64 kbps. The proposed stereo architecture uses parametric stereo coding, in which the spatial properties of the stereo channels are modeled by parameters that are encoded and transmitted to the decoder together with an encoded downmix of the stereo channels. The stereo architecture has been implemented in MATLAB, with external mono coding provided by a floating-point ANSI-C implementation of the ITU-T G.719 codec. Two parametric stereo models have been implemented in a framework operating in the complex-valued Modified Discrete Fourier Transform (MDFT) domain. The first model is based on the inter-channel cues that represent level differences, time differences and coherences between the stereo channels. These cues approximate the corresponding interaural cues that characterize our localization of sound in space. The second model is based on the Karhunen-Loève Transform (KLT) with the associated rotation angles, the inter-channel time differences and the residual scaling parameters. An improved MDFT-domain extraction of the inter-channel time difference between the stereo channels has been used for both stereo models. The extracted stereo parameters have been non-uniformly quantized based on the spatial accuracy and the frequency dependency of the human auditory system. The data rate of the stereo parameters has been estimated at around 4 kbps for each model. As a result, G.719 has been used as a core codec at 44 and 60 kbps in order to subjectively evaluate the performance of the fullband stereo codec at 48 and 64 kbps. In the comparison with G.719 dual mono coding, i.e. independent mono coding of the stereo channels, the evaluation showed higher performance of the proposed stereo models for complex clean and reverberant speech signals. However, no consistent gain from parametric stereo coding was revealed for noisy speech, mixed content and music signals. In addition, the first stereo model consistently showed slightly higher performance than the second model in the subjective evaluation, although the difference was not significant. The results revealed a high potential for parametric stereo coding using the ITU-T G.719 codec. In comparison to the existing stereo codecs 3GPP AMR-WB+ and 3GPP eAAC+, the average performance was better at the same bitrate of 48 kbps.
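The two kinds of parameters named in the abstract (inter-channel level difference and coherence on one side, a KLT rotation angle on the other) can be illustrated with a minimal Python sketch. This is not the thesis implementation: it works on whole time-domain signals rather than per band in the MDFT domain, it ignores inter-channel time differences and quantization, and all function names are hypothetical.

```python
import math


def channel_stats(left, right):
    """Energies of each channel and their cross term (zero-lag)."""
    e_ll = sum(x * x for x in left)
    e_rr = sum(y * y for y in right)
    e_lr = sum(x * y for x, y in zip(left, right))
    return e_ll, e_rr, e_lr


def level_difference_db(left, right):
    """Inter-channel level difference in dB (left relative to right)."""
    e_ll, e_rr, _ = channel_stats(left, right)
    return 10.0 * math.log10(e_ll / e_rr)


def coherence(left, right):
    """Normalized zero-lag cross-correlation between the channels."""
    e_ll, e_rr, e_lr = channel_stats(left, right)
    return e_lr / math.sqrt(e_ll * e_rr)


def klt_rotation_angle(left, right):
    """Angle of the principal axis of the 2x2 channel covariance matrix."""
    e_ll, e_rr, e_lr = channel_stats(left, right)
    return 0.5 * math.atan2(2.0 * e_lr, e_ll - e_rr)


def klt_rotate(left, right, angle):
    """Rotate (left, right) into a principal (downmix-like) and a residual signal."""
    c, s = math.cos(angle), math.sin(angle)
    principal = [c * x + s * y for x, y in zip(left, right)]
    residual = [-s * x + c * y for x, y in zip(left, right)]
    return principal, residual


# Toy input (an assumption for illustration): the left channel is the
# right channel scaled by 2, i.e. a fully coherent, panned source.
right = [math.sin(0.1 * n) for n in range(200)]
left = [2.0 * v for v in right]

ild = level_difference_db(left, right)
coh = coherence(left, right)
angle = klt_rotation_angle(left, right)
principal, residual = klt_rotate(left, right, angle)
```

For this fully coherent toy input the residual signal vanishes, so the source is described by the principal signal and one angle. Real stereo content leaves a non-zero residual, which is presumably where the residual scaling parameters mentioned in the abstract come in.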
2

Approches paramétriques pour le codage audio multicanal

Lapierre, Jimmy January 2007
Abstract: In order to meet our communication and entertainment needs, there is no doubt that speech and audio must be encoded in digital form. In CD quality, this requires a bit rate of 1411.2 kb/s for a stereo signal. Such a large amount of data quickly becomes prohibitive for storing long durations of audio or for transmission over some networks, especially in real time (hence the universal adoption of the MP3 format). Moreover, in recent years the number of musical and cinematographic productions available in five channels or more has continually increased. In order to keep the bit rate at an acceptable level for a given application, it is natural for a low bit-rate audio coder to exploit the redundancy between audio channels and binaural psychoacoustics. Perceptual audio coding, and more specifically parametric audio coding, makes it possible to reach much lower bit rates by exploiting the limits of human hearing (studied in psychoacoustics). This research therefore concentrates on low bit-rate parametric coding of more than one audio channel.
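The CD-quality figure quoted in the abstract follows from the standard CD audio parameters (44.1 kHz sampling, 16 bits per sample, two channels). A short Python sketch of that arithmetic, with a hypothetical helper name, is:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


# CD audio: 44.1 kHz sampling, 16-bit samples, stereo.
cd_stereo_kbps = pcm_bitrate_kbps(44_100, 16, 2)

# Illustrative assumption: six channels (e.g. a 5.1 layout) at the
# same sampling rate and word length, to show how the rate scales.
six_channel_kbps = pcm_bitrate_kbps(44_100, 16, 6)
```

The uncompressed rate grows linearly with the channel count, which is the motivation the abstract gives for exploiting inter-channel redundancy.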
