  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

On the Enhancement of Audio and Video in Mobile Equipment

Rossholm, Andreas January 2006
Use of mobile equipment has increased exponentially over the last decade. As use becomes more widespread, so too does the demand for new functionality. The limited memory and computational power of many mobile devices have proven to be a challenge, resulting in many innovative solutions and a number of new standards. Despite this, additional enhancement is often required to improve quality. This thesis addresses enhancement in two areas: audio (speech) encoding and video encoding/decoding. The audio section addresses the well-known problem in the GSM system of an interfering signal generated by the switching nature of TDMA cellular telephony. Two solutions are given to suppress this interference internally in the mobile handset. The first uses subtractive noise cancellation employing correlators; the second uses a structure of IIR notch filters. Both solutions use control algorithms based on the state of the communication between the mobile handset and the base station. The video section presents two post-filters and one pre-filter. The two post-filters are designed to improve the visual quality of highly compressed video streams from standard block-based video codecs by combating both blocking and ringing artifacts. The second post-filter also performs sharpening. The pre-filter is designed to increase the coding efficiency of a standard block-based video codec. By introducing a pre-processing algorithm before the encoder, the amount of camera disturbance and the complexity of the sequence can be decreased, thereby increasing coding efficiency.
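The notch-filter idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's design: the abstract gives no filter parameters and its state-based control logic is not modelled here. The sketch assumes the interference sits at the GSM TDMA frame rate (a 4.615 ms frame, roughly 217 Hz) and its first few harmonics, and it uses a standard second-order (biquad) notch. The function names, the Q value and the number of harmonics are illustrative choices.

```python
import math


def notch_coefficients(f0, fs, q):
    """Return (b, a) for a biquad notch centred on f0 Hz, normalised so a[0] == 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Filter a sequence with one second-order IIR section (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def suppress_buzz(samples, fs, fundamental=217.0, harmonics=3, q=30.0):
    """Cascade one notch per harmonic of the assumed interference fundamental."""
    for k in range(1, harmonics + 1):
        b, a = notch_coefficients(k * fundamental, fs, q)
        samples = biquad(samples, b, a)
    return samples


def rms(samples):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    fs = 8000
    buzz = [math.sin(2.0 * math.pi * 217.0 * n / fs) for n in range(fs)]
    cleaned = suppress_buzz(buzz, fs)
    # Compare levels after the filter transient has died away.
    print(rms(buzz[fs // 2:]), rms(cleaned[fs // 2:]))
```

A higher Q narrows each notch, which removes less of the surrounding speech but makes the filter slower to settle.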
2

Real-Time Adaptive Audio Mixing System Using Inter-Spectral Dependencies

Koria, Robert January 2016
Mixing tracks for a live stage performance or studio session is both time-consuming and, when professionals are hired, expensive. It is also difficult for individuals to remain competitive against established companies, since multiple tracks must be properly mixed for each element to stand out; in a poor mix, the listener struggles to distinguish the different elements. The method developed in this thesis aims to facilitate mixing for live performances and studio sessions. The implemented system analyzes the energy spectrum of the tracks included in the mix. By unmasking spectral components, it minimizes the spectral overlap between tracks. The system filters out non-characteristic frequencies while leaving significant frequencies undisturbed. Five tracks from the final mix of a successful radio song were analyzed and used to illustrate and validate the method. The system was implemented in MATLAB with promising results. The processed mix unmasks frequency content, and a number of test listeners perceived it as clearer than the unprocessed mix. The method resembles a multi-band compressor that analyzes the spectral information between tracks. Thus, by using inter-spectral dependencies, the thesis investigates the possibility of controlling amplitudes in time by filtering in the frequency domain. The compression rate in the time domain reflects a trade-off between preserving characteristic frequencies and reducing spectral overlap.
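The unmasking idea described above can be sketched in a few lines. This is a hypothetical illustration, not the thesis's MATLAB system: the abstract does not specify how characteristic frequencies are chosen. The sketch assumes per-band energies have already been measured for every track, treats the most energetic track in a band as the one whose content is characteristic there, and attenuates the other tracks in that band to a fixed gain. The function name `unmasking_gains` and the `floor` parameter are assumptions made for this example.

```python
def unmasking_gains(band_energy, floor=0.25):
    """Compute per-track, per-band gains that reduce spectral overlap.

    band_energy[t][k] is the energy of track t in frequency band k.
    In each band the most energetic track keeps gain 1.0 and every other
    track is attenuated to `floor`. Bands with no energy are left untouched.
    """
    n_tracks = len(band_energy)
    n_bands = len(band_energy[0])
    gains = [[1.0] * n_bands for _ in range(n_tracks)]
    for k in range(n_bands):
        column = [band_energy[t][k] for t in range(n_tracks)]
        if sum(column) <= 0.0:
            continue
        dominant = max(range(n_tracks), key=lambda t: column[t])
        for t in range(n_tracks):
            if t != dominant:
                gains[t][k] = floor
    return gains


if __name__ == "__main__":
    # Two tracks, two bands: track 0 dominates band 0, track 1 dominates band 1.
    energies = [[9.0, 1.0],
                [1.0, 4.0]]
    print(unmasking_gains(energies))  # [[1.0, 0.25], [0.25, 1.0]]
```

The `floor` value expresses the trade-off named at the end of the abstract: a lower floor removes more overlap but also removes more of each track's own content.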
3

Amélioration de codecs audio standardisés avec maintien de l'interopérabilité / Improvement of standardized audio codecs while maintaining interoperability

Lapierre, Jimmy January 2016
Digital audio applications have grown exponentially over recent decades, in good part because of the establishment of international standards. However, imposing such norms necessarily introduces rigidity that can impede the improvement of technologies already deployed, potentially leading to a proliferation of new standards. This thesis shows that existing codecs can be better exploited by improving their quality or their bitrate, even within the rigid constraints posed by established standards. Three aspects are studied: enhancement at the encoder, at the decoder, and at the bit-stream level. In every case, compatibility with the other elements of the existing codec is maintained. Thus, it is shown that the audio signal can be improved at the decoder without transmitting new information, that an encoder can produce an improved signal without modifying its decoder, and that a bit stream can be better optimized for a new application. In particular, this thesis shows that even a standard like G.711, which has been deployed for decades, has the potential to be significantly improved after the fact. This contribution has even served as the core of a new embedded (layered) coding standard that had to maintain that compatibility. It is also shown that the subjective and even objective audio quality of an AAC (Advanced Audio Coding) decoder can be improved, without any extra information from the encoder, by better exploiting knowledge of the limitations of the coding model. These results open the way to further research on processing that exploits such knowledge. Finally, it is shown that the fixed-rate bit stream of AMR-WB+ (Extended Adaptive Multi-Rate Wideband) can be compressed more efficiently for variable-bit-rate applications, which shows the benefit of adapting a codec to the context in which it is used.
4

Méthodes avancées de traitement de la parole et de réduction de bruit pour les terminaux mobiles / Advanced methods of speech processing and noise reduction for mobile devices

Mai, Van Khanh 09 March 2017
This PhD thesis deals with one of the most challenging problems in speech enhancement for assisted listening (hearing aids), where only one microphone is available and the constraints are low computational cost, low power usage, and the absence of a database. Building on recent results in parametric and non-parametric statistical estimation and in sparse representation, the thesis proposes several techniques that not only improve speech quality and intelligibility but also tackle the denoising of audio signals in general. The first major part addresses the estimation of the noise power spectrum, especially for non-stationary noise, which is a key component of single-channel speech enhancement. The proposed approach takes into account the weak-sparseness model of speech in the transformed domain. Once the noise power spectrum has been estimated, the second major part exploits a semantic approach that takes into account the presence or absence of speech. By combining Bayesian estimation with Neyman-Pearson detection, several parametric estimators are developed and tested in the discrete Fourier transform domain. To further improve the performance and robustness of audio denoising, a semi-parametric approach is considered. Joint detection and estimation can be interpreted in terms of Smoothed Sigmoid-Based Shrinkage (SSBS). Block-SSBS is therefore proposed to additionally take into account neighboring bins in the time-frequency domain. Moreover, to further enhance speech and audio quality, a Bayesian estimator is also derived and combined with Block-SSBS. Experimental results confirm the effectiveness and relevance of this strategy in the discrete cosine transform domain for both speech and audio denoising.
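As a rough illustration of the sigmoid-based shrinkage mentioned in the abstract above, the sketch below applies a smooth attenuation to transform-domain coefficients. It assumes one form of the SSBS function found in the shrinkage literature, delta(x) = sgn(x) * max(|x| - t, 0) / (1 + exp(-tau * (|x| - lam))); the abstract itself gives no formula, so this form, the parameter names `t`, `tau` and `lam`, and the example values are assumptions rather than the thesis's exact estimator. Block-SSBS and the Bayesian combination are not modelled.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ssbs(x, t, tau, lam):
    """Shrink one coefficient with an assumed smoothed sigmoid-based rule.

    t   : hard offset removed from the magnitude (0 disables it)
    tau : steepness of the sigmoid (larger means closer to hard thresholding)
    lam : magnitude around which attenuation switches from strong to weak
    """
    magnitude = abs(x)
    sign = 1.0 if x >= 0.0 else -1.0
    kept = max(magnitude - t, 0.0)
    return sign * kept * _sigmoid(tau * (magnitude - lam))


def denoise(coefficients, t=0.0, tau=5.0, lam=1.0):
    """Apply the shrinkage independently to every coefficient."""
    return [ssbs(c, t, tau, lam) for c in coefficients]


if __name__ == "__main__":
    # Small coefficients (likely noise) are pushed toward zero,
    # large ones (likely signal) pass almost unchanged.
    print(denoise([0.1, -0.2, 4.0, -6.0]))
```

Unlike hard thresholding, the attenuation varies smoothly with the coefficient magnitude, which is what allows the rule to be read as a soft combination of detecting whether signal is present and estimating it.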
