Global ETD Search

1	Compensation for Nonlinear Distortion in Noise for Robust Speech Recognition Harvilla, Mark J. 01 October 2014 The performance, reliability, and ubiquity of automatic speech recognition systems have flourished in recent years due to steadily increasing computational power and technological innovations such as hidden Markov models, weighted finite-state transducers, and deep learning methods. One problem that plagues speech recognition systems, especially those that operate offline and have been trained on specific in-domain data, is the deleterious effect of noise on recognition accuracy. Historically, robust speech recognition research has focused on traditional noise types such as additive noise, linear filtering, and reverberation. This thesis describes the effects of nonlinear dynamic range compression on automatic speech recognition and develops a number of novel techniques for characterizing and counteracting it. Dynamic range compression is any function that reduces the dynamic range of an input signal. It is a widely used tool in audio engineering and is almost always a component of a practical telecommunications system. Despite the ubiquity of dynamic range compression, this thesis is the first work to comprehensively study and address its effect on speech recognition. More specifically, this thesis treats the problem of dynamic range compression in three ways: (1) blind amplitude normalization methods, which counteract dynamic range compression when its parameter values allow the function to be mathematically inverted, (2) blind amplitude reconstruction techniques, i.e., declipping, which attempt to reconstruct clipped segments of the speech signal that are lost through non-invertible dynamic range compression, and (3) matched-training techniques, which attempt to select the pre-trained acoustic model with the closest set of compression parameters. 
All three of these methods rely on robust estimation of the dynamic range compression distortion parameters. Novel algorithms for the blind prediction of these parameters are also introduced. The algorithms' quality is evaluated in terms of the degree to which they decrease speech recognition word error rate, as well as in terms of the degree to which they increase a given speech signal's signal-to-noise ratio. In all evaluations, the possibility of independent additive noise following the application of dynamic range compression is assumed.
Keywords: signal processing, speech recognition, speech enhancement, audio declipping, noise reduction, dynamic range compression
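As a rough illustration of the terms used in the abstract above, the following minimal sketch applies hard clipping (the non-invertible limiting case of dynamic range compression) to a synthetic tone and measures the resulting signal-to-noise ratio. It is not taken from the thesis: the function names `hard_clip` and `snr_db`, the test signal, and the threshold of 0.5 are all assumptions chosen for illustration.

```python
import math


def hard_clip(samples, threshold):
    """Limit every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in samples]


def snr_db(reference, degraded):
    """Signal-to-noise ratio in dB, treating (reference - degraded) as noise."""
    signal_energy = sum(r * r for r in reference)
    noise_energy = sum((r - d) ** 2 for r, d in zip(reference, degraded))
    if noise_energy == 0.0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(signal_energy / noise_energy)


# Hypothetical test signal: one second of a 5 Hz sine sampled at 1 kHz.
sine = [math.sin(2.0 * math.pi * 5.0 * n / 1000.0) for n in range(1000)]

# Assumed clipping threshold of 0.5 (half of the original peak amplitude).
clipped = hard_clip(sine, 0.5)

peak_before = max(abs(s) for s in sine)
peak_after = max(abs(s) for s in clipped)
clipping_snr = snr_db(sine, clipped)

print(f"peak before: {peak_before:.3f}, peak after: {peak_after:.3f}")
print(f"SNR of clipped signal: {clipping_snr:.2f} dB")
```

A declipping method would aim to produce an estimate whose SNR against the original signal is higher than `clipping_snr`; the same `snr_db` helper could be applied to such an estimate.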
2	Restaurace signálu s omezenou okamžitou hodnotou pro vícekanálový audio signál / Restoration of signals with limited instantaneous value for the multichannel audio signal Hájek, Vojtěch January 2019 This master's thesis deals with the restoration of clipped multichannel audio signals based on sparse representations. First, the general theory of clipping and of sparse representations of audio signals is described. A short overview of existing restoration methods is also included. Subsequently, two declipping algorithms are introduced and implemented in the Matlab environment as part of the thesis. The first, SPADE, is considered a state-of-the-art method for declipping mono audio signals; the second, CASCADE, is derived from SPADE and is designed for the restoration of multichannel signals. In the last part of the thesis, both algorithms are tested and the results are compared using the objective measures SDR and PEAQ, as well as the subjective listening test MUSHRA.
3	Restaurace audiosignálů založená na řídkých reprezentacích / Audio restoration based on sparse signal representations Záviška, Pavel January 2017 This master's thesis deals with audio clipping and the application of the sparse representation model to the task of declipping. First, the general theory of clipping is described, followed by a brief overview of existing methods and a description of the general theory of sparse signal representations, bases, and frames. Subsequently, two methods that solve the declipping problem using sparse representations are introduced. The first method uses the Generic proximal algorithm for convex optimization; the second uses the Douglas-Rachford algorithm. Both methods have been implemented in the Matlab environment. The results of the declipping methods are evaluated using SNR and PEMO-Q, and also by subjective listening tests.
4	Restaurace signálu s omezenou okamžitou hodnotou s použitím psychoakustického modelu / Restoration of signals with limited instantaneous value using a psychoacoustic model Beňo, Tomáš January 2019 This master's thesis deals with the restoration of audio signals that have been damaged by clipping. The methods used are based on sparse representations of signals. The introduction explains the issue of clipping and lists existing declipping methods on which the thesis builds. The next chapter describes the necessary theory of sparse representations and proximal algorithms, including specific representatives from the category of convex optimization problems. The thesis includes a declipping algorithm implemented in the Matlab software environment. The chosen method uses the Condat algorithm or the Generic proximal algorithm for convex optimization and minimizes a sum of three convex functions. The thesis results in five versions of the algorithm, three of which incorporate a psychoacoustic model to improve the results. An optimal parameter setting has been found for each version. Restoration quality is evaluated using objective measures such as SDR and PEMO-Q, and also using a subjective listening test.
