  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Auditory Based Modification of MFCC Feature Extraction for Robust Automatic Speech Recognition

Chiou, Sheng-chiuan 01 September 2009
The human auditory perception system is much more noise-robust than any state-of-the-art automatic speech recognition (ASR) system. It is expected that the noise robustness of speech feature vectors may be improved by employing more human auditory functions in the feature extraction procedure. Forward masking is a phenomenon of human auditory perception in which a weaker sound is masked by a preceding stronger masker. In this work, two human auditory mechanisms, synaptic adaptation and temporal integration, are implemented as filter functions and incorporated into MFCC feature extraction to model forward masking. A filter optimization algorithm is proposed to optimize the filter parameters. The performance of the proposed method is evaluated on the Aurora 3 corpus, and the training/testing procedure follows the standard setting provided by the Aurora 3 task. The synaptic adaptation filter achieves a relative improvement of 16.6% over the baseline. The temporal integration and modified temporal integration filters achieve relative improvements of 21.6% and 22.5%, respectively. Combining synaptic adaptation with each of the temporal integration filters results in further improvements of 26.3% and 25.5%. Applying the filter optimization improves the synaptic adaptation filter and the two temporal integration filters, resulting in improvements of 18.4%, 25.2%, and 22.6%, respectively. The performance of the combined-filter models is also improved, with relative improvements of 26.9% and 26.3%.
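As an illustration of how temporal integration can be expressed as a filter over feature frames, the following minimal sketch applies a one-pole leaky integrator to the energy trajectory of a single mel band. It is not the filter from the thesis: the function name `temporal_integration`, the smoothing constant `alpha`, and the first-order form are assumptions chosen for brevity.

```python
def temporal_integration(energies, alpha=0.6):
    """Smooth one band's frame energies with a one-pole leaky integrator.

    Hypothetical sketch: y[t] = alpha * y[t-1] + (1 - alpha) * x[t].
    `alpha` (0 <= alpha < 1) is an assumed smoothing constant, not a
    value taken from the thesis.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed = []
    # Start the state at the first frame so the output has no onset transient.
    previous = energies[0] if energies else 0.0
    for current in energies:
        previous = alpha * previous + (1.0 - alpha) * current
        smoothed.append(previous)
    return smoothed


if __name__ == "__main__":
    # A strong frame followed by two silent frames.
    print(temporal_integration([1.0, 0.0, 0.0], alpha=0.5))
```

With `alpha=0.5`, the strong frame followed by silence decays as 1.0, 0.5, 0.25, so its trace lingers in later frames, loosely mimicking a masker whose effect outlasts its own duration.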
2

DEVELOPMENT AND VALIDATION OF NEW MODELS AND METRICS FOR THE ASSESSMENTS OF NOISE-INDUCED HEARING LOSS

Al-Dayyeni, Wisam Subhi Talib 01 May 2019
Noise-induced hearing loss (NIHL) is one of the most common illnesses reported in the occupational and military sectors. Hearing loss due to high noise exposure is a major health problem with economic consequences. Industrial and military noise exposures often contain high-level impulsive noise components. The presence of these impulsive noise components complicates the assessment of noise levels for hearing conservation purposes. The current noise guidelines use metrics based on the equal energy hypothesis (EEH) to evaluate the risk of hearing loss. A number of studies show that the current noise metrics often underestimate the risk of hearing loss in high-level complex noise environments. The overarching goal of this dissertation is to develop advanced signal-processing-based methods for more accurate assessment of the risk of NIHL. For these assessments, various auditory filters that take into account the physiological characteristics of the ear are used. These filters will help to understand the complexity of the ear's response to high-level complex noises.
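To make the equal energy hypothesis concrete, the sketch below computes an equivalent continuous sound level from pressure samples and the exposure time allowed under a 3 dB exchange rate. It is an illustration only, not a metric from the dissertation: the function names are hypothetical, and the 85 dB, 8 hour criterion is an assumed default of the kind commonly paired with a 3 dB exchange rate.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals (20 micropascals)


def equivalent_level_db(pressures_pa):
    """Equivalent continuous level: 10*log10(mean(p^2) / P_REF^2)."""
    if not pressures_pa:
        raise ValueError("need at least one pressure sample")
    mean_square = sum(p * p for p in pressures_pa) / len(pressures_pa)
    return 10.0 * math.log10(mean_square / P_REF ** 2)


def permissible_hours(level_db, criterion_db=85.0, criterion_hours=8.0,
                      exchange_db=3.0):
    """Allowed exposure time; halves for every `exchange_db` above criterion.

    The criterion values are assumed defaults, not taken from the source.
    """
    return criterion_hours / 2.0 ** ((level_db - criterion_db) / exchange_db)


if __name__ == "__main__":
    steady = [1.0, 1.0, 1.0, 1.0]      # constant 1 Pa
    impulsive = [2.0, 0.0, 0.0, 0.0]   # one 2 Pa spike, same total energy
    print(equivalent_level_db(steady), equivalent_level_db(impulsive))
    print(permissible_hours(88.0))
```

The steady and impulsive sequences have the same mean-square pressure, so an energy-based metric assigns them the same level even though their temporal structure differs. This insensitivity is one way to see why such metrics can struggle with impulsive components.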
3

Speech Analysis and Cognition Using Category-Dependent Features in a Model of the Central Auditory System

Jeon, Woojay 13 November 2006
It is well known that machines perform far worse than humans in recognizing speech and audio, especially in noisy environments. One method of addressing this issue of robustness is to study physiological models of the human auditory system and to adopt some of its characteristics in computers. As a first step in studying the potential benefits of an elaborate computational model of the primary auditory cortex (A1) in the central auditory system, we qualitatively and quantitatively validate the model under existing speech processing and recognition methodology. Next, we develop new insights and ideas on how to interpret the model, and reveal some of the advantages of its dimension expansion that may potentially be used to improve existing speech processing and recognition methods. This is done by statistically analyzing the neural responses to various classes of speech signals and forming empirical conjectures on how cognitive information is encoded in a category-dependent manner. We also establish a theoretical framework that shows how noise and signal can be separated in the dimension-expanded cortical space. Finally, we develop new feature selection and pattern recognition methods to exploit the category-dependent encoding of noise-robust cognitive information in the cortical response. Category-dependent features are proposed as features that "specialize" in discriminating specific sets of classes, and as a natural way of incorporating them into a Bayesian decision framework, we propose methods to construct hierarchical classifiers that perform decisions in a two-stage process. Phoneme classification tasks using the TIMIT speech database are performed to quantitatively validate all developments in this work, and the results encourage future work in exploiting high-dimensional data with category (or class)-dependent features for improved classification or detection.
4

Generalized Analytic Signal Construction and Modulation Analysis

Venkitaraman, Arun January 2013
This thesis deals with generalizations of the analytic signal (AS) construction proposed by Gabor. Functional extensions of the fractional Hilbert transform (FrHT) are proposed, from which families of analytic signals are obtained. The construction is further applied in the design of a secure communication scheme. A demodulation scheme is developed based on the generalized AS, motivated by perceptual experiments in binaural hearing. Demodulation is achieved using a signal and its arbitrary phase-shifted version, which in turn translates to demodulation using a pair of flat-top bandpass filters that form an FrHT pair. A new family of wavelets based on the popular Gammatone auditory model is proposed and is shown to lead to a good characterization of singularities/transients in a signal. Allied problems of computing smooth amplitude, phase, and frequency modulations from the AS, construction of an FrHT pair of wavelets, and temporal envelope fitting of transient audio signals are also addressed.
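For readers unfamiliar with the terms, the sketch below builds a discrete analytic signal by zeroing negative frequencies and then forms a fractional Hilbert transform as cos(phi)·x + sin(phi)·H{x}, which is one common textbook definition. It is not the generalized construction proposed in the thesis; the function names and the naive O(N²) DFT are assumptions made to keep the example dependency-free.

```python
import cmath
import math


def analytic_signal(x):
    """Return x + j*H{x} for a real sequence by zeroing negative frequencies."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0   # DC and Nyquist bins are kept unchanged
        elif k < (n + 1) // 2:
            gain = 2.0   # positive frequencies are doubled
        else:
            gain = 0.0   # negative frequencies are removed
        spectrum[k] *= gain
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def fractional_hilbert(x, phi):
    """Assumed definition: cos(phi) * x + sin(phi) * H{x}."""
    return [math.cos(phi) * z.real + math.sin(phi) * z.imag
            for z in analytic_signal(x)]


if __name__ == "__main__":
    n, cycles, phi = 16, 2, 0.7
    theta = [2 * math.pi * cycles * t / n for t in range(n)]
    shifted = fractional_hilbert([math.cos(a) for a in theta], phi)
    # For a pure cosine, the result is the same cosine delayed in phase by phi.
    print(max(abs(s - math.cos(a - phi)) for s, a in zip(shifted, theta)))
```

For a pure cosine, this fractional transform returns the same cosine shifted in phase by `phi`, which gives some intuition for demodulation based on a signal and an arbitrarily phase-shifted version of it.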
