1

IMPLEMENTATION  AND EVALUATION OF AUDITORY MODELS FOR HUMAN ECHOLOCATION

Gidla, Vijay Kiran January 2016 (has links)
Blind people use echoes to detect objects and to find their way, an ability known as human echolocation. Previous research has identified some of the conditions that favor object detection, but many factors remain to be analyzed and quantified. Studies have also shown that blind people echolocate more efficiently than sighted people, with performance varying among individuals. This motivated research in human echolocation to move in a new direction to gain a fuller understanding of the superior detection ability of the blind. Psychoacoustic experiments alone cannot determine whether the superior echo detection of blind listeners should be attributed to perceptual or physiological causes. Along with the perceptual results, it is vital to know how the sounds are processed in the auditory system. Hearing research has led to the development of several auditory models that combine physiological and psychological results with signal analysis methods. These models try to describe how the auditory system processes signals. Hence, to analyze how sounds are processed for the high detection performance of the blind, auditory models available in the literature were used in this thesis. The results suggest that repetition pitch is useful at shorter distances and is determined from the peaks in the temporal profile of the autocorrelation function computed on the neural activity pattern. The loudness attribute also provides information that helps listeners echolocate at shorter distances. At longer distances, timbre aspects such as sharpness might be used by listeners to detect objects. It was also found that the repetition pitch, loudness, and sharpness attributes in turn depend on the room acoustics and the type of stimuli used.
These results  show the fruitfulness  of combining  results  from different  disciplines  through  a mathematical framework  given by signal analysis.
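As an illustrative aside, the repetition-pitch cue described in the abstract above can be sketched numerically: a sound plus a delayed, attenuated copy of itself produces an autocorrelation peak at the echo lag. This is a hedged sketch on the raw waveform only (the thesis computes the autocorrelation on a neural activity pattern from an auditory model); all signal parameters here are invented for illustration.

```python
import numpy as np

# A 100 ms noise burst plus a 2 ms delayed, attenuated echo.
fs = 44100                                 # sample rate in Hz (assumed)
delay_s = 0.002                            # 2 ms echo delay
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs // 10)      # 100 ms broadband burst
lag = int(delay_s * fs)                    # echo delay in samples
echo = np.zeros_like(noise)
echo[lag:] = 0.6 * noise[:-lag]            # attenuated reflection
signal = noise + echo

# Autocorrelation, positive lags only; the peak (excluding lag 0)
# sits at the echo delay -- the basis of the repetition-pitch percept.
ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
peak_lag = np.argmax(ac[1:]) + 1           # skip the trivial lag-0 peak
estimated_delay = peak_lag / fs            # recovers the echo delay
```

The perceived repetition pitch corresponds roughly to 1/delay, here about 500 Hz; at longer object distances the peak weakens, which is consistent with the abstract's claim that the cue is most useful at shorter distances.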
2

Incorporating Auditory Models in Speech/Audio Applications

January 2011 (has links)
abstract: Following the success in incorporating perceptual models in audio coding algorithms, their application in other speech/audio processing systems is expanding. In general, all perceptual speech/audio processing algorithms involve minimization of an objective function that directly or indirectly incorporates properties of human perception. This dissertation primarily investigates the problems associated with directly embedding an auditory model in the objective function formulation and proposes possible solutions to overcome high complexity issues for use in real-time speech/audio algorithms. Specific problems addressed in this dissertation include: 1) the development of approximate but computationally efficient auditory model implementations that are consistent with the principles of psychoacoustics, and 2) the development of a mapping scheme that allows synthesizing a time/frequency domain representation from its equivalent auditory model output. The first problem addresses the high computational complexity involved in solving perceptual objective functions that require repeated application of the auditory model to evaluate different candidate solutions. In this dissertation, frequency pruning and detector pruning algorithms are developed that efficiently implement the various auditory model stages. The performance of the pruned model is compared to that of the original auditory model for different types of test signals in the SQAM database. Experimental results indicate only a 4-7% relative error in loudness while attaining up to an 80-90% reduction in computational complexity. Similarly, a hybrid algorithm is developed specifically for use with sinusoidal signals; it employs the proposed auditory pattern combining technique together with a look-up table that stores representative auditory patterns.
The second problem concerns obtaining an estimate of the auditory representation that minimizes a perceptual objective function and transforming the auditory pattern back to its equivalent time/frequency representation. This avoids the repeated application of auditory model stages to test different candidate time/frequency vectors when minimizing perceptual objective functions. In this dissertation, a constrained mapping scheme is developed by linearizing certain auditory model stages, which ensures obtaining a time/frequency mapping corresponding to the estimated auditory representation. This paradigm was successfully incorporated in a perceptual speech enhancement algorithm and a sinusoidal component selection task. / Dissertation/Thesis / Ph.D. Electrical Engineering 2011
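As a loose illustration of the frequency-pruning idea in the abstract above (assumed logic, not the dissertation's actual algorithm): skip auditory filterbank channels whose input energy is negligible, so the expensive per-channel model stages run only where they can affect the loudness estimate. The function name, band layout, and threshold are all invented for this sketch.

```python
import numpy as np

def pruned_band_energies(spectrum, band_edges, prune_db=-60.0):
    """Per-band energies, pruning bands more than prune_db below the peak."""
    energies = np.array([
        np.sum(spectrum[lo:hi] ** 2) for lo, hi in band_edges
    ])
    floor = energies.max() * 10 ** (prune_db / 10)   # relative threshold
    kept = energies >= floor                          # channels worth computing
    pruned = np.where(kept, energies, 0.0)            # skip negligible bands
    return pruned, kept

# Example: a sparse spectrum where only one band carries real energy.
spectrum = np.zeros(512)
spectrum[100] = 1.0          # dominant component (band 64-128)
spectrum[300] = 1e-5         # far below the pruning floor (band 256-320)
edges = [(i, i + 64) for i in range(0, 512, 64)]
energies, kept = pruned_band_energies(spectrum, edges)
```

For sparse signals, downstream stages would then run on the single surviving band rather than all eight, which is the flavor of complexity saving the abstract reports.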
3

Probabilistic Modelling of Hearing : Speech Recognition and Optimal Audiometry

Stadler, Svante January 2009 (has links)
Hearing loss afflicts as many as 10% of our population. Fortunately, technologies designed to alleviate the effects of hearing loss are improving rapidly, including cochlear implants and the increasing computing power of digital hearing aids. This thesis focuses on theoretically sound methods for improving hearing aid technology. The main contributions are documented in three research articles, which treat two separate topics: modelling of human speech recognition (Papers A and B) and optimization of diagnostic methods for hearing loss (Paper C). Papers A and B present a hidden Markov model-based framework for simulating speech recognition in noisy conditions using auditory models and signal detection theory. In Paper A, a model of normal and impaired hearing is employed, in which a subject's pure-tone hearing thresholds are used to adapt the model to the individual. In Paper B, the framework is modified to simulate hearing with a cochlear implant (CI). Two models of hearing with CI are presented: a simple, functional model and a biologically inspired model. The models are adapted to the individual CI user by simulating a spectral discrimination test. The framework can estimate speech recognition ability for a given hearing impairment or cochlear implant user. This estimate could potentially be used to optimize hearing aid settings. Paper C presents a novel method for sequentially choosing the sound level and frequency for pure-tone audiometry. A Gaussian mixture model (GMM) is used to represent the probability distribution of hearing thresholds at 8 frequencies. The GMM is fitted to over 100,000 hearing thresholds from a clinical database. After each response, the GMM is updated using Bayesian inference. The sound level and frequency are chosen so as to maximize a predefined objective function, such as the entropy of the probability distribution. It is found through simulation that an average of 48 tone presentations are needed to achieve the same accuracy as the standard method, which requires an average of 135 presentations.
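The Paper C procedure described above can be sketched in simplified form: maintain a distribution over the hearing threshold, update it with Bayes' rule after each response, and present the level that minimizes the expected posterior entropy. This sketch assumes a discrete 1-D threshold grid at a single frequency in place of the thesis's 8-frequency GMM, and a logistic psychometric function with an invented slope.

```python
import numpy as np

levels = np.arange(0, 101, 5.0)            # candidate thresholds, dB HL
prior = np.full(len(levels), 1 / len(levels))

def p_heard(level, threshold, slope=0.5):
    """Psychometric function: probability of a 'heard' response."""
    return 1 / (1 + np.exp(-slope * (level - threshold)))

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_posterior_entropy(level, prior):
    """Average entropy after presenting `level`, over both responses."""
    ph = p_heard(level, levels)                      # P(heard | threshold)
    p_yes = np.sum(prior * ph)
    post_yes = prior * ph / max(p_yes, 1e-12)
    post_no = prior * (1 - ph) / max(1 - p_yes, 1e-12)
    return p_yes * entropy(post_yes) + (1 - p_yes) * entropy(post_no)

def bayes_update(prior, level, heard):
    like = p_heard(level, levels) if heard else 1 - p_heard(level, levels)
    post = prior * like
    return post / post.sum()

# Most informative next tone under the prior, then one Bayesian update.
best = min(levels, key=lambda L: expected_posterior_entropy(L, prior))
posterior = bayes_update(prior, best, heard=True)
```

With a uniform prior the most informative level lands near the middle of the grid; a "heard" response then shifts posterior mass toward lower thresholds, and iterating this loop is what lets the method converge in fewer presentations than a fixed staircase.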
5

Using Auditory Modalities to Develop Rhythmic Competency in Children's Fundamental Movement Skills

Severy, Sally Suzanne 01 January 2016 (has links)
Physical education classrooms often have low levels of moderate to vigorous physical activity. This is a problem because many young elementary students are not building the foundation of fundamental movement skills necessary to be lifelong participants in physical activities. This study investigated how elementary physical education teachers used auditory modalities in their classrooms. The research question explored the emergence of rhythmic competency in fundamental movement skills as a means to increase overall moderate to vigorous activity levels. This concurrent, mixed-methods, multiple case study adopted a constructivist paradigm, with schema and dynamic system theories as the underlying motor system theoretical framework. Two research sites were selected: a suburban Maryland public school system and a private liberal arts college located in the same county. The participants included 21 elementary physical education teachers and 6 physical education or exercise science majors from nationally recognized programs. Data were collected from a focus group, interviews, classroom observations, and a 10-item Likert-style survey designed for elementary physical education teachers to identify current trends in the field of auditory modalities and rhythmic competency. The data were analyzed to identify auditory modality instructional methods that support the emergence of rhythmic competencies. The results consisted of a list of best practices, such as musical rhythms, verbal cues, and sound cues, for use by physical education teachers and specialists. This research promotes positive social change by providing information for successfully planning interventions in motor skill and rhythmic development that can lead to overall increased moderate to vigorous physical activity.
6

Perceptually motivated speech recognition and mispronunciation detection

Koniaris, Christos January 2012 (has links)
This doctoral thesis is the result of a research effort performed in two fields of speech technology, i.e., speech recognition and mispronunciation detection. Although the two areas are clearly distinguishable, the proposed approaches share a common hypothesis based on psychoacoustic processing of speech signals. The conjecture implies that the human auditory periphery provides a relatively good separation of different sound classes. Hence, it is possible to use recent findings from psychoacoustic perception together with mathematical and computational tools to model the auditory sensitivities to small speech signal changes. The performance of an automatic speech recognition system strongly depends on the representation used for the front-end. If the extracted features do not include all relevant information, the performance of the classification stage is inherently suboptimal. The work described in Papers A, B and C is motivated by the fact that humans perform better at speech recognition than machines, particularly in noisy environments. The goal is to make use of knowledge of human perception in the selection and optimization of speech features for speech recognition. These papers show that maximizing the similarity of the Euclidean geometry of the features to the geometry of the perceptual domain is a powerful tool for selecting or optimizing features. Experiments with a practical speech recognizer confirm the validity of the principle. An approach to improving mel frequency cepstrum coefficients (MFCCs) through offline optimization is also shown. The method has three advantages: i) it is computationally inexpensive, ii) it does not use the auditory model directly, thus avoiding its computational cost, and iii) importantly, it provides better recognition performance than traditional MFCCs in both clean and noisy conditions. The second task concerns automatic pronunciation error detection.
The research, described in Papers D, E and F, is motivated by the observation that almost all native speakers perceive, relatively easily, the acoustic characteristics of their own language when it is produced by speakers of the language. Small variations within a phoneme category, sometimes different for various phonemes, do not significantly change the perception of the language’s own sounds. Several methods are introduced, based on similarity measures between the Euclidean space spanned by the acoustic representations of the speech signal and the Euclidean space spanned by an auditory model output, to identify the problematic phonemes for a given speaker. The methods are tested on groups of speakers from different languages and evaluated against a theoretical linguistic study, showing that they can capture many of the problematic phonemes that speakers from each language mispronounce. Finally, a listening test on the same dataset verifies the validity of these methods. / QC 20120914 / European Union FP6-034362 research project ACORNS / Computer-Animated language Teachers (CALATea)
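The geometric-similarity idea running through both abstracts above can be sketched as follows. This is an assumed formulation (correlating pairwise Euclidean distance patterns between a feature space and a reference "perceptual" space), not necessarily the exact measure used in the papers; all data here is synthetic.

```python
import numpy as np

def pairwise_dists(X):
    """Condensed pairwise Euclidean distance vector for rows of X."""
    n = len(X)
    return np.array([np.linalg.norm(X[i] - X[j])
                     for i in range(n) for j in range(i + 1, n)])

def geometry_similarity(features, perceptual):
    """Pearson correlation between the two distance geometries."""
    df, dp = pairwise_dists(features), pairwise_dists(perceptual)
    return np.corrcoef(df, dp)[0, 1]

# A feature set that is a rotation + uniform scaling of the perceptual
# space preserves its geometry exactly; adding noise degrades the score.
rng = np.random.default_rng(1)
perceptual = rng.standard_normal((20, 3))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
good = 2.0 * perceptual @ R.T                    # isometry up to scale
noisy = good + 3.0 * rng.standard_normal(good.shape)
score_good = geometry_similarity(good, perceptual)
score_noisy = geometry_similarity(noisy, perceptual)
```

Under this kind of criterion, feature selection or optimization amounts to maximizing the score against distances computed in an auditory model's output domain, which matches the abstracts' description of matching feature geometry to perceptual geometry.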
