1. Using observation uncertainty for robust speech recognition

Arrowood, Jon A., January 2003 (PDF)
Thesis (Ph. D.)--School of Electrical and Computer Engineering, Georgia Institute of Technology, 2004. Directed by Mark A. Clements. / Vita. Includes bibliographical references (leaves 124-128).
2. The Value of Two Ears for Sound Source Localization and Speech Understanding in Complex Listening Environments: Two Cochlear Implants vs. Two Partially Hearing Ears and One Cochlear Implant

January 2013
abstract: Two groups of cochlear implant (CI) listeners were tested for sound source localization and for speech recognition in complex listening environments. One group (n=11) wore bilateral CIs and, potentially, had access to interaural level difference (ILD) cues but not interaural timing difference (ITD) cues. The second group (n=12) wore a single CI and had low-frequency acoustic hearing both in the ear contralateral to the CI and in the implanted ear. These `hearing preservation' listeners potentially had access to ITD cues but not to ILD cues. At issue in this dissertation was the value of the two types of information about sound sources, ITDs and ILDs, for localization and for speech perception when speech and noise sources were separated in space.

In Experiment 1, normal-hearing (NH) listeners and the two groups of CI listeners were tested for sound source localization using a 13-loudspeaker array. The mean RMS localization error was 7 degrees for the NH listeners, 20 degrees for the bilateral CI listeners, and 23 degrees for the hearing preservation listeners. The scores for the two CI groups did not differ significantly; both groups showed equivalent, but poorer than normal, localization. This outcome, obtained with filtered noise bands for the NH listeners, suggests that ILD and ITD cues can support equivalent levels of localization.

In Experiment 2, the two groups of CI listeners were tested for speech recognition in noise when the noise sources and targets were spatially separated in a simulated `restaurant' environment and in two versions of a `cocktail party' environment. At issue was whether either CI group would show benefits from binaural hearing, i.e., better performance when the noise and targets were separated in space. Neither CI group showed spatial release from masking. However, both groups showed a significant binaural advantage (a combination of squelch and summation) when the target and noise remained separated, indicating the presence of some binaural processing or `unmasking' of speech in noise. Finally, localization ability in Experiment 1 was not correlated with binaural advantage in Experiment 2. / Dissertation/Thesis / Ph.D. Speech and Hearing Science 2013
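The RMS localization error reported in Experiment 1 is the standard root-mean-square measure over trials. A minimal illustrative sketch, not the author's code: the function name is hypothetical, and it assumes responses and targets are loudspeaker azimuths in degrees.

```python
import math

def rms_localization_error(responses_deg, targets_deg):
    """Root-mean-square difference, in degrees, between the azimuth a
    listener reported on each trial and the azimuth of the loudspeaker
    that actually played."""
    errors = [r - t for r, t in zip(responses_deg, targets_deg)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

On this measure, a perfect localizer scores 0 degrees; the NH listeners' 7-degree mean corresponds to responses clustering tightly around the true source, while 20-23 degrees (the CI groups) implies frequent responses one or more loudspeakers away.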
3. Development and validation of a South African English smartphone-based speech-in-noise hearing test

Engelbrecht, Jenni-Mari January 2017
Approximately 80% of the adult and elderly population (≥65 years) has not been assessed or treated for hearing loss, despite the effect hearing loss has on communication and quality of life (World Health Organization [WHO], 2013a). In South Africa, the health care system faces many challenges, of which access to ear and hearing health care is one of the major problems. This study aimed to develop and validate a smartphone-based digits-in-noise hearing test for South African English to improve access to hearing screening. The study also considered the effect of hearing loss and English-speaking competency on the South African English digits-in-noise hearing test to evaluate its suitability for use across native (N) and non-native (NN) speakers. Lastly, the study evaluated the digits-in-noise test's applicability as a clinical test of speech recognition ability in noise within the diagnostic audiometric test battery.

During the development and validation phase, the sample consisted of 40 normal-hearing subjects with thresholds ≤15 dB across the frequency spectrum (250 – 8000 Hertz [Hz]) and 186 subjects with normal hearing in both ears, or normal hearing in the better ear. Single digits (0 – 9) spoken by a N English female speaker were recorded. Level corrections were applied to create a set of homogeneous digits with steep speech recognition functions. A smartphone application (app) was created to present 120 digit-triplets in noise as test material. An adaptive test procedure determined the speech reception threshold (SRT). Experiments were performed to determine headphone effects on the SRT and to establish normative data. The results showed steep speech recognition functions, with a slope of 20%/dB for digit-triplets presented in noise using the smartphone app. The results for five headphone types indicated that the smartphone-based hearing test is reliable and can be conducted using standard Android smartphone headphones or clinical headphones.

A prospective cross-sectional cohort study of N and NN English adults with and without sensorineural hearing loss compared pure-tone air conduction thresholds to the SRT recorded with the smartphone digits-in-noise hearing test. A rating scale was used for NN English listeners' self-reported competence in speaking English. This study included 454 adult listeners (164 male, 290 female; range 16 – 90 years), of whom 337 had a best ear 4-frequency pure-tone average (4FPTA; 0.5, 1, 2 and 4 kHz) of ≤25 dB hearing level (HL). A linear regression model identified three predictors of the digits-in-noise SRT: 4FPTA, age and self-reported English-speaking competence. The NN group with poor self-reported English-speaking competence (≤5/10) performed significantly (p<0.01) more poorly on the digits-in-noise test than the N & NN (≥6/10) group. Screening characteristics of the test improved with separate cut-off values for the N & NN (≥6/10) group and the NN (≤5/10) group. Logistic regression models that included age showed a further improvement in sensitivity and specificity for both groups (area under the receiver operating characteristic curve [AUROC] .962 and .903, respectively).

A descriptive study evaluated 109 adult subjects (43 male, 66 female) with and without sensorineural hearing loss by comparing pure-tone air conduction thresholds, the speech recognition monaural performance score intensity (SRS dB) and the digits-in-noise SRT. An additional nine adult hearing aid users (4 male, 5 female) were included in a subset to determine aided and unaided digits-in-noise SRTs. The digits-in-noise SRT was strongly associated with the best ear 4FPTA (r=0.81) and maximum SRS dB (r=0.72). The digits-in-noise test had high sensitivity and specificity for identifying abnormal pure-tone (0.88 and 0.88, respectively) and SRS dB (0.76 and 0.88, respectively) results. There was a mean signal-to-noise ratio (SNR) improvement in the aided condition, demonstrating an overall benefit of 0.84 dB SNR, although there was significant individual variability between subjects in the aided (-3.2 to -9.4 dB SNR) and unaided (-2 to -9.4 dB SNR) conditions.

This study demonstrated that a smartphone app provides the opportunity to use the English digits-in-noise hearing test as a national test for South Africans. The app can accommodate NN listeners by adjusting reference scores based on self-reported English-speaking competence, and including age when determining the screening result increases the accuracy of the screening test in normal-hearing listeners. These adjustments can ensure adequate test performance across N English and NN English listeners. Furthermore, the digits-in-noise SRT is strongly associated with the best ear 4FPTA and maximum SRS dB and could therefore provide complementary information on speech recognition impairment in noise in a clinical audiometric setting. The digits-in-noise SRT can also demonstrate benefit from hearing aid fittings; the test is quick to administer and provides information on SNR loss. It could therefore serve as a valuable tool in counselling and managing expectations for persons with hearing loss who receive amplification. / Thesis (PhD)--University of Pretoria, 2017. / National Research Foundation (NRF) / Speech-Language Pathology and Audiology / PhD / Unrestricted
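The abstract does not spell out the adaptive procedure, but digits-in-noise tests commonly use a one-up/one-down staircase that converges on the SNR at which 50% of triplets are repeated correctly. A minimal sketch under that assumption; the function and parameter names are hypothetical, not taken from the thesis or its app.

```python
def estimate_srt(respond, n_trials=24, start_snr=0.0, step=2.0):
    """One-up/one-down adaptive track: after each digit-triplet, lower the
    SNR by `step` dB if the response was correct, raise it if not, so the
    track oscillates around the 50%-correct point. `respond(snr)` returns
    True when the triplet presented at that SNR was repeated correctly.
    The SRT is the mean SNR over the trials after the initial approach."""
    snr = start_snr
    track = []
    for _ in range(n_trials):
        track.append(snr)
        snr += -step if respond(snr) else step
    tail = track[4:]  # discard the first 4 approach trials
    return sum(tail) / len(tail)
```

With a steep speech recognition function (20%/dB, as reported), a small number of trials suffices: the track settles within a dB or two of the true 50% point quickly, which is what makes the test practical on a smartphone.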
4. Perception of prosody by cochlear implant recipients

Van Zyl, Marianne January 2014
Recipients of present-day cochlear implants (CIs) display remarkable success with speech recognition in quiet, but not with speech recognition in noise. Normal-hearing (NH) listeners, in contrast, perform relatively well with speech recognition in noise. Understanding which speech features support successful perception in noise in NH listeners could provide insight into the difficulty that CI listeners experience in background noise. One set of speech features that has not been thoroughly investigated with regard to its noise immunity is prosody, and existing reports show that CI users have difficulty with prosody perception. The present study endeavoured to determine whether prosody is particularly noise-immune in NH listeners and whether the difficulty that CI users experience in noise can be partly explained by poor prosody perception. This was done through three listening experiments.

The first experiment examined the noise immunity of prosody in NH listeners by comparing perception of a prosodic pattern to word recognition in speech-weighted noise (SWN). Prosody perception was tested in a two-alternative forced-choice (2AFC) paradigm using sentences conveying either conditional or unconditional permission, agreement or approval. Word recognition was measured in an open-set paradigm using meaningful sentences. Results indicated that the deterioration slope of prosody recognition (corrected for guessing) was significantly shallower than that of word recognition; at the lowest signal-to-noise ratio (SNR) tested, prosody recognition was significantly better than word recognition.

The second experiment compared recognition of prosody and phonemes in SWN by testing perception of both in a 2AFC paradigm. NH and CI listeners were tested using single words as stimuli.
Two prosody recognition tasks were used: the first required discrimination between questions and statements, while the second required discrimination between a certain and a hesitant attitude. Phoneme recognition was measured with three vowel pairs selected according to specific acoustic cues. In contrast to the first experiment, the results indicated that vowel recognition was significantly better than prosody recognition in noise in both listener groups. This difference was thought to be due either to the test paradigm of the first experiment (closed set versus open set) or to a difference in stimuli between the experiments (single words versus sentences).

The third experiment tested emotional prosody and phoneme perception of NH and CI listeners in SWN using sentence stimuli and a 4AFC paradigm for both tasks. In NH listeners, deterioration slopes of prosody and phonemes (vowels and consonants) did not differ significantly, and at the lowest SNR tested there was no significant difference in recognition of the different types of speech material. In the CI group, prosody and vowel perception deteriorated with similar slopes, while consonant recognition showed a steeper slope than prosody recognition. It is concluded that while prosody might support speech recognition in noise in NH listeners, explicit recognition of prosodic patterns is not particularly noise-immune and does not account for the difficulty that CI users experience in noise. / Thesis (PhD)--University of Pretoria, 2014. / lk2014 / Electrical, Electronic and Computer Engineering / PhD / unrestricted
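The "corrected for guessing" scores in the first experiment presumably apply the standard chance correction for an m-alternative forced-choice task, which rescales observed proportion correct so that chance performance maps to zero. A sketch of that standard formula, not code from the thesis itself:

```python
def correct_for_guessing(p_observed, n_alternatives):
    """Chance-corrected proportion correct for an n-alternative
    forced-choice task: p = (p_obs - g) / (1 - g), where g = 1/n is the
    guessing rate. In a 2AFC task, an observed score of 0.5 (pure
    guessing) maps to 0.0 and a perfect score of 1.0 stays at 1.0."""
    g = 1.0 / n_alternatives
    return (p_observed - g) / (1.0 - g)
```

This correction matters when comparing the closed-set 2AFC prosody task against open-set word recognition, where the guessing rate is effectively zero: without it, the prosody scores would be inflated by a 50% chance floor.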
5. Musical Training Influences Auditory Temporal Processing

Elangovan, Saravanan, Payne, Nicole, Smurzynski, Jacek, Fagelson, Marc A. 12 March 2016
Background: A link between musical expertise and auditory temporal processing abilities was examined. Materials and methods: Trained musicians (n=13) and non-musicians (n=12) were tested on speech tasks (phonetic identification, speech recognition in noise) and non-speech tasks (temporal gap detection). Results: Musicians had shorter between-channel gap detection thresholds and sharper phonetic identification functions, suggesting that perceptual reorganization following musical training assists basic temporal auditory processes. Conclusions: In general, our results provide a conceptual advance in understanding how musical training influences speech processing, an ability which, when impaired, can affect speech and reading competency.
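A "sharper phonetic identification function" is a steeper psychometric function along a phonetic continuum (e.g., a voice-onset-time series), typically modelled with a logistic curve. A hedged sketch of that standard form; the parameter names are illustrative, not the paper's:

```python
import math

def logistic_identification(x, x0, slope):
    """Probability of one category response at stimulus step `x` along a
    phonetic continuum, modelled as a logistic psychometric function.
    `x0` is the category boundary (50% point); a larger `slope` gives a
    sharper, more categorical identification function."""
    return 1.0 / (1.0 + math.exp(-slope * (x - x0)))
```

In this framing, the musicians' sharper functions correspond to larger fitted slopes: near-step-like transitions between phonetic categories rather than gradual ones.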
