  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Towards Understanding Intelligibility of Velopharyngeal Insufficiency (VPI) Speech

Hashemi Hosseinabad, Hedieh January 2018
No description available.
62

Factors Influencing the Prediction of Speech Intelligibility

Leopold, Sarah Yoho 01 September 2016
No description available.
63

Intelligibility enhancement of synthetic speech in noise

Valentini Botinhão, Cássia January 2013
Speech technology can facilitate human-machine interaction and create new communication interfaces. Text-To-Speech (TTS) systems provide speech output for dialogue, notification and reading applications as well as personalized voices for people that have lost the use of their own. TTS systems are built to produce synthetic voices that should sound as natural, expressive and intelligible as possible and if necessary be similar to a particular speaker. Although naturalness is an important requirement, providing the correct information in adverse conditions can be crucial to certain applications. Speech that adapts or reacts to different listening conditions can in turn be more expressive and natural. In this work we focus on enhancing the intelligibility of TTS voices in additive noise. For that we adopt the statistical parametric paradigm for TTS in the shape of a hidden Markov model (HMM-) based speech synthesis system that allows for flexible enhancement strategies. Little is known about which human speech production mechanisms actually increase intelligibility in noise and how the choice of mechanism relates to noise type, so we approached the problem from another perspective: using mathematical models for hearing speech in noise. To find which models are better at predicting intelligibility of TTS in noise we performed listening evaluations to collect subjective intelligibility scores which we then compared to the models’ predictions. In these evaluations we observed that modifications performed on the spectral envelope of speech can increase intelligibility significantly, particularly if the strength of the modification depends on the noise and its level. We used these findings to inform the decision of which of the models to use when automatically modifying the spectral envelope of the speech according to the noise. We devised two methods, both involving cepstral coefficient modifications. 
The first was applied during extraction while training the acoustic models and the other when generating a voice using pre-trained TTS models. The latter has the advantage of being able to address fluctuating noise. To increase intelligibility of synthetic speech at generation time we proposed a method for Mel cepstral coefficient modification based on the glimpse proportion measure, the most promising of the models of speech intelligibility that we evaluated. An extensive series of listening experiments demonstrated that this method brings significant intelligibility gains to TTS voices while not requiring additional recordings of clear or Lombard speech. To further improve intelligibility we combined our method with noise-independent enhancement approaches based on the acoustics of highly intelligible speech. This combined solution was as effective for stationary noise as for the challenging competing speaker scenario, obtaining up to 4 dB of equivalent intensity gain. Finally, we proposed an extension to the speech enhancement paradigm to account not only for energetic masking of signals but also for linguistic confusability of words in sentences. We found that word level confusability, a challenging value to predict, can be used as an additional prior to increase intelligibility even for simple enhancement methods like energy reallocation between words. These findings motivate further research into solutions that can tackle the effect of energetic masking on the auditory system as well as on higher levels of processing.
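As a rough illustration of the glimpse proportion idea mentioned in this abstract, the measure can be thought of as the fraction of time-frequency cells in which speech stands out above the noise. The sketch below is not the thesis's implementation: it assumes the speech and noise levels have already been computed per cell in dB (the published measure relies on an auditory filterbank front end, which is omitted here), and the function name and the 3 dB default threshold are illustrative assumptions.

```python
def glimpse_proportion(speech_db, noise_db, threshold_db=3.0):
    """Fraction of time-frequency cells where speech exceeds noise.

    speech_db and noise_db are hypothetical, precomputed grids (lists of
    rows) of levels in dB with identical shapes. A cell counts as a
    "glimpse" when the local speech-to-noise difference is greater than
    threshold_db (the 3 dB default is an assumption of this sketch).
    """
    if len(speech_db) != len(noise_db):
        raise ValueError("speech and noise grids need the same number of rows")

    total = 0
    glimpsed = 0
    for speech_row, noise_row in zip(speech_db, noise_db):
        if len(speech_row) != len(noise_row):
            raise ValueError("speech and noise rows need the same length")
        for speech_level, noise_level in zip(speech_row, noise_row):
            total += 1
            if speech_level - noise_level > threshold_db:
                glimpsed += 1

    if total == 0:
        raise ValueError("grids must contain at least one cell")
    return glimpsed / total


if __name__ == "__main__":
    # Made-up levels for a 2 x 2 grid, for illustration only.
    speech = [[60.0, 50.0], [40.0, 70.0]]
    noise = [[50.0, 50.0], [45.0, 60.0]]
    print(glimpse_proportion(speech, noise))
```

Under this reading, a modification that raises speech energy in cells that would otherwise be masked increases the returned fraction, which is the sense in which such a measure could guide spectral envelope changes.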
64

Impact of breath group control on the speech of normals and individuals with cerebral palsy

Yip, Fiona Pik Ying January 2008
Dysarthria is one of the most common signs of speech impairment in the cerebral palsy (CP) population. Facilitating strategies for speech enhancement in this population often include training on speech breathing. Treatment efficacy studies with cross-system measures in this population are needed for improved understanding and management of the interrelationship between respiratory, phonatory, and articulatory systems. The purpose of this study was to investigate the effect of breath group control on the coordination of articulatory and phonatory muscles and the acoustic measures related to speech and voice quality. A simultaneous acoustic, electroglottographic (EGG), and marker-based facial tracking recording system was employed to monitor the speech production behaviors of four adults with CP and 16 neurologically healthy controls. Subjects were instructed to perform three tasks, each containing speech targets with a voiceless plosive (/p/, /t/, or /k/) preceding a vowel (/i/, /a/, /u/, or /ɔ/). Task 1 consisted of a short reading passage embedded with target vowels without cueing from breath group markers. Task 2 included reading a series of monosyllabic and 3-syllable or 5-syllable non-speech words with the speech targets. Task 3 included reading the same short passage from Task 1 with cueing from breath group markers separating the passage into phrases with no more than five syllables per phrase. Measures from the acoustic, EGG and facial tracking recordings of the first and last syllable of all syllable trains produced in the non-speech task and the target vowels in the passage reading task were examined. Acoustic measures included voice onset time (VOT), vowel duration, fundamental frequency (F0), percent jitter (%jitter), percent shimmer (%shimmer), signal-to-noise ratio (SNR), and frequencies of Formants one and two (F1 and F2). EGG measures included speed quotient (SQ) and open quotient (OQ). Facial tracking measures consisted of maximum jaw displacement. 
Individual and averaged data were submitted to a series of two-way analyses of variance (ANOVAs) or two-way repeated-measures ANOVAs to determine the effects of the relative position of an utterance in the breath group and the place of articulation of the consonants involved. In addition, mean vowel spaces derived from all three tasks were examined. Results revealed significant changes in VOT, F1, F2, SNR and SQ as a function of position. Significant changes in VOT, vowel duration, F2, F0, %jitter, %shimmer, and maximum jaw displacement as a function of place of articulation were also evident. In particular, breath group control was found to result in expansion of vowel space, especially for individuals with CP. These findings suggest that proper phrasing enhances articulatory and phonatory stability, providing empirical evidence in support of its usage in treating individuals with CP.
65

Speech intelligibility in noise of normal-hearing and hearing-impaired individuals wearing E-A-R plugs

Wade, Mary A. January 1986
Call number: LD2668 .T4 1986 W23 / Master of Arts / Communication Studies
66

An electronic device to reduce the dynamic range of speech

Hildebrant, Eric Michael January 1982
Thesis (B.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1982. / MICROFICHE COPY AVAILABLE IN ARCHIVES AND ENGINEERING / Bibliography: leaves 90-92. / by Eric Michael Hildebrant. / B.S.
67

The effects of intensive voice treatment on speech intelligibility and acoustics of Mandarin speakers with hypokinetic dysarthria due to Parkinson’s disease

Hsu, Sih-Chiao January 2017
Hypokinetic dysarthria is a speech disorder that commonly occurs in individuals with Parkinson’s disease (PD). However, little is known about the speech characteristics and the effects of speech treatment on the speech of Mandarin speakers with hypokinetic dysarthria (henceforth, Mandarin speakers with PD). The purpose of this dissertation was to investigate the effects of intensive voice treatment on the speech intelligibility and acoustics of this population. This dissertation consisted of three papers. The first paper, “Acoustic and perceptual speech characteristics of native Mandarin speakers with Parkinson’s disease,” investigated the general speech characteristics of 11 Mandarin speakers with PD. Intelligibility and acoustic outcomes were reported and compared to seven age- and gender-matched neurologically healthy controls. Findings from this study showed that Mandarin speakers with PD exhibited decreased intelligibility, local pitch variation, vowel space area, speech rate, and rate variation. The second paper, “Effects of Loudness and Rate Manipulation Strategies on Speech Intelligibility and Acoustics of Mandarin Speakers With Parkinson’s Disease,” examined the effects of cueing to increase loudness and reduce speech rate on speech intelligibility and acoustics. Acoustic features including speech intensity, pitch range, pause duration, pause frequency, articulation rate, and vowel space area across 11 Mandarin speakers with PD were analyzed. The relationship between speech intelligibility and acoustic features was reported. Results showed that cueing for loud speech significantly increased intelligibility, but cueing for slow speech did not. Different cues had differential effects on the selected acoustic features. Cueing for loud speech resulted in increased vocal intensity and cueing for slow speech resulted in reduced articulation rate and increased pause frequency. 
In the loud speaking condition, greater vocal intensity and larger vowel space contributed to increased intelligibility, whereas in the slow condition, increased intensity, vowel space, as well as articulation rate, showed a trend toward contributing to increased intelligibility. The third paper, “The Effects of Intensive Voice Treatment on Intelligibility in Mandarin Speakers with Parkinson’s Disease: Acoustic and perceptual findings,” investigated the short- and long-term effects of intensive voice treatment (Lee Silverman Voice Treatment LOUD) on speech intelligibility and acoustics of nine Mandarin speakers with PD. All speakers showed increased intelligibility from pretreatment to immediate post-treatment, and the improvement was maintained at the 6-month follow-up. Five acoustic features were analyzed. Speech intensity, vowel space, and speech rate changed significantly in positive directions immediately post-treatment, and the increases were retained up to six months. Global pitch variation increased immediately post-treatment but not at the 6-month follow-up. No changes were found in local pitch variation following treatment. Self-reported intelligibility, voice quality, confidence, frustration level, and communicative participation changed positively immediately after the completion of treatment and at the 6-month follow-up. To conclude, the speech characteristics of Mandarin speakers with PD were generally consistent with those of English speakers with PD, except that speech was slower in the Mandarin speakers. Cueing to increase loudness and reduce rate had different effects on speech intelligibility and production, with louder speech yielding greater intelligibility and acoustic benefits. Following intensive voice treatment (LSVT LOUD), Mandarin speakers with PD increased their vocal intensity. Speech intelligibility, vowel space, global pitch variation and speech rate increased as a result of the treatment. 
Thus, some differences between Mandarin and English dysarthria and effects of cueing might be present, but as for English speakers, intensive treatment (specifically LSVT LOUD) focusing on increasing vocal intensity shows promise for increasing intelligibility and quality of life in Mandarin speakers with hypokinetic dysarthria. Future studies should include a larger number of participants and probe the effects of behavioral speech modifications and intensive voice treatment on lexical tone, and consider which physiological mechanisms might be associated with production of lexical tone, given that lexical tone is often crucial to differentiating word meaning in Mandarin.
68

A Study of the Correlation between the Articulation Competence Index (ACI) and the Percentage of Words Understood in the Continuous Speech of 4- and 5-year-olds of Varying Phonological Competence

Mitchell, Susan Coll 10 June 1996
Intelligibility refers to how recognizable a speaker's words are to the listener. Severity, a broader but closely related concept, incorporates intelligibility, disability, and handicap. Many factors influence intelligibility, including speech sound production, voice, and prosody, as well as a number of linguistic and contextual factors. Clinicians and researchers in the field of speech-language pathology require accurate measures of intelligibility and severity to assess and describe communicative functioning and to measure change over time. Determining the most accurate and efficient measurement approaches has been the focus of recent attention in the field. This study was a preliminary investigation of the relationship between the Articulation Competence Index (ACI), a severity metric, and the percentage of words understood in continuous speech, the standard measure of intelligibility. Specifically, the study addressed the research question: Is there a significant correlation between the Articulation Competence Index (ACI) and percentage of words understood in samples of continuous speech of 4- and 5-year-olds with varying levels of phonological competence? Subjects were thirty 4- and 5-year-olds from the Portland metropolitan area. Four listeners calculated percentage-of-words scores for each child's 100-word speech sample. These scores were compared to ACI scores calculated by the investigator for each of the samples. The data were analyzed using the Pearson product-moment correlation (Pearson r). A moderately strong correlation (r = .71 to .81) was found between the ACI and percentage of words understood. Squaring the correlation coefficients resulted in values for r² of .50 to .66, indicating that the ACI accounts for more than half the variability of continuous speech intelligibility.
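The link between the reported correlations and the "more than half the variability" statement is the squared correlation coefficient. The following sketch, which is illustrative and not taken from the thesis, computes a Pearson product-moment correlation for paired scores and squares it; the function names and the sample scores are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


def variance_explained(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r * r


if __name__ == "__main__":
    # Hypothetical severity scores and percentages of words understood.
    severity_scores = [55.0, 62.0, 70.0, 81.0, 90.0]
    percent_understood = [60.0, 58.0, 75.0, 80.0, 93.0]
    r = pearson_r(severity_scores, percent_understood)
    print(round(r, 2), round(variance_explained(r), 2))

    # The abstract's range of coefficients, squared.
    print(round(variance_explained(0.71), 2), round(variance_explained(0.81), 2))
```

Squaring the endpoints of the reported range reproduces the abstract's figures: .71 squared is about .50 and .81 squared is about .66.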
69

A Pilot Study: Normative Data on the Intelligibility of 3 1/2 Year Old Children

Ware, Karen Mary 05 November 1996
Most of the previously published research involving intelligibility has focused on persons with various disabilities or delays. Minimal research has been conducted on intelligibility in young children with no diagnosed speech and/or language disorders. The result is a gap in normative data by which to set a standard to judge speech as being at an acceptable level of intelligibility for a particular age group. The focus of this pilot study was to collect normative data on the intelligibility of young children, ages 3:6 ±2 months, with no diagnosed speech and/or language disorder. Thirteen subjects, ages 3:6 ±2 months, were recruited from the greater Portland/Vancouver area. These subjects were screened for normal development in speech sound production, expressive/receptive language, and hearing. It was also established that English was the primary language spoken in the home. Resonance, voice quality, and fluency were informally assessed by the researcher during the course of the session and found to be normal. The 100-word speech samples were collected by the researcher on audiotape and later played back to two listeners, who were familiar with the topic but unfamiliar with the speaker. The listeners orthographically transcribed the samples and a comparison was made by the researcher between the two sets of written transcriptions. This comparison provided the percentage of intelligible words, out of a possible 100, which were understood by both listeners. The results showed the mean intelligibility percentage for 3 1/2-year-old children with no diagnosed speech and/or language disorders to be 88% (SD = 5.7%) with a range of intelligibility from 76% to 96%. Both the mode and the median for this sample were 90%. Several other variables were addressed as points of interest but the comparisons were not investigated in depth. The focus of this study was to collect, in a methodically documented manner, normative data on intelligibility in 3 1/2-year-olds. 
When the results from this study are compared to the only other available data (Weiss, 1982), they were found to fall within 1 SD of each other (SD = 5.7%), indicating that there are no measurable differences between the findings.
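The scoring procedure described in this abstract (words understood by both listeners, out of the words in the sample, then summarized across children) can be sketched as below. This is a minimal illustration under stated assumptions, not the study's procedure: each listener's transcription is assumed to have already been reduced to one true/false flag per target word, and the function names and example values are hypothetical rather than data from the thesis.

```python
import statistics


def intelligibility_percent(listener_a, listener_b):
    """Percentage of target words understood by BOTH listeners.

    Each argument is a list of booleans, one per target word, saying
    whether that listener transcribed the word correctly (an assumed
    preprocessing step for this sketch).
    """
    if len(listener_a) != len(listener_b) or not listener_a:
        raise ValueError("need two equally long, non-empty lists of flags")
    both = sum(1 for a, b in zip(listener_a, listener_b) if a and b)
    return 100.0 * both / len(listener_a)


def summarize(percentages):
    """Mean, sample standard deviation, and range across speakers."""
    return {
        "mean": statistics.mean(percentages),
        "sd": statistics.stdev(percentages),
        "min": min(percentages),
        "max": max(percentages),
    }


if __name__ == "__main__":
    # Toy 4-word sample: only words 1 and 4 were understood by both.
    print(intelligibility_percent([True, True, False, True],
                                  [True, False, False, True]))

    # Made-up per-child percentages, for illustration only.
    print(summarize([80.0, 90.0, 100.0]))
```

Requiring agreement from both listeners makes the score conservative: a word one listener misses does not count, even if the other listener understood it.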
70

The Effects of Phonological Processes on the Speech Intelligibility of Young Children

Shotola-Hardt, Susanne 20 October 1994
The purpose of this study was to explore the relationship between the occurrence of 10 phonological processes, singly and in groups, and the mean percentage of intelligibility of connected speech samples. Participants in the study included 4 adult listeners (3 females, 1 male) and 46 speakers aged 48 to 66 months (16 females, 30 males). Percentage-of-occurrence scores for phonological processes (independent variables) were obtained by the administration of The Assessment of Phonological Processes - Revised (Hodson, 1986). Percentage-of-intelligibility scores for 100-word connected speech samples (dependent variable) were obtained by orthographic transcription (words understood divided by 100). The single processes showing the strongest negative correlation with intelligibility of connected speech included consonant sequence omission, glide class deficiency, syllable omission, and velar class deficiency, with reliability beyond the .001 level. The combination of consonant sequence omission, syllable omission, nasal class deficiency, and velar class deficiency accounted for 83% of the variance in the dependent variable. In this equation, consonant sequence omission alone accounted for 70% of the variance. Significance is beyond the .05 level for these measures. Results of the study lead to the recommendation that the following phonological processes are high priority targets for remediation: consonant sequence omission, syllable reduction, glide class deficiency, and velar class deficiency.
