31 |
Effects of noise reduction on speech intelligibility / Les effets des réducteurs de bruit sur l’intelligibilité de la parole
Hilkhuysen, Gaston. 22 September 2017
Speech is often perceived in the presence of other sounds. At times the interfering sounds can reach such high levels that the speech becomes unintelligible. Speech enhancement methods attempt to reduce the audibility of the interfering sounds, but little is known about their influence on intelligibility.
This thesis explores the effects of speech enhancement methods, also known as noise suppression algorithms, on speech intelligibility. After a short introduction to speech enhancement and intelligibility, three studies consider the effects from an empirical perspective. It is shown that noise suppression tends to reduce intelligibility and that its effect is mostly constant across a broad range of noise levels. When experts were asked to apply a commercial noise suppressor to optimise intelligibility, they proposed settings that degraded intelligibility. Laypeople successfully identified an increase in intelligibility resulting from speech enhancement. Three subsequent studies attempt to identify the signal properties that are generated by speech enhancement and are responsible for the intelligibility effects. Physical metrics based on various signal properties were used to estimate the intelligibility of the speech-enhanced noisy signal. Most metrics provided unreliable or biased estimates of absolute intelligibility. Some could nevertheless be used to adjust speech enhancers such that intelligibility is optimal.
|
32 |
A Correlational Study: The 1-Minute Measure of Homonymy and Intelligibility
Day, Tamra Leanne. 06 June 1995
Identifying the severity level of unintelligibility objectively and efficiently holds critical clinical implications for speech assessment and intervention needs. The speech of children who demonstrate phonological deviations is frequently unintelligible. An accurate and time-efficient measurement of intelligibility is necessary to screen children who may be producing phonological patterns that contribute to significantly reduced intelligibility in connected speech. The purpose of this study was to investigate the degree of concurrent validity between scores received on the 1-Minute Measure of Homonymy and Intelligibility (Hodson, 1992) and speech intelligibility as measured by the percent of words understood in connected speech. For this investigation, intelligibility is operationally defined as the percent of words understood in a connected speech sample derived from orthographic transcription. Data were collected from 48 children, aged 4:0 to 5:6, who demonstrated varying levels of phonological proficiency/deficiency. A group of four listeners who had experience treating children with phonological disorders completed orthographic transcriptions of the 48 connected speech samples. The two methods of assessing speech intelligibility investigated in this study were found to correlate highly (r = .84). This is considered a statistically significant correlation, and therefore the 1-Minute Measure may be used to provide speech-language pathologists with valuable information to predict a child's intelligibility level in connected speech. A regression formula was employed to predict percentage of intelligibility when presented with a child's 1-Minute Measure score.
Results from this correlational study suggest that the 1-Minute Measure of Homonymy and Intelligibility may serve as an assessment tool that can provide a speech-language pathologist with some valuable information pertaining to a child's level of intelligibility in connected speech. When used with another speech assessment tool, the 1-Minute Measure may function as a screening measure to identify preschoolers who produce phonological deviations that interfere with intelligibility of conversational speech.
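The abstract above reports a correlation (r = .84) and a regression formula but does not give the formula itself. As a minimal sketch of the general technique only, the following computes a Pearson correlation and fits an ordinary least-squares line that predicts percent of words understood from a screening score. The function names and the sample numbers are invented for illustration and are not data or coefficients from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(score, slope, intercept):
    """Predicted percent of words understood for a given screening score."""
    return slope * score + intercept


# Hypothetical screening scores and percent-words-understood values
# (invented for this sketch; NOT taken from the thesis).
scores = [3, 6, 9, 12, 15, 18]
percent_understood = [42.0, 55.0, 61.0, 70.0, 84.0, 90.0]

r = pearson_r(scores, percent_understood)
slope, intercept = fit_line(scores, percent_understood)
print(f"r = {r:.2f}")
print(f"predicted percent for score 10: {predict(10, slope, intercept):.1f}")
```

In a screening setting, a fitted line of this kind would be applied to a new child's score to obtain a rough intelligibility estimate, which is the use the abstract describes for its regression formula.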
|
33 |
Speech intelligibility in ALS and HD dysarthria: everyday listener perspectives of barriers and strategies
Klasner, Estelle R. January 2003
Thesis (Ph. D.)--University of Washington, 2003. / Vita. Includes bibliographical references (leaves 77-84).
|
34 |
A preliminary study of the frequency importance function of Cantonese sentences
Ho, Shun-yee, Amy, 何舜儀. January 2002
Speech and Hearing Sciences / Master / Master of Science in Audiology
|
35 |
Speech recognition predictability of a Cantonese speech intelligibility index
Chua, W. W., 蔡蕙慧. January 2004
Speech and Hearing Sciences / Master / Master of Science in Audiology
|
36 |
Emotion in Speech: Recognition by Younger and Older Adults and Effects on Intelligibility
Dupuis, Katherine Lise. 06 January 2012
Spoken language conveys two forms of information: transactional (content, what is said) and interactional (how it is said). The transactional message shared during spoken communication has been studied extensively in different listening conditions and in people of all ages using standardized tests of speech intelligibility. However, research into interactional aspects of speech has been more limited. One specific aspect of interactional communication that warrants further investigation is the communication of emotion in speech, also called affective prosody.
A series of experiments examined how younger and older adults produce affective prosody, recognize emotion in speech, and understand emotional speech in noise. The emotional valence and arousal properties of target words from an existing speech intelligibility test were rated by younger and older adults. New stimuli based on those words were recorded by a younger female and an older female using affective prosody to portray seven emotions (anger, disgust, fear, happiness, pleasant surprise, sadness, neutral). Similar to previous studies, the acoustical parameter that best differentiated the emotions was fundamental frequency (F0). Specifically, discriminant analysis indicated that emotional category membership was best predicted by the mean and range of F0.
Overall, recognition of emotion and intelligibility were high. While older listeners made more recognition errors and had poorer intelligibility overall, their patterns of responding did not differ significantly from those of the younger listeners on either measure. Of note, angry and sad emotions were recognized with the highest degree of accuracy, but intelligibility was highest for items spoken to portray fear or pleasant surprise. These results may suggest that there is a complementarity between the acoustic cues used to recognize emotions (how words are said) and those used to understand words (what is said). Alternatively, the effect of emotion on intelligibility may be modulated primarily by attentional rather than acoustical factors, with higher performance associated with alerting emotions.
|
37 |
Objective Assessment of Dysarthric Speech Intelligibility
Hummel, Richard. 28 September 2011
The de facto standard for dysarthric intelligibility assessment is a subjective intelligibility test, performed by an expert. Subjective tests are often costly, biased and inconsistent because of their perceptual nature. Automatic objective assessment methods, in contrast, are repeatable and relatively cheap. Objective methods can be broken down into two subcategories: reference-free and reference-based. Reference-free methods employ estimation procedures that do not require information about the target speech material. This potentially makes the problem more difficult, and consequently there has been little research into reference-free dysarthric intelligibility estimation.
In this thesis, we focus on the reference-free intelligibility estimation approach. To make the problem more tractable, we focus on the dysarthrias of cerebral palsy (CP). First, a popular standard for blind speech quality estimation, the ITU-T P.563 standard, is examined for possible application to dysarthric intelligibility estimation. The internal structure of the standard is discussed, along with the relevance of its internal features to intelligibility estimation. Afterwards, several novel features expected to relate to some of the acoustic properties of dysarthric speech are proposed. The proposed features are based on the high-order statistics of parameters derived from linear prediction (LP) analysis and from a mel-frequency filterbank.
In order to gauge the complementarity of the P.563 features and the proposed features, a linear intelligibility model is proposed and tested. Intelligibility is expressed as a linear combination of acoustic features, which are selected from a feature pool using speaker-dependent and speaker-independent validation methods. An intelligibility estimator constructed with only P.563 features serves as the baseline. When the proposed features are added to the feature pool, performance is shown to improve substantially over the baseline for both speaker-dependent and speaker-independent methods. Results are also shown to compare favourably with those reported in the literature. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2011-09-28 18:44:51.103
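The statement that intelligibility is expressed as a linear combination of acoustic features can be written compactly. The expression below is only a sketch of that general form: the symbols and the inclusion of an intercept term are assumptions made here for illustration, not notation taken from the thesis.

```latex
\hat{I} = w_0 + \sum_{k=1}^{K} w_k \, x_k
```

Here \hat{I} stands for the estimated intelligibility, x_1, ..., x_K for the K acoustic features selected from the feature pool, and w_0, ..., w_K for weights that would be fitted to reference intelligibility scores.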
|
38 |
Acoustic and Perceptual Evaluation of the Quality of Radio-Transmitted Speech
Kirtikar, Shantanu Sanatkumar. January 2010
Aim
When speech signals are transmitted via radio, the process of transmission may add noise to the signal of interest. This study uses a combined acoustic and perceptual approach to examine the effect of radio transmission on the quality of the transmitted speech signals.
Method
A standard acoustic recording of the Phonetically Balanced Kindergarten (PBK) word list read by a male speaker was played back in three conditions: one without radio transmission and two with different types of radio transmission. The vowel segments (/i, a, o, u/) embedded in the original and the re-recorded signals were analysed to yield measures of the frequencies of the first two formants (F1 and F2), the amplitude difference between the first two harmonics (H1-H2), and the singing power ratio (SPR). Other measures included Spectral Moment One (mean), Spectral Moment Two (variance), and the energy ratio between consonant and vowel (CV energy ratio). To examine how H1-H2 and SPR were related to the perception of vowel intelligibility and clarity, vowels at five levels of each of these two measures were selected as stimuli in the perceptual study. The auditory stimuli were presented to 20 normal-hearing listeners, 10 males and 10 females aged 21 to 42 years. The listeners were asked to identify the vowel for each stimulus in the vowel identification task and to judge which vowel of a contrast pair sounded “clearer” in the clarity discrimination task. A follow-up study using vowel stimuli with a constant length and five H1-H2 or five SPR levels was conducted on five listeners to determine the relationship between the perception of speech clarity and H1-H2 or SPR.
Results
Results from a series of one-way or two-way analyses of variance (ANOVAs) or ANOVAs on Ranks, with post-hoc tests, revealed that radio transmission had a significant effect on all of the selected acoustic measures except the CV energy ratio. Signal degeneration due to radio transmission is characterized by changes of F1 or F2 frequencies toward a more compressed vowel space, an H1-H2 value indicating an increase of H1 dominance, an SPR value suggestive of an increase in the energy around the 2-4 kHz region, and a loss of differentiation between /s/ and /sh/ on the measures of Spectral Moments One and Two. Vowel duration was also found to play a major role in affecting the perception of vowel intelligibility and clarity. The follow-up study, with a control on vowel duration, found that SPR played a role in affecting the perception of vowel intelligibility and clarity.
Conclusion
It was concluded from the findings that measures of the energy ratio between different frequency regions, as well as the frequencies of the first two formants, were sensitive in detecting the effect of radio transmission.
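The Method section names Spectral Moment One (mean) and Spectral Moment Two (variance) without describing how they were computed. As a minimal sketch, assuming the moments are obtained by treating the normalized power spectrum as a probability distribution over frequency, they could be calculated as follows. The function name and the example spectrum are hypothetical and are not taken from the thesis.

```python
def spectral_moments(freqs_hz, power):
    """Return (mean, variance) of a spectrum treated as a distribution.

    freqs_hz: centre frequency of each spectral bin, in Hz.
    power:    non-negative power in each bin (same length as freqs_hz).
    Assumption: moments are weighted by power; the thesis may have
    used a different weighting (for example, amplitude).
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    # First moment: power-weighted mean frequency.
    mean = sum(f * p for f, p in zip(freqs_hz, power)) / total
    # Second central moment: power-weighted spread around the mean.
    variance = sum(((f - mean) ** 2) * p for f, p in zip(freqs_hz, power)) / total
    return mean, variance


# Hypothetical fricative-like spectrum (invented values).
freqs = [2000.0, 4000.0, 6000.0, 8000.0]
spectrum = [0.1, 0.4, 0.4, 0.1]
mean_hz, var_hz2 = spectral_moments(freqs, spectrum)
print(f"moment one (mean): {mean_hz:.0f} Hz")
print(f"moment two (variance): {var_hz2:.0f} Hz^2")
```

Under this assumption, two fricatives whose energy is concentrated in different frequency regions would yield different first moments, so a loss of differentiation on these measures would show up as the two values moving closer together.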
|
39 |
The effects of speech patterns on listening comprehension
Rogers, Minnie M. January 1972
This study was undertaken in an effort to determine the effect of compensatory education on achievement and the self concepts of students in inner city schools. The subjects for this study were chosen from the third, fourth, and fifth grades of the Lincoln, Longfellow, Blaine, and Garfield public elementary schools of Muncie, Indiana. The experimental group received compensatory treatment which consisted of remedial reading, tutorial aid, and counseling, while the control group received the standard type of education given by the schools involved in the study. Both groups were selected by classroom teachers on the basis of personal judgment with no specific criteria given for the selection.
Academic achievement was measured by the results of the Iowa Basic Achievement Test. This test was given twice (pre- and post-test) to both the control and experimental groups in grades four and five. Grade three had been administered the Metropolitan Achievement Test (MAT) as a pre-test the previous spring at the end of grade two. Grade three was tested by the Iowa Basic Achievement Test in a post-test in the spring of 1973.
Self concept was measured by the results of the test by Waetjen and Liddle, Self Concept as a Learner (SCAL). This test was given twice to both the control and experimental groups: the pre-test in the fall of 1972 and the post-test in the spring of 1973.
The results were used to evaluate the eight basic hypotheses. Statistical analysis of the results led to rejecting only one hypothesis. Hypothesis 7 was rejected at the .05 level of significance.
In general, any gains shown by the experimental group over the control group were of small statistical magnitude, whether in the area of academic achievement, reading achievement or self concept. The same may be said of any of the differences between the various schools, grades, and class groups.
No strong relationship between compensatory education and the probability of success could be clearly established from the data. A strong relationship was established, however, between compensatory education and the probability of success for grade three. Since the impact of counseling, tutorial aid, or remedial reading was not analyzed separately, this relationship was attributed to the compensatory treatment as a whole and not specifically to any one part of the program.
|