Return to search

Intelligibility of synthetic speech in noise and reverberation

Synthetic speech is a valuable means of output, in a range of application contexts, for people with visual, cognitive, or other impairments or for situations were other means are not practicable. Noise and reverberation occur in many of these application contexts and are known to have devastating effects on the intelligibility of natural speech, yet very little was known about the effects on synthetic speech based on unit selection or hidden Markov models. In this thesis, we put forward an approach for assessing the intelligibility of synthetic and natural speech in noise, reverberation, or a combination of the two. The approach uses an experimental methodology consisting of Amazon Mechanical Turk, Matrix sentences, and noises that approximate the real-world, evaluated with generalized linear mixed models. The experimental methodologies were assessed against their traditional counterparts and were found to provide a number of additional benefits, whilst maintaining equivalent measures of relative performance. Subsequent experiments were carried out to establish the efficacy of the approach in measuring intelligibility in noise and then reverberation. Finally, the approach was applied to natural speech and the two synthetic speech systems in combinations of noise and reverberation. We have examine and report on the intelligibility of current synthesis systems in real-life noises and reverberation using techniques that bridge the gap between the audiology and speech synthesis communities and using Amazon Mechanical Turk. In the process, we establish Amazon Mechanical Turk and Matrix sentences as valuable tools in the assessment of synthetic speech intelligibility.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:688001
Date January 2015
CreatorsIsaac, Karl Bruce
ContributorsRenals, Stephen ; Wolters, Maria
PublisherUniversity of Edinburgh
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://hdl.handle.net/1842/15870

Page generated in 0.0026 seconds