1 |
The Effect of Nonlinear Source-Filter Interaction on Aerodynamic Measures in a Synthetic Model of the Vocal Folds and Vocal Tract
May, Nicholas A. 01 June 2022
No description available.
|
2 |
The Voice Source in Speech Communication - Production and Perception Experiments Involving Inverse Filtering and Synthesis
Gobl, Christer January 2003
This thesis explores, through a number of production and perception studies, the nature of the voice source signal and how it varies in spoken communication. Research is also presented that deals with the techniques and methodologies for analysing and synthesising the voice source. The main analytic technique involves interactive inverse filtering for obtaining the source signal, which is then parameterised to permit the quantification of source characteristics. The parameterisation is carried out by means of model matching, using the four-parameter LF model of differentiated glottal flow.
The first three analytic studies focus on segmental and suprasegmental determinants of source variation. As part of the prosodic variation of utterances, focal stress shows for the glottal excitation an enhancement between the stressed vowel and the surrounding consonants. At a segmental level, the voice source characteristics of a vowel show potentially major differences as a function of the voiced/voiceless nature of an adjacent stop. Cross-language differences in the extent and directionality of the observed effects suggest different underlying control strategies in terms of the timing of the laryngeal and supralaryngeal gestures, as well as in the laryngeal tension settings. Different classes of voiced consonants also show differences in source characteristics: here the differences are likely to be passive consequences of the aerodynamic conditions that are inherent to the consonants. Two further analytic studies present voice source correlates for six different voice qualities as defined by Laver's classification system. Data from stressed and unstressed contexts clearly show that the transformation from one voice quality to another does not simply involve global changes of the source parameters. As well as providing insights into these aspects of speech production, the analytic studies provide quantitative measures useful in technology applications, particularly in speech synthesis.
The perceptual experiments use the LF source implementation in the KLSYN88 synthesiser to test some of the analytic results and to harness them to explore the paralinguistic dimension of speech communication. A study of the perceptual salience of different parameters associated with breathy voice indicates that the source spectral slope is critically important and that, surprisingly, aspiration noise contributes relatively little. Further perceptual tests using stimuli with different voice qualities explore the mapping between voice quality and its paralinguistic function of expressing emotion, mood and attitude. The results of these studies highlight the crucial role of voice quality in expressing affect as well as providing pointers to how it combines with f0 for this purpose.
The last section of the thesis focuses on the techniques used for the analysis and synthesis of the source. A semi-automatic method for inverse filtering is presented, which is novel in that it optimises the inverse filter by exploiting the knowledge that is typically used by the experimenter when carrying out manual interactive inverse filtering. A further study looks at the properties of the modified LF model in the KLSYN88 synthesiser: it highlights how it differs from the standard LF model and discusses the implications for synthesising the glottal source signal from LF model data. Effective and robust source parameterisation for the analysis of voice quality is the topic of the final paper: the effectiveness of global, amplitude-based source parameters is examined across speech tokens with large differences in f0. Additional amplitude-based parameters are proposed to enable a more detailed characterisation of the glottal pulse.
Keywords: Voice source dynamics, glottal source parameters, source-filter interaction, voice quality, phonation, perception, affect, emotion, mood, attitude, paralinguistic, inverse filtering, knowledge-based, formant synthesis, LF model, fundamental frequency, f0.
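The inverse filtering mentioned in this abstract can be illustrated in its most basic form. The sketch below is not the thesis's interactive or semi-automatic method: it assumes a single formant whose frequency and bandwidth are already known, and all names and values (`resonator_coeffs`, `all_pole_filter`, `inverse_filter`, the 500 Hz formant, the impulse-train source) are hypothetical choices made for illustration. It only shows that an all-pole vocal tract filter can be undone by an FIR filter with the same coefficients.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, fs_hz):
    """Denominator [1, a1, a2] of a two-pole resonator modelling one formant."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle from frequency
    return [1.0, -2.0 * r * math.cos(theta), r * r]

def all_pole_filter(x, a):
    """Synthesis step: y[n] = x[n] - a1*y[n-1] - a2*y[n-2]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def inverse_filter(y, a):
    """Analysis step: x[n] = y[n] + a1*y[n-1] + a2*y[n-2] (FIR, same coefficients)."""
    x = []
    for n in range(len(y)):
        acc = 0.0
        for k in range(len(a)):
            if n - k >= 0:
                acc += a[k] * y[n - k]
        x.append(acc)
    return x

fs = 16000                      # assumed sampling rate in Hz
period = fs // 100              # toy source: impulse train at 100 Hz
source = [1.0 if n % period == 0 else 0.0 for n in range(800)]

a = resonator_coeffs(500.0, 80.0, fs)   # assumed formant: 500 Hz, 80 Hz bandwidth
speech = all_pole_filter(source, a)     # "speech" = source shaped by the formant
recovered = inverse_filter(speech, a)   # inverse filtering recovers the source

max_err = max(abs(s - r) for s, r in zip(source, recovered))
```

In practice the filter is not known in advance and has to be estimated from the speech signal, which is the hard part that interactive and semi-automatic inverse filtering address.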
|
3 |
Nonlinear Interactive Source-filter Model For Voiced Speech
Koc, Turgay 01 October 2012
The linear source-filter model (LSFM) has been used as a primary model for speech processing
since 1960, when G. Fant presented the acoustic theory of speech production. It assumes
that the source of voiced speech sounds, the glottal flow, is independent of the filter, the vocal tract.
However, acoustic simulations based on physical speech production models show that,
especially when the fundamental frequency (F0) of the source approaches the first
formant frequency (F1) of the vocal tract filter, the filter has significant effects on the source due
to the nonlinear coupling between them. In this thesis, as an alternative to the linear source-filter
model, nonlinear interactive source-filter models (ISFMs) are proposed for voiced speech.
This thesis has two parts. In the first part, a framework for the coupling of the source and the
filter is presented. Then, two interactive system models are proposed, assuming that the glottal
flow is a quasi-steady Bernoulli flow and that the acoustics in the vocal tract are linear. In these models,
glottal area, instead of glottal flow, is used as the source for voiced speech. In the proposed interactive
models, the relation between the glottal flow, the glottal area and the vocal tract is determined
by the quasi-steady Bernoulli flow equation. It is shown theoretically that the linear source-filter
model is an approximation of the nonlinear models. Estimating the ISFMs' parameters from the
speech signal alone is a nonlinear blind deconvolution problem. The problem is solved by a
robust method developed from the acoustical interpretation of the systems. Experimental
results show that the ISFMs produce the source-filter coupling effects seen in the physical simulations
and that the parameter estimation method always produces stable models that perform better
than the LSFM. In addition, a framework for incorporating source-filter
interaction into the classical source-filter model is presented. The Rosenberg source model is extended
to an interactive source for voiced speech, and its performance is evaluated on a large
speech database. The results of the experiments conducted on vowels in the database show
that the interactive Rosenberg model is always better than its noninteractive version.
In the second part of the thesis, the LSFM and the ISFMs are compared by using not only the speech
signal but also high-speed endoscopic video (HSV) of the vocal folds in a system identification
approach. In this case, HSV and speech are used as reference input-output data for
the analysis and comparison of the models. First, a new robust HSV processing algorithm is
developed and applied to HSV images to extract the glottal area. Then, system parameters
are estimated by using a modified version of the method proposed in the first part. The experimental
results show that the speech signal can contain harmonics of the fundamental
frequency of the glottal area other than those contained in the glottal area signal. The proposed
nonlinear interactive source-filter models can generate harmonic components in speech and
produce more realistic speech sounds than the LSFM.
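To make the idea of a quasi-steady Bernoulli coupling concrete, here is a toy numerical sketch. It is not the thesis's model: the vocal tract is replaced by a purely resistive load (an assumption made only so the flow has a closed-form solution), and every name and value (`RHO`, `glottal_area`, `coupled_flow`, the 800 Pa subglottal pressure, the load resistance) is hypothetical. With no load the flow is simply proportional to the glottal area, which corresponds to the linear source-filter picture; with a load the flow becomes a nonlinear function of the area.

```python
import math

RHO = 1.2  # assumed air density in kg/m^3

def glottal_area(n, period, open_len, max_area):
    """Toy glottal area: half-sine while open, zero while closed (m^2)."""
    phase = n % period
    if phase < open_len:
        return max_area * math.sin(math.pi * phase / open_len)
    return 0.0

def uncoupled_flow(area, p_sub):
    """Bernoulli flow with no vocal tract pressure: u = A * sqrt(2 * p_sub / rho)."""
    return area * math.sqrt(2.0 * p_sub / RHO)

def coupled_flow(area, p_sub, load_resistance):
    """Bernoulli flow against a toy resistive load, p_vt = R * u.

    Solves u = A * sqrt(2 * (p_sub - R * u) / rho) for u >= 0, which is the
    positive root of u^2 + 2*b*u - c = 0 with b = A^2 * R / rho and
    c = 2 * A^2 * p_sub / rho.
    """
    b = area * area * load_resistance / RHO
    c = 2.0 * area * area * p_sub / RHO
    return -b + math.sqrt(b * b + c)

P_SUB = 800.0      # assumed subglottal pressure in Pa
R_LOAD = 5.0e5     # assumed load resistance in Pa*s/m^3 (illustrative only)

areas = [glottal_area(n, 160, 96, 1.0e-5) for n in range(320)]
u_lin = [uncoupled_flow(a, P_SUB) for a in areas]
u_int = [coupled_flow(a, P_SUB, R_LOAD) for a in areas]

# Ratio of coupled to uncoupled flow just after opening and at peak area.
ratio_small = u_int[1] / u_lin[1]
ratio_peak = u_int[48] / u_lin[48]
```

Because the ratio of coupled to uncoupled flow changes with the area, the coupled flow pulse is not a scaled copy of the area pulse. A frequency-dependent vocal tract load, as in the thesis, would require solving the flow together with the tract acoustics rather than using this closed form.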
|