abstract: Speech is generated by articulators acting on
a phonatory source. Identifying this
phonatory source and identifying the articulatory
geometry are individually challenging, ill-posed
problems, called speech separation and
articulatory inversion, respectively.
There exists a trade-off
between decomposition and recovered
articulatory geometry due to multiple
possible mappings between an
articulatory configuration
and the speech produced. Moreover, measurements
obtained only from a microphone sensor
lack any invasive insight, which adds a
further challenge to an already difficult
problem.
A joint non-invasive estimation
strategy that couples articulatory and
phonatory knowledge would lead to better
articulatory speech synthesis. In this thesis,
a joint estimation strategy for speech
separation and articulatory geometry recovery
is studied. Unlike previous
periodic/aperiodic decomposition methods that
use stationary speech models within a
frame, the proposed method performs a
non-stationary speech decomposition.
A parametric glottal source model and an
articulatory vocal tract response are
represented in a dynamic state space formulation.
The unknown parameters of the
speech generation components are estimated
using sequential Monte Carlo methods
under some specific assumptions.
The proposed approach is compared with other
glottal inverse filtering methods,
including iterative adaptive inverse filtering,
state-space inverse filtering, and
the quasi-closed phase method.
Dissertation/Thesis / Masters Thesis Electrical Engineering 2018
Identifier | oai:union.ndltd.org:asu.edu/item:51762 |
Date | January 2018 |
Contributors | Venkataramani, Adarsh Akkshai (Author), Papandreou-Suppappola, Antonia (Advisor), Bliss, Daniel W (Committee member), Turaga, Pavan (Committee member), Arizona State University (Publisher) |
Source Sets | Arizona State University |
Language | English |
Detected Language | English |
Type | Masters Thesis |
Format | 69 pages |
Rights | http://rightsstatements.org/vocab/InC/1.0/ |