Prosodic properties of formality in spoken Japanese

This thesis investigates the relationship between prosody and formality in spoken Japanese, from the standpoints of both speech production and perception. The previous literature on this topic has often produced inconsistent or contradictory results (e.g. Loveday, 1981; Ofuka et al., 2000; Ito, 2001; Ito, 2002), and this thesis therefore seeks to address the research question of whether speakers and listeners use prosody in any predictable way when expressing or judging formality in spoken Japanese. Chapter 2 describes a pilot study which aimed to determine which prosodic variables were worth investigating in a larger corpus-based study. Speech of different levels of formality was elicited from subjects indirectly via the inclusion of indexical linguistic items in carrier sentences. Analysis of the relationship between mean f0 and duration shows a significant correlation with the categories of formal and informal speech, with both variables higher in informal speech. Consequently, in Chapter 3 f0 and articulation rate were analyzed in the corpus-based study. Corpus data for the study was collected via one-on-one conversations recorded at NINJAL in Tachikawa-shi, Japan. The speech data from the corpus was analyzed in order to test the hypothesis that the prosodic variables of mean f0, articulation rate, and f0 range would all be consistently higher in informal speech. Analysis using mixed effects models and a functional data analysis shows that all three prosodic variables are significantly higher in informal speech. These results were then used to inform the design of a speech perception study, which tested how manipulation of mean f0, articulation rate, and f0 range upwards or downwards affects listeners' judgments of de-lexicalized speech as formal or informal.
Results show that manipulating all three variables upwards or downwards leads listeners to judge recordings as more informal or more formal, respectively. However, manipulation of individual variables does not have a significant correlation with changes in listeners' judgments. This result led to the theory that categorization tasks in speech perception are probabilistic, with listeners accessing distributions of acoustic cues to the categories in order to make judgments. Chapter 5 of the thesis describes a probabilistic Bayesian model of formality, formulated on the basis of the theory of the cognitive process of category judgment described in Chapter 4, which attempts to predict a recording's level of formality based only on its prosody. Given information on the overall and speaker-specific distributions of the prosodic cues to the different levels of formality, the model is able to discriminate between categories at a rate better than chance (~63% accurate for formal speech, ~74% accurate for informal speech), performing better than human listeners, who could not predict formality from prosodic information alone at a rate above chance in the study in Chapter 4. The studies in this thesis show a consistent, significant relationship between prosody and formality in spoken Japanese in both speech production and perception, which can be modeled probabilistically using a Bayesian statistical framework.
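
As a rough illustration of the kind of probabilistic category judgment described above, the following Python sketch combines three prosodic cues under Bayes' rule. It is not the model from the thesis: the Gaussian cue distributions, the conditional-independence (naive Bayes) assumption, the equal priors, the use of speaker-normalised z-scores, and every numeric value and name (CUE_DISTRIBUTIONS, PRIORS, posterior) are hypothetical choices made only for this example.

```python
import math

# Hypothetical per-category cue distributions as (mean, standard deviation),
# in speaker-normalised z-score units. These values are invented for
# illustration and are NOT taken from the thesis.
CUE_DISTRIBUTIONS = {
    "formal": {
        "mean_f0": (-0.5, 1.0),
        "artic_rate": (-0.5, 1.0),
        "f0_range": (-0.5, 1.0),
    },
    "informal": {
        "mean_f0": (0.5, 1.0),
        "artic_rate": (0.5, 1.0),
        "f0_range": (0.5, 1.0),
    },
}

# Assumed equal prior probability for each category.
PRIORS = {"formal": 0.5, "informal": 0.5}


def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def posterior(cues):
    """Return P(category | cues), treating cues as conditionally independent.

    `cues` maps cue names to z-scored values; any missing cue is ignored.
    """
    log_joint = {}
    for category, dists in CUE_DISTRIBUTIONS.items():
        total = math.log(PRIORS[category])
        for cue, value in cues.items():
            mean, sd = dists[cue]
            total += gaussian_log_pdf(value, mean, sd)
        log_joint[category] = total

    # Normalise in a numerically stable way (subtract the maximum first).
    max_log = max(log_joint.values())
    unnormalised = {c: math.exp(v - max_log) for c, v in log_joint.items()}
    z = sum(unnormalised.values())
    return {c: u / z for c, u in unnormalised.items()}


# One raised cue versus all three raised together.
single = posterior({"mean_f0": 0.3})
combined = posterior({"mean_f0": 0.3, "artic_rate": 0.3, "f0_range": 0.3})
```

Under these assumed distributions, raising all three cues together shifts the posterior towards the informal category more strongly than raising one cue alone, which is at least compatible with the perception result reported above, although the thesis may account for that result differently.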

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:757736
Date: January 2017
Creators: Sherr-Ziarko, Ethan
Contributors: Coleman, John ; Kirby, James ; Lahiri, Aditi
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:85d42ec3-0cba-493d-bb8a-8edfe4f33d44
