Global ETD Search

Representing Time in Automated Speech Recognition

This thesis explores the treatment of temporal information in Automated Speech Recognition. It reviews the study of time in speech perception and concludes that, while some temporal information in the speech signal is of crucial value in the speech decoding process, not all temporal information is relevant to decoding. We then review the representation of temporal information in the main automated recognition techniques: Hidden Markov Models and Artificial Neural Networks. We find that both techniques have difficulty representing the type of temporal information that is phonetically or phonologically significant in the speech signal.

In an attempt to improve this situation, we explore how temporal information is represented in the acoustic vectors commonly used to encode the acoustic speech signal in the front-ends of speech recognition systems. We attempt, where possible, to let the signal provide the temporal structure rather than imposing a fixed, clock-based timing framework. We develop a novel acoustic temporal parameter, the Parameter Similarity Length, which is a measure of temporal stability, and test it against the time derivatives of acoustic parameters conventionally used in acoustic vectors.

http://thesis.anu.edu.au./public/adt-ANU20040602.163031

Automated Speech Recognition

Source Synchronous Analysis

Temporal Representation in Speech

Identifier	oai:union.ndltd.org:ADTP/216754
Date	January 2003
Creators	Davies, David Richard Llewellyn, dave.davies@canberra.edu.au
Publisher	The Australian National University. Research School of Information Sciences and Engineering
Source Sets	Australasian Digital Theses Program
Language	English
Detected Language	English
Rights	http://www.anu.edu.au/legal/copyrit.html, Copyright David Richard Llewellyn Davies
