Phoneme-based topic spotting on the Switchboard Corpus

Thesis (MScEng)--Stellenbosch University, 2002.

ENGLISH ABSTRACT: The field of topic spotting in conversational speech deals with the problem of identifying
"interesting" conversations or speech extracts contained within large volumes of speech
data. Typical applications of the technology include the surveillance
and screening of messages before they are referred to human operators. Closely related methods
can also be used for data-mining of multimedia databases, literature searches, language
identification, call routing and message prioritisation.

The first topic spotting systems used words as the most basic units. However, because of the
poor performance of speech recognisers, a large amount of topic-specific hand-transcribed
training data is needed. It is for this reason that researchers started concentrating on methods
using phonemes instead, because the errors then occur on smaller, and therefore less
important, units. Phoneme-based methods consequently make it feasible to use computer-generated
transcriptions as training data.

Building on word-based methods, a number of phoneme-based systems have emerged.
The two most promising ones are the Euclidean Nearest Wrong Neighbours (ENWN) algorithm
and the newly developed Stochastic Method for the Automatic Recognition of
Topics (SMART). Previous experiments on the Oregon Graduate Institute of Science and
Technology's Multi-Language Telephone Speech Corpus suggested that SMART yields a
large improvement over ENWN, which had itself outperformed competing phoneme-based systems
in evaluations. However, the small amount of data available for these experiments meant
that more rigorous testing was required.

In this research, the algorithms were therefore re-implemented to run on the much larger
Switchboard Corpus. A substantial improvement of SMART over ENWN
was subsequently observed, confirming the previously obtained result. In addition,
an investigation was conducted into improving SMART. This resulted in a new
counting strategy with a corresponding improvement in performance.

AFRIKAANSE OPSOMMING (Afrikaans summary, translated): The field of topic recognition in speech deals with the problem of identifying "interesting"
conversations or speech segments among large volumes of speech data.
The technology is typically used to process conversations before they are referred to
human operators. Related methods can also be used for data mining
in multimedia databases, literature searches, language identification, call routing
and message prioritisation.

The first topic spotters were word-based, but because of the poor results
obtained with speech recognisers, large amounts of hand-transcribed
data are needed to train such systems. It is for this reason that researchers currently prefer
phoneme-based approaches, since the errors occur on smaller, and thus less important,
units. Phoneme-based methods therefore make it possible to use computer-generated
transcriptions as training data.

Several phoneme-based systems have appeared by building on word-based
methods. The two most promising systems are the Euclidean Nearest Wrong Neighbours
(ENWN) algorithm and the new Stochastic Method for the Automatic Recognition of
Topics (SMART). Previous experiments on the Oregon Graduate Institute of Science and
Technology's Multi-Language Telephone Speech Corpus indicated that the SMART
algorithm performs better than the ENWN system, which had beaten other phoneme-based
algorithms. The fact that too little data was available during the experiments
indicated that more rigorous tests were needed.

During this research the algorithms were therefore re-implemented so that experiments
could be carried out on the Switchboard Corpus. It was subsequently observed that
SMART delivers considerably better results than ENWN, thereby confirming the validity of the
previous results. In addition, an investigation was launched to try to
improve SMART. This led to a new counting strategy with an accompanying improvement
in results.

http://hdl.handle.net/10019.1/52998

Automatic speech recognition

Speech processing systems

Pattern recognition systems

Dissertations -- Electronic engineering

Theses -- Electronic engineering

Identifier	oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/52998
Date	04 2002
Creators	Theunissen, M. W. (Marthinus Wilhelmus)
Contributors	Du Preez, J. A., Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering.
Publisher	Stellenbosch : Stellenbosch University
Source Sets	South African National ETD Portal
Language	en_ZA
Detected Language	English
Type	Thesis
Format	115 p. : ill.
Rights	Stellenbosch University
