Measuring, refining and calibrating speaker and language information extracted from speech

Thesis (PhD (Electrical and Electronic Engineering))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: We propose a new methodology, based on proper scoring rules, for the evaluation
of the goodness of pattern recognizers with probabilistic outputs. The
recognizers of interest take an input, known to belong to one of a discrete set
of classes, and output a calibrated likelihood for each class. This is a generalization
of the traditional use of proper scoring rules to evaluate the goodness
of probability distributions. A recognizer with outputs in well-calibrated probability
distribution form can be applied to make cost-effective Bayes decisions
over a range of applications, having different cost functions. A recognizer
with likelihood output can additionally be employed for a wide range of prior
distributions for the to-be-recognized classes.
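As a rough illustration of the point above, the following Python sketch combines a single likelihood output with two different priors and makes a minimum-expected-cost Bayes decision under each. The likelihood values, priors, cost matrix and function names are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import math

def posterior_from_log_likelihoods(log_likelihoods, priors):
    """Combine per-class log-likelihoods with a prior via Bayes' rule."""
    # Work in the log domain and subtract the maximum for numerical stability.
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    m = max(log_joint)
    unnorm = [math.exp(v - m) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bayes_decision(posterior, cost):
    """Return the index of the decision with minimum expected cost.

    cost[d][c] is the cost of making decision d when the true class is c.
    """
    expected = [sum(c * p for c, p in zip(row, posterior)) for row in cost]
    return min(range(len(expected)), key=expected.__getitem__)

# Hypothetical three-class likelihood output from a recognizer.
log_likelihoods = [math.log(0.2), math.log(0.5), math.log(0.3)]
# Assumed zero-one cost: every wrong decision costs 1, a correct one costs 0.
zero_one_cost = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

flat_prior = [1 / 3, 1 / 3, 1 / 3]
skewed_prior = [0.9, 0.05, 0.05]

decision_flat = bayes_decision(
    posterior_from_log_likelihoods(log_likelihoods, flat_prior), zero_one_cost)
decision_skewed = bayes_decision(
    posterior_from_log_likelihoods(log_likelihoods, skewed_prior), zero_one_cost)
```

In this sketch the same likelihoods give a different decision once the prior changes, which is why a likelihood output can serve several applications without retraining the recognizer.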
We use automatic speaker recognition and automatic spoken language
recognition as prototypes of this type of pattern recognizer. The traditional
evaluation methods in these fields, as represented by the series of NIST Speaker
and Language Recognition Evaluations, evaluate hard decisions made by the
recognizers. This makes these recognizers cost-and-prior-dependent. The proposed
methodology generalizes that of the NIST evaluations, allowing for the
evaluation of recognizers which are intended to be usefully applied over a wide
range of applications, having variable priors and costs.
The proposal includes a family of evaluation criteria, where each member
of the family is formed by a proper scoring rule. We emphasize two members
of this family: (i) A non-strict scoring rule, directly representing error-rate
at a given prior. (ii) The strict logarithmic scoring rule which represents
information content, or which equivalently represents summarized error-rate,
or expected cost, over a wide range of applications.
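The difference between the two scoring rules can be sketched in Python as follows. This is a minimal illustration under assumptions: posteriors are already computed at a fixed prior, the trials are equally weighted, and the function names and data are hypothetical. The abstract does not give the exact normalization used in the thesis.

```python
import math

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)

def error_rate(posteriors, labels):
    """Non-strict rule: fraction of trials whose most probable class is wrong."""
    errors = sum(1 for p, t in zip(posteriors, labels) if argmax(p) != t)
    return errors / len(labels)

def log_score(posteriors, labels):
    """Strict logarithmic rule: mean of -log2 P(true class), in bits."""
    return sum(-math.log2(p[t]) for p, t in zip(posteriors, labels)) / len(labels)

# Two hypothetical recognizers scored on the same two trials.
labels = [0, 1]
confident = [[0.9, 0.1], [0.2, 0.8]]
hedged = [[0.6, 0.4], [0.45, 0.55]]

error_confident = error_rate(confident, labels)
error_hedged = error_rate(hedged, labels)
bits_confident = log_score(confident, labels)
bits_hedged = log_score(hedged, labels)
```

Both recognizers make the same hard decisions, so the error rate cannot tell them apart, while the logarithmic score is lower (better) for the recognizer that assigns more probability to the true classes.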
We further show how to form a family of secondary evaluation criteria,
which, by contrast with the primary criteria, form an analysis of the goodness
of calibration of the recognizers' likelihoods.
Finally, we show how to use the logarithmic scoring rule as an objective
function for the discriminative training of fusion and calibration of speaker
and language recognizers. / AFRIKAANS SUMMARY: We show how to represent, measure, calibrate
and optimize the uncertainty in the output of automatic speaker recognition
and language recognition systems. This makes the existing technology more accurate,
more effective and more generally applicable.
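The use of the logarithmic scoring rule as a training objective for calibration, mentioned at the end of the abstract, can be sketched for the two-class case as follows. This is only a minimal sketch under assumptions: an affine score transform, plain gradient descent, balanced training data and hypothetical scores. The abstract does not state which model or optimizer the thesis uses.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def log_loss(scores, labels, a, b):
    """Mean logarithmic cost (in nats) of the posteriors sigmoid(a*s + b)."""
    total = 0.0
    for s, y in zip(scores, labels):
        p = min(max(sigmoid(a * s + b), 1e-12), 1 - 1e-12)
        total -= math.log(p) if y == 1 else math.log(1 - p)
    return total / len(scores)

def train_calibration(scores, labels, steps=2000, lr=0.1):
    """Fit the affine map s -> a*s + b by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of the loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical uncalibrated scores: label 1 = target trial, 0 = non-target trial.
scores = [2.0, 3.5, 1.0, 4.0, 0.5, -1.0, 0.0, 1.5, -2.0, 2.5]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

loss_before = log_loss(scores, labels, 1.0, 0.0)
a, b = train_calibration(scores, labels)
loss_after = log_loss(scores, labels, a, b)
```

Because the hypothetical training set is balanced, the transformed score a*s + b can be read as a log-likelihood ratio in this sketch; with unbalanced data the log prior odds of the training set would need to be subtracted.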

http://hdl.handle.net/10019.1/5139

Automatic speaker recognition

Automatic spoken language recognition

Proper scoring rule

Calibration

Dissertations -- Electronic engineering

Theses -- Electronic engineering

Automatic speech recognition

Speech processing systems

Identifier	oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/5139
Date	12 1900
Creators	Brummer, Niko
Contributors	Du Preez, J. A., University of Stellenbosch. Faculty of Engineering. Dept. of Electrical and Electronic Engineering.
Publisher	Stellenbosch : University of Stellenbosch
Source Sets	South African National ETD Portal
Detected Language	English
Type	Thesis
Format	160 p. : ill.
Rights	University of Stellenbosch
