Global ETD Search

Development of a text-independent automatic speaker recognition system

Thesis (M. Sc. (Computer Science)) -- University of Limpopo, 2021

The task of automatic speaker recognition, wherein a system verifies or identifies
speakers from a recording of their voices, has been researched for several decades.
However, research in this area has been carried out largely on freely accessible
speaker datasets built on languages that are well-resourced like English. This study
undertakes automatic speaker recognition research focused on a low-resourced
language, Sepedi. As one of the 11 official languages in South Africa, Sepedi is
spoken by at least 2.8 million people. Pre-recorded voices were acquired from a
national speech and language repository, the National Centre for Human
Language Technology (NCHLT), from which we selected the Sepedi NCHLT Speech
Corpus. The open-source pyAudioAnalysis Python library was used to extract three
types of acoustic speech features, namely time-domain, frequency-domain and
cepstral-domain features, from the acquired speech data. The effects and compatibility
of these acoustic features were investigated. It was observed that combining the three
feature types had a greater effect on speaker recognition accuracy than using any
individual feature type. The study also investigated the
performance of machine learning algorithms on low-resourced languages such as
Sepedi. Five machine learning (ML) algorithms implemented in scikit-learn, namely
K-nearest neighbours (KNN), support vector machines (SVM), random forest (RF),
logistic regression (LR), and multi-layer perceptrons (MLP), were used to train different
classifier models. The GridSearchCV procedure, also implemented in scikit-learn, was
used to select suitable hyper-parameters for each of the five ML algorithms. The
classifier models were evaluated on recognition accuracy and the results show that
the MLP classifier, with a recognition accuracy of 98%, outperforms KNN, RF, LR and
SVM classifiers. A graphical user interface (GUI) was developed, and the best-performing
classifier model, MLP, was deployed on it for real-time speaker identification and
verification tasks. Participants were recruited to evaluate the
GUI's performance, and acceptable results were obtained.
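
The classifier comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is installed. The data here are synthetic vectors from `make_classification` standing in for per-recording acoustic features, and the hyper-parameter grids, the scaling step, and the train/test split are hypothetical choices made for illustration; the abstract does not state the grids or the evaluation protocol used in the thesis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for acoustic feature vectors (assumption, not thesis data).
# Each class label plays the role of one speaker identity.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The five algorithm families named in the abstract, each paired with a small,
# hypothetical hyper-parameter grid (the thesis grids are not given).
candidates = {
    "KNN": (KNeighborsClassifier(), {"clf__n_neighbors": [3, 5, 7]}),
    "SVM": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "RF": (RandomForestClassifier(random_state=0), {"clf__n_estimators": [50, 100]}),
    "LR": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1, 10]}),
    "MLP": (
        MLPClassifier(max_iter=1000, random_state=0),
        {"clf__hidden_layer_sizes": [(32,), (64,)]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Feature scaling is an added assumption; it is fitted inside each CV fold.
    pipeline = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipeline, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        # Accuracy of the refitted best model on the held-out split.
        "test_accuracy": search.score(X_test, y_test),
    }

for name, info in results.items():
    print(f"{name}: accuracy={info['test_accuracy']:.3f} params={info['best_params']}")
```

Accuracies obtained on this synthetic data say nothing about the results reported in the thesis; the sketch only shows how a grid search over several classifier families can be organised.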

http://hdl.handle.net/10386/3829

Automatic speaker recognition

Recording of voices

Graphical user interface

Automatic speech recognition

Speech processing systems

Icons (Computer graphics)

Identifier	oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:ul/oai:ulspace.ul.ac.za:10386/3829
Date	January 2021
Creators	Mokgonyane, Tumisho Billson
Contributors	Manamela, M. J. D., Modipa, T. I.
Source Sets	South African National ETD Portal
Language	English
Detected Language	English
Type	Thesis
Format	xiii, 60 leaves
Relation	PDF
