Language Classification Using Neural Networks

In this project, a model was created that takes an audio sequence as input and classifies the spoken language as either English or French. The project focused on experimenting with different ways of processing audio files and on designing a neural network that maximizes performance on the language classification task. The purpose was to investigate the highest reachable accuracy and to determine what sample length would be appropriate for use in voice control applications. The signal processing part dealt mainly with how enveloping, Mel frequency cepstral coefficients (MFCC) and Mel frequency spectral coefficients (MFSC) could be used to improve the accuracy of the model. The neural network design focused on how the width and depth of the network and the use of dropout could be used to increase performance. The experiments resulted in a model with a maximum accuracy of 92.30% that could outperform humans for samples of approximately 1.2 seconds or shorter. A suitable sample length for use in other applications was concluded to be in the interval of 0.7 to 1.5 seconds.

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385537

Other Engineering and Technologies not elsewhere specified

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-385537
Date	January 2019
Creators	Lindgren, Andreas, Lind, Gustav
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	TVE-F ; 19016
