Global ETD Search

Automatic Speech Recognition System for Somali in the interest of reducing Maternal Morbidity and Mortality.

Developing an Automatic Speech Recognition (ASR) system for the Somali language is not a novel idea, but it has not been actively explored; as a result, no model has yet succeeded on conversational speech, and related work is not accessible as open source. The lack of digital data is what makes Somali a low-resource language, and it poses the greatest impediment to developing an ASR system for Somali. The incentive to develop such a system is to contribute to reducing the Maternal Mortality Rate (MMR) in Somalia. Researchers collect audio recordings of interviews about maternal health and behaviour in the Somali language; to engage the relevant stakeholders and bring about the needed change, these recordings must be transcribed into text, which is an important step towards translation into any other language. This work investigates the available ASR systems for Somali and attempts to develop a prototype ASR system that converts Somali audio into Somali text. To this end, we first identified the available open-source speech recognition systems and selected the DeepSpeech engine to implement the prototype. With three hours of audio data, the transcription accuracy falls short of what is required, and the prototype cannot be deployed for use. We attribute this to insufficient training data and estimate that acquiring about 1200 hours of audio to train the DeepSpeech engine would make the effort towards an ASR system for Somali considerably more effective.
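The keywords of this record list Word Error Rate (WER) and Character Error Rate (CER), the standard measures of transcription accuracy. The record does not say how the thesis computed them, so the following is only a minimal sketch of the textbook definitions: the edit distance (substitutions, deletions, insertions) between a reference transcript and the system output, divided by the reference length. The function names and the sample strings are hypothetical and are not taken from the thesis.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis is i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Placeholder strings for illustration only (not data from the thesis).
    reference = "a b c d"
    hypothesis = "a x c"
    print("WER:", wer(reference, hypothesis))
    print("CER:", cer("abcd", "abxd"))
```

Both rates can exceed 1.0 when the system output contains many insertions, which is why they are reported as error rates rather than as accuracy percentages.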

http://urn.kb.se/resolve?urn=urn:nbn:se:du-34436

Automatic Speech Recognition (ASR)

DeepSpeech

Natural Language Processing (NLP)

Word Error Rate (WER)

Character Error Rate (CER)

Social Sciences

Samhällsvetenskap

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:du-34436
Date	January 2020
Creators	Laryea, Joycelyn; Jayasundara, Nipunika
Publisher	Högskolan Dalarna, Mikrodataanalys
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
