Global ETD Search

Googles röstgränssnitts lämplighet för användning i en röstbaserad medicinteknisk tjänst / The Suitability of Google Speech API for Use in a Voice-Based Medical Device Service

In this project, the Google Cloud Speech API was evaluated for use in a program that identifies a person by their voice. The project was carried out together with the company Call Knut, whose goal is to design an AI-based service that places phone calls to elderly people. Because the service is aimed at the elderly, Call Knut wants a program that can identify them by voice. A program was built with the Google Speech API to transcribe and distinguish two voices in a single audio file. Audio recordings were then collected from people across a wide age range and combined. The combined audio files were analyzed to determine whether the Google Speech API is suitable for this purpose. In 29.2 % of the combined audio files, the API managed both to distinguish the two voices and to transcribe what was said; it failed on the remaining 70.8 % of the inputs. Our conclusion is that the Google Speech API is not suitable for developing Call Knut's planned service, in which distinguishing between voices must work with high precision. Further development work is recommended to focus on testing other programs or speech APIs.

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-316593

Google's voice interface (Google Cloud Speech API)

Other Medical Engineering (Annan medicinteknik)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-316593
Date	January 2022
Creators	Eivinsson, Tova; Saleh, Mariam
Publisher	KTH, Medicinteknik och hälsosystem
Source Sets	DiVA Archive at Uppsala University
Language	Swedish
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	TRITA-CBH-GRU ; 2022:158