Perceptuell Tonhöjd : en undersökning om skillnaden mellan hur människor uppfattar sångröst och frekvensnivån som räknas ut av datoralgoritmer / Perceptual pitch : a thesis regarding the difference between how people perceive the pitch of vocals and the results from pitch estimation algorithms

There are several computer programs, such as ScoreCloud and Magic Stave, that use different algorithms to create sheet music from a singing voice. However, approximating pitch from a sound file containing vocals is not trivial. The purpose of this thesis was to examine the differences between the pitch of vocals as perceived by humans and the results from pitch estimation algorithms. To investigate this, a program was created for performing a listening test. Using the program, the subjects listened to an extract of the vocals from the song “Tom’s Diner” by Suzanne Vega. At the same time, they adjusted sliders that changed the frequency of synthetic tones; their task was to make the pitch of these synthetic tones match the pitch of the vocals. The test was performed by twenty people who had been singing or playing an instrument without fixed pitch steps for a few years. The same sound file was then analyzed with algorithms based on subharmonic summation and autocorrelation, and their results were compared with the results of the listening tests. Indications were found that people with a musical background, when they listen to singing, do not listen to every note individually but instead to how the notes sound together. In contrast, algorithms that are commonly used today look only at individual tones. The results could be useful as a guide for the development and evaluation of programs of this kind, so that they take into account how people perceive a singing voice rather than simply using the results of an algorithm.
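As a rough illustration of the autocorrelation family of pitch estimation algorithms that the abstract refers to, the following Python sketch estimates the fundamental frequency of a single frame and expresses the deviation from a reference frequency in cents. It is not the implementation used in the thesis: the function names `estimate_pitch_autocorr` and `cents`, the search range, the sample rate, and the synthetic two-harmonic test tone are all assumptions made for the example, and a real vocal recording would need framing, voicing detection, and finer lag interpolation.

```python
import numpy as np


def estimate_pitch_autocorr(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Hypothetical helper: picks the lag with the largest autocorrelation
    value inside the lag range that corresponds to [fmin, fmax].
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()  # remove DC offset before correlating
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(acf) - 1)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


def cents(f_a, f_b):
    """Signed interval from f_b to f_a in cents (100 cents = 1 semitone)."""
    return 1200.0 * np.log2(f_a / f_b)


# Assumed test signal: a 220 Hz tone with one overtone, not real vocals.
sample_rate = 8000
t = np.arange(800) / sample_rate
tone = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)

estimate = estimate_pitch_autocorr(tone, sample_rate)
deviation = cents(estimate, 220.0)
print(f"estimated {estimate:.1f} Hz, {deviation:+.1f} cents from 220 Hz")
```

Because the lag is an integer number of samples, the estimate is quantized, so a small deviation from the true frequency is expected even for a clean tone. The same `cents` helper could be used to compare a frequency set by a listener with an algorithm's estimate for the same note.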

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-146100

Media and Communication Technology

Medieteknik

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-146100
Date	January 2014
Creators	Bognandi, Marilia; Wallén, Fredrik
Publisher	KTH, Skolan för datavetenskap och kommunikation (CSC)
Source Sets	DiVA Archive at Uppsala University
Language	Swedish
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
