
When words are not enough: An evaluation of character n-grams and function words in author identification of musical artists

When we write texts, we unconsciously leave prints behind: the words we use, punctuation, special characters, and more. There are several different approaches to author identification that utilise these features. These methods have been applied to a variety of texts, everything from papers to poems, e-mails and forum posts. This study uses song lyrics, with the artists as the authors, and compares the performance of two common features on them. The two features evaluated are character n-grams and function words. These are some of the most prominent features within author identification, and both have a track record of good performance. Despite high hopes, the results showed that neither feature could reach the expected results. They were expected to achieve 70% and 65% accuracy respectively; however, the achieved average accuracy was only 40% and 35%. Even with the poor results, some interesting findings were made. Some artists have multiple band members writing the songs, which raised the concern that this would affect the performance. Interestingly, the results showed that multiple authors did not harm performance; in some cases, artists with multiple authors performed better than those with a single author.

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-156137
Date January 2018
Creators Nyström, Alexander
Publisher Umeå universitet, Institutionen för datavetenskap
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
Relation UMNAD ; 1165
