SubRosa: Determining Movie Similarities based on Subtitles

Movie recommendation systems have become an important technology for streaming websites, media shopping platforms and movie databases. Hybrid methods that combine collaborative and content-based filtering on the basis of user ratings and user-generated content have mostly proven effective. However, these methods can lead to popularity-biased results in which movies with little user-generated data are underrepresented. In this paper we discuss the possibility of generating movie recommendations that are based not on user-generated data or metadata, but solely on the content of the movies themselves, confining ourselves to movie dialog. We extract low-level features from movie subtitles using methods from Information Retrieval, Natural Language Processing and Stylometry, and examine a possible correlation between the similarity of these features and overall movie similarity. In addition, we present a novel web application called SubRosa (http://ch01.informatik.uni-leipzig.de:5001/), which can be used to interactively compare the results of different feature combinations.
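
To make the idea of comparing movies by subtitle-derived features concrete, the following is a minimal sketch of one common Information Retrieval approach: TF-IDF weighting with cosine similarity. It is an illustration only and is not taken from the paper; the abstract does not say which features or similarity measures SubRosa uses. The function names, the smoothed IDF formula, and the toy "subtitle" strings are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Map each document name to a sparse TF-IDF vector (a dict).

    Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1, chosen for
    illustration; it is not a formula stated in the source.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    vectors = {}
    for name, term_counts in counts.items():
        total = sum(term_counts.values())
        vectors[name] = {
            term: (freq / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, freq in term_counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy "subtitles", invented for this sketch (not real data).
subtitles = {
    "space_a": "the ship left orbit and the crew checked the engine",
    "space_b": "the crew repaired the engine before the ship reached orbit",
    "romance": "she said she loved him and he kissed her in the rain",
}
vectors = tfidf_vectors(subtitles)
sim_space = cosine(vectors["space_a"], vectors["space_b"])
sim_mixed = cosine(vectors["space_a"], vectors["romance"])
print(sim_space > sim_mixed)
```

In this toy setup the two space-themed texts share more weighted vocabulary than the space and romance texts, so their similarity score is higher. Real subtitle files would additionally need timestamps and markup stripped before tokenization.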

info:eu-repo/classification/ddc/006

ddc:006

info:eu-repo/classification/ddc/770

ddc:770

Identifier	oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:92312
Date	26 June 2024
Creators	Luhmann, Jan; Burghardt, Manuel; Tiepmar, Jochen
Publisher	Gesellschaft für Informatik e.V.
Source Sets	Hochschulschriftenserver (HSSS) der SLUB Dresden
Language	English
Detected Language	English
Type	info:eu-repo/semantics/publishedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text
Rights	info:eu-repo/semantics/openAccess
Relation	https://doi.org/10.18420/inf2020_119
