Movie recommendation systems have become an important technology for streaming websites, media shopping platforms and movie databases. It is mostly hybrid methods of collaborative
and content-based filtering, built on user ratings and user-generated content, that have proven
to be effective. However, these methods can lead to popularity-biased results in which movies with little user-generated data are underrepresented. In this paper we
discuss the possibility of generating movie recommendations that are not based on user-generated data
or metadata, but solely on the content of the movies themselves, confining ourselves to movie dialog.
We extract low-level features from movie subtitles by using methods from Information Retrieval,
Natural Language Processing and Stylometry, and examine a possible correlation of these features’
similarity with the overall movie similarity. In addition, we present a novel web application called
SubRosa (http://ch01.informatik.uni-leipzig.de:5001/), which can be used to interactively
compare the results of different feature combinations.
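
As a rough illustration of the general idea of comparing movies by subtitle-derived features, the following minimal sketch computes TF-IDF vectors from SRT-style subtitle text and compares them with cosine similarity. It is not the feature set or pipeline used in the paper: the choice of TF-IDF, the function names (`dialog_tokens`, `tfidf_vectors`, `cosine`), and the three toy subtitle snippets are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def dialog_tokens(srt_text):
    """Return lowercase word tokens from SRT-style subtitle text."""
    tokens = []
    for line in srt_text.splitlines():
        line = line.strip()
        # Skip blank lines, cue numbers, and timestamp lines.
        if not line or line.isdigit() or "-->" in line:
            continue
        tokens.extend(re.findall(r"[a-z']+", line.lower()))
    return tokens


def tfidf_vectors(docs):
    """Map each title to a sparse TF-IDF vector (dict: term -> weight)."""
    counts = {title: Counter(dialog_tokens(text)) for title, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many movies does each term occur?
    doc_freq = Counter(term for c in counts.values() for term in c)
    vectors = {}
    for title, c in counts.items():
        total = sum(c.values())
        vectors[title] = {
            # Smoothed IDF, so terms found in every document keep a small weight.
            term: (freq / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, freq in c.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy subtitles; real input would be complete subtitle files.
docs = {
    "space_a": (
        "1\n00:00:01,000 --> 00:00:03,000\nThe ship is leaving orbit.\n\n"
        "2\n00:00:04,000 --> 00:00:06,000\nSet course for the station.\n"
    ),
    "space_b": "1\n00:00:02,000 --> 00:00:04,000\nBring the ship back to the station.\n",
    "romance": "1\n00:00:01,000 --> 00:00:02,500\nI love you, and I always will.\n",
}

vectors = tfidf_vectors(docs)
sim_space = cosine(vectors["space_a"], vectors["space_b"])
sim_mixed = cosine(vectors["space_a"], vectors["romance"])
```

With these toy inputs, the two space-themed snippets share vocabulary and therefore score higher than the space/romance pair, which shares none. Stylometric or other NLP features could be added as further vector dimensions in the same way, although which features work well is exactly the question the paper examines.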
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:92312 |
Date | 26 June 2024 |
Creators | Luhmann, Jan; Burghardt, Manuel; Tiepmar, Jochen |
Publisher | Gesellschaft für Informatik e.V. |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | https://doi.org/10.18420/inf2020_119 |