Social media has become an integral part of the Internet. Users across the world share content such as images, text, and videos. The huge amount of data being generated makes it a challenge for social media platforms to group content for further use, such as recommending a video. Grouping videos by similarity, in particular, requires extracting features. This thesis investigates potential approaches to extracting features that can help determine the similarity between videos. Features of the given videos are extracted using object detection and action recognition. A bag-of-features representation is used to build a vocabulary of all the features and to transform the data into a form useful for clustering videos. A probabilistic model-based clustering method, the multinomial mixture model, is used to determine the underlying clusters within the data by maximizing the expected log-likelihood and estimating the parameters of the data as well as the cluster probabilities. The clusters are analyzed to understand the genre of each based on its dominant actions and objects. The Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) are used to determine the optimal number of clusters within the given videos. The AIC and BIC scores reached their minimum at 32 clusters, which was chosen as the optimal number of clusters. The data is labeled with the genres, and logistic regression is performed to check the cluster performance on test data, achieving 96% accuracy.
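The clustering and model-selection steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names (fit_multinomial_mixture, information_criteria), the toy count matrix, and the candidate cluster counts are all assumptions made for illustration. It fits a multinomial mixture to bag-of-features counts with expectation-maximization and compares cluster counts by AIC and BIC. The log-likelihood omits the multinomial coefficient, which is constant across cluster counts and so does not affect the comparison.

```python
import numpy as np

def fit_multinomial_mixture(X, k, n_iter=200, seed=0, eps=1e-10):
    """EM for a mixture of multinomials on a count matrix X (items x vocabulary)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                    # cluster probabilities
    theta = rng.dirichlet(np.ones(d), size=k)   # per-cluster feature probabilities

    def log_joint(pi, theta):
        # log pi_j + sum_w x_iw * log theta_jw, shape (n, k)
        return np.log(pi + eps) + X @ np.log(theta + eps).T

    for _ in range(n_iter):
        # E-step: responsibilities, normalized in log space for stability
        lj = log_joint(pi, theta)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1, keepdims=True))
        # M-step: re-estimate cluster probabilities and feature probabilities
        pi = resp.mean(axis=0)
        theta = resp.T @ X + eps
        theta /= theta.sum(axis=1, keepdims=True)

    # Final responsibilities and log-likelihood under the fitted parameters
    lj = log_joint(pi, theta)
    log_norm = np.logaddexp.reduce(lj, axis=1, keepdims=True)
    resp = np.exp(lj - log_norm)
    return pi, theta, resp, float(log_norm.sum())

def information_criteria(log_lik, n, k, d):
    """AIC and BIC with (k - 1) + k * (d - 1) free parameters."""
    p = (k - 1) + k * (d - 1)
    return 2 * p - 2 * log_lik, p * np.log(n) - 2 * log_lik

# Toy data (hypothetical, not the thesis data): two groups of 30 "videos",
# each a vector of counts over a 6-word vocabulary of detected features.
rng = np.random.default_rng(1)
p1 = np.array([0.4, 0.4, 0.1, 0.05, 0.03, 0.02])
p2 = p1[::-1]
X = np.vstack([rng.multinomial(30, p1, size=30), rng.multinomial(30, p2, size=30)])

n, d = X.shape
scores = {}
for k in (1, 2, 3):
    pi, theta, resp, log_lik = fit_multinomial_mixture(X, k)
    scores[k] = information_criteria(log_lik, n, k, d)

# Candidate cluster count with the lowest BIC
best_k = min(scores, key=lambda k: scores[k][1])
```

In this sketch the cluster count is chosen by the lowest BIC alone. The abstract reports that AIC and BIC both reached their minimum at the same value, so either criterion would give the same choice in that case.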
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-176942 |
Date | January 2021 |
Creators | Vellala, Abhinay |
Publisher | Linköpings universitet, Statistik och maskininlärning |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |