Reddit (www.reddit.com) is a social news platform for sharing and exchanging information. The amount of data, in terms of both observations and dimensions, is enormous because a large number of users express knowledge about all aspects of their lives by publishing comments. While it is easy for a human to understand Reddit comments on an individual basis, it is a tremendous challenge to extract insights from them by computer. In this thesis, we seek an algorithm-driven approach to analyze both the unique Reddit data structure and the relations among comment authors based on their similar features. We explore the various types of communication between two people with common characteristics and build a communication model that characterizes the potential relationship between two users via their messages. We then seek a dimensionality reduction methodology that can merge users with similar behavior into the same groups. Along the way, we develop a computer program to collect data, define attributes based on the communication model, and apply a rule-based group merging algorithm. We then evaluate the results to show the effectiveness of this methodology. Our results show reasonable success in producing user groups that have recognizable group characteristics and share similar intentions.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-56500 |
Date | January 2016 |
Creators | Sun, Xuebo, Wang, Yudan |
Publisher | Linnéuniversitetet, Institutionen för datavetenskap (DV) |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |