1 |
Analyzing Networks with Hypergraphs: Detection, Classification, and Prediction
Alkulaib, Lulwah Ahmad KH M. 02 April 2024
Recent advances in large graph-based models have shown great performance in a variety of tasks, including node classification, link prediction, and influence modeling. However, these graph-based models struggle to capture high-order relations and interactions among entities effectively, leading them to underperform in many real-world scenarios.
This thesis focuses on analyzing networks using hypergraphs for detection, classification, and prediction methods in social media-related problems. In particular, we study four specific applications with four proposed novel methods: detecting topic-specific influential users and tweets via hypergraphs; detecting spatiotemporal, topic-specific, influential users and tweets using hypergraphs; augmenting data in hypergraphs to mitigate class imbalance issues; and introducing a novel hypergraph convolutional network model designed for the multiclass classification of mental health advice in Arabic tweets.
For the first method, existing solutions for influential user detection do not take topics into account, which can lead to incorrect results and inadequate performance on that task.
The proposed contributions of our work include:
1) Developing a hypergraph framework that detects influential users and tweets.
2) Proposing an effective topic modeling method for short texts.
3) Performing extensive experiments to demonstrate the efficacy of our proposed framework.
For the second method, we extend the first method by incorporating spatiotemporal information into our solution. Existing influencer detection methods do not consider spatiotemporal influencers in social media, although influence can be greatly affected by geolocation and time.
The contributions of our work for this task include:
1) Proposing a hypergraph framework that spatiotemporally detects influential users and tweets.
2) Developing an effective topic modeling method for short texts that geographically provides the topic distribution.
3) Designing a spatiotemporal topic-specific influencer user ranking algorithm.
4) Performing extensive experiments to demonstrate the efficacy of our proposed framework.
For the third method, we address the challenge of bot detection on the social media platform X, where there is an inherent imbalance between genuine users and bots, a key factor leading to biased classifiers. Our approach leverages the rich structure of hypergraphs to represent X users and their interactions, providing a novel foundation for effective bot detection. The contributions of our work include:
1) Introducing a hypergraph representation of the X platform, where user accounts are nodes and their interactions form hyperedges, capturing the intricate relationships between users.
2) Developing HyperSMOTE to generate synthetic bot accounts within the hypergraph, ensuring a balanced training dataset while preserving the hypergraph's structure and semantics.
3) Designing a hypergraph neural network specifically for bot detection, utilizing node and hyperedge information for accurate classification.
4) Conducting comprehensive experiments to validate the effectiveness of our methods, particularly in scenarios with pronounced class imbalances.
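As a rough illustration of the representation described above, the sketch below stores a hypergraph as a mapping from hyperedges to sets of user nodes and creates one synthetic minority-class node by SMOTE-style interpolation between two bot feature vectors. It is not the thesis's HyperSMOTE algorithm: the account names, the two features, and the rule that the synthetic node joins every hyperedge shared by both parents are assumptions made only for this example.

```python
import random

# Hypothetical toy data: each hyperedge groups all accounts in one interaction
# (for example, everyone who replied in the same thread).
hyperedges = {
    "thread_1": {"alice", "bot_a", "bot_b"},
    "thread_2": {"bob", "bot_a", "bot_b", "carol"},
    "hashtag_x": {"alice", "bob", "bot_b"},
}

# Hypothetical per-account features: [posts per day, follower/following ratio].
features = {
    "alice": [3.0, 1.2],
    "bob": [5.0, 0.9],
    "carol": [1.0, 2.0],
    "bot_a": [80.0, 0.1],
    "bot_b": [120.0, 0.3],
}


def node_degrees(edges):
    """Count how many hyperedges each node belongs to."""
    degrees = {}
    for members in edges.values():
        for node in members:
            degrees[node] = degrees.get(node, 0) + 1
    return degrees


def synthesize_node(a, b, new_name, edges, feats, rng):
    """SMOTE-style step: interpolate features between two minority nodes.

    Assumption for this sketch: the synthetic node joins every hyperedge
    that contains both parents, so it stays attached to the hypergraph.
    """
    lam = rng.random()  # interpolation weight in [0, 1)
    feats[new_name] = [x + lam * (y - x) for x, y in zip(feats[a], feats[b])]
    for members in edges.values():
        if a in members and b in members:
            members.add(new_name)
    return feats[new_name]


rng = random.Random(0)
new_features = synthesize_node("bot_a", "bot_b", "bot_syn", hyperedges, features, rng)
degrees = node_degrees(hyperedges)
```

Unlike an ordinary edge, a single hyperedge here can hold any number of accounts, which is what lets one interaction among several users be recorded as one relation instead of many pairwise links.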
For the fourth method, we introduce a Hypergraph Convolutional Network model for classifying mental health advice in Arabic tweets. Our model distinguishes between valid and misleading advice, leveraging high-order word relations in short texts through hypergraph structures. Our extensive experiments demonstrate its effectiveness over existing methods. The key contributions of our work include:
1) Developing a hypergraph-based model for short text multiclass classification, capturing complex word relationships through hypergraph convolution.
2) Defining four types of hyperedges to encapsulate local and global contexts and semantic similarities in our dataset.
3) Conducting comprehensive experiments in which the proposed model outperforms several baseline models in classifying Arabic tweets, demonstrating its superiority.
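For readers unfamiliar with hypergraph convolution, one widely used formulation from the hypergraph neural network literature is shown below. The abstract does not state which variant the thesis adopts, so this should be read as background and not as the author's exact layer. The symbols are assumptions of this illustration: $H$ is the node-by-hyperedge incidence matrix, $W$ a diagonal matrix of hyperedge weights, $D_v$ and $D_e$ the diagonal node-degree and hyperedge-degree matrices, $X^{(l)}$ the node features at layer $l$, $\Theta^{(l)}$ a learnable weight matrix, and $\sigma$ a nonlinearity.

```latex
X^{(l+1)} = \sigma\!\left( D_v^{-1/2} \, H \, W \, D_e^{-1} \, H^{\top} \, D_v^{-1/2} \, X^{(l)} \, \Theta^{(l)} \right)
```

Read right to left, the layer transforms node features, gathers them into each hyperedge through $H^{\top}$, and scatters the hyperedge summaries back to the member nodes through $H$, so words that share a hyperedge exchange information in one step.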
For the fifth method, we extend our previous Hypergraph Convolutional Network (HCN) model and tailor it to sarcasm detection across multiple low-resource languages. Our model interprets the subtle and context-dependent nature of sarcasm in short texts by exploiting hypergraph structures to capture complex, high-order relationships among words. Through the construction of three hyperedge types, our model captures the intricate semantic and sentiment differences that characterize sarcastic expressions. The key contributions of our research are as follows:
1) Adapting a hypergraph-based model to sarcasm detection in short texts from five low-resource languages, allowing the model to capture semantic relationships and contextual cues through hypergraph convolution.
2) Introducing a comprehensive framework for constructing hyperedges, incorporating short text, semantic similarity, and sentiment discrepancy hyperedges, which together enrich the model's ability to understand and detect sarcasm across diverse linguistic contexts.
3) Conducting extensive evaluations, which show that the proposed hypergraph model significantly outperforms a range of established baseline methods in multilingual sarcasm detection, establishing new benchmarks for accuracy and generalizability in detecting sarcasm within low-resource languages.
Doctor of Philosophy
In the digital era, social media platforms are not just tools for communication but vast networks where billions of messages, opinions, and pieces of advice are exchanged every day. Navigating this massive data to identify influential content, detect misleading information, or understand subtle expressions like sarcasm presents a significant challenge. Traditional methods often struggle to grasp the complex relationships and nuances embedded within the data. This dissertation introduces innovative approaches using hypergraphs, a type of network representation that captures complex interactions more effectively than traditional network models.
The research presented explores six distinct applications of hypergraphs in social media analysis, each addressing a unique challenge:
1) The identification of influential users and content specific to certain topics, extending beyond general influence to understand context-driven impact.
2) The incorporation of time and location to detect influential content, recognizing that relevance can significantly vary by these factors.
3) Addressing the issue of imbalanced data in bot detection, where genuine user interactions are overwhelmed by automated accounts, through novel data augmentation techniques.
4) Classifying mental health advice in Arabic tweets to differentiate between valid and misleading information is crucial, given the subject's sensitivity.
5) Detecting sarcasm in low-resource languages is particularly challenging due to its subtle and context-dependent nature.
6) Predicting metro passenger ridership at each metro station is challenging due to the constantly evolving nature of the network and passengers going in and out of stations.
This work contributes to the field by demonstrating the capability of hypergraphs to provide more fine-grained and context-aware analyses of social media content. Through extensive experimentation, it showcases the effectiveness of these methods in improving detection, classification, and prediction tasks. The findings not only advance our technical understanding and capabilities in social media analysis but also have practical implications for enhancing the reliability and usefulness of information disseminated on these platforms.
2 |
Next Generation of Product Search and Discovery
Zeng, Kaiman 12 November 2015
Online shopping has become an important part of people’s daily lives with the rapid development of e-commerce. In some domains, such as books, electronics, and CDs/DVDs, online shopping has surpassed or even replaced traditional shopping. Compared with traditional retailing, e-commerce is information intensive. One of the key factors for success in e-business is making it easy for consumers to discover products. Conventionally, a product search engine based on keyword search or a category browser is provided to help users find the product information they need. The general goal of a product search system is to enable users to quickly locate information of interest and to minimize their effort in search and navigation. In this process, human factors play a significant role. Finding product information can be a tricky task and may require intelligent use of search engines and non-trivial navigation of multilayer categories. Searching for useful product information can be frustrating for many users, especially inexperienced ones.
This dissertation focuses on developing a new visual product search system that effectively extracts the properties of unstructured products and presents items likely to attract users, so that they can quickly locate the ones they are most likely to be interested in. We designed and developed a feature extraction algorithm that retains product color and local pattern features; experimental evaluation on the benchmark dataset demonstrated that it is robust against common geometric and photometric visual distortions. In addition, instead of ignoring product text information, we investigated and developed a ranking model learned via a unified probabilistic hypergraph that is capable of capturing correlations between product visual content and textual content. Moreover, we proposed and designed a fuzzy hierarchical co-clustering algorithm for collaborative filtering product recommendation. With this method, users can be automatically grouped into different interest communities based on their behaviors, and a customized recommendation can then be made according to these implicitly detected relations. In summary, the developed search system performs much better in visual unstructured product search than state-of-the-art approaches. With the comprehensive ranking scheme and the collaborative filtering recommendation module, the user’s overhead in locating valuable information is reduced, and the experience of seeking useful product information is improved.