
Learning with social media. / 基於社會化媒體的學習 / CUHK electronic theses & dissertations collection / Ji yu she hui hua mei ti de xue xi

With the rapid growth of Web 2.0 over the past decade, social media systems such as rating systems, social tagging systems, online forums, and community-based question answering (Q&A) systems have revolutionized the way people create and share content on the Web. However, because of the explosive growth of data in these systems, users face a serious information overload problem. Social computing techniques, achieved through learning with social media, have emerged as an important research area for helping social media users satisfy their information needs. In general, users post content that reflects their interests and expect to obtain suitable items through social computing techniques. To better understand users' interests, it is essential to analyze the different types of user-generated content. The information returned to users may be items, or other users with similar interests. Beyond user-based analysis, item-oriented study is also interesting and important, for example understanding items' characteristics and grouping semantically related items to better address users' information needs.

The objective of this thesis is to establish automatic and scalable models that help social media users satisfy their information needs more effectively. These models are built on the two key entities in social media systems: users and items. The thesis therefore develops a framework that combines user information and item information for two purposes: 1) modeling users' interests from their behavior, and recommending items or users they may be interested in; and 2) understanding items' characteristics, and grouping semantically related items to better address users' information needs.

For the first purpose, a novel unified matrix factorization framework that fuses different types of user behavior data is proposed for predicting users' interest in new items. The framework tackles the data sparsity problem and the inflexibility of traditional algorithms, which rely on a single source of information. Furthermore, to give users an automatic and effective way to discover other users with common interests, we propose a framework for user interest modeling and interest-based user recommendation that uses users' tagging information. Extensive evaluations on real-world data demonstrate the effectiveness of the proposed user-based models.

For the second purpose, a new functionality, question suggestion, is proposed for social media systems with Q&A functionality; it suggests questions that are semantically related to a queried question. Existing bag-of-words approaches cannot bridge the lexical chasm between semantically related questions that use different wording. We therefore present two models that combine lexical and latent semantic knowledge to measure the semantic relatedness among questions. In question analysis, current research lacks an understanding of questions' characteristics. To tackle this problem, a supervised approach is developed to identify question subjectivity. Moreover, we propose an approach that collects training data automatically from social signals, without any manual labeling. The experimental results show that our methods perform better than the state-of-the-art approaches.

In summary, based on the two key entities in social media systems, we present two user-based models and two item-oriented models that help social media users satisfy their information needs more accurately and effectively through learning with social media. Extensive experiments on various social media systems confirm the effectiveness of the proposed models.

Detailed summary in vernacular field only. / Zhou, Chao. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 130-163). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese.
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Overview
Chapter 1.2 --- Thesis Contribution
Chapter 1.3 --- Thesis Organization
Chapter 2 --- Background Review
Chapter 2.1 --- Recommender System Techniques
Chapter 2.1.1 --- Content-based Filtering
Chapter 2.1.2 --- Collaborative Filtering
Chapter 2.2 --- Information Retrieval Models
Chapter 2.2.1 --- Vector Space Model
Chapter 2.2.2 --- Probabilistic Model and Language Model
Chapter 2.2.3 --- Translation Model
Chapter 2.3 --- Machine Learning
Chapter 2.3.1 --- Supervised Learning
Chapter 2.3.2 --- Semi-Supervised Learning
Chapter 2.3.3 --- Unsupervised Learning
Chapter 2.4 --- Rating Prediction
Chapter 2.5 --- User Recommendation
Chapter 2.6 --- Automatic Question Answering
Chapter 2.6.1 --- Automatic Question Answering (Q&A) from the Web
Chapter 2.6.2 --- Proliferation of community-based Q&A services and online forums
Chapter 2.6.3 --- Automatic Question Answering (Q&A) in social media
Chapter 3 --- Item Recommendation with Tagging Ensemble
Chapter 3.1 --- Problem and Motivation
Chapter 3.2 --- TagRec Framework
Chapter 3.2.1 --- Preliminaries
Chapter 3.2.2 --- User-Item Rating Matrix Factorization
Chapter 3.2.3 --- User-Tag Tagging Matrix Factorization
Chapter 3.2.4 --- Item-Tag Tagging Matrix Factorization
Chapter 3.2.5 --- A Unified Matrix Factorization for TagRec
Chapter 3.2.6 --- Complexity Analysis
Chapter 3.3 --- Experimental Analysis
Chapter 3.3.1 --- Description of Data Set and Metrics
Chapter 3.3.2 --- Performance Comparison
Chapter 3.3.3 --- Impact of Parameters and
Chapter 3.4 --- Summary
Chapter 4 --- User Recommendation via Interest Modeling
Chapter 4.1 --- Problem and Motivation
Chapter 4.2 --- UserRec Framework
Chapter 4.2.1 --- User Interest Modeling
Chapter 4.2.2 --- Interest-based User Recommendation
Chapter 4.3 --- Experimental Analysis
Chapter 4.3.1 --- Dataset Description and Analysis
Chapter 4.3.2 --- Experimental Results
Chapter 4.4 --- Summary
Chapter 5 --- Item Suggestion with Semantic Analysis
Chapter 5.1 --- Problem and Motivation
Chapter 5.2 --- Question Suggestion Framework
Chapter 5.2.1 --- Question Suggestion in Online Forums
Chapter 5.2.2 --- Question Suggestion in Community-based Q&A Services
Chapter 5.3 --- Experiments and Results
Chapter 5.3.1 --- Experiments in Online Forums
Chapter 5.3.2 --- Experiments in Community-based Q&A Services
Chapter 5.4 --- Summary
Chapter 6 --- Item Modeling via Data-Driven Approach
Chapter 6.1 --- Problem and Motivation
Chapter 6.2 --- Question Subjectivity Identification
Chapter 6.2.1 --- Social Signal Investigation
Chapter 6.2.2 --- Feature Investigation
Chapter 6.3 --- Experimental Evaluation
Chapter 6.3.1 --- Experimental Setting
Chapter 6.3.2 --- Effectiveness of Social Signals
Chapter 6.3.3 --- Effectiveness of Heuristic Features
Chapter 6.4 --- Summary
Chapter 7 --- Conclusion
Chapter 7.1 --- Summary
Chapter 7.2 --- Future Work
Chapter A --- List of Publications
Chapter A.1 --- Conference Publications
Chapter A.2 --- Journal Publications
Chapter A.3 --- Under Review
Bibliography

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_328164
Date: January 2013
Contributors: Zhou, Chao, Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, bibliography
Format: electronic resource, electronic resource, remote, 1 online resource (xvi, 163 leaves) : ill. (some col.)
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
