Zhu, Yi.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (p. 136-148).
Abstracts in English and Chinese.

Abstract --- p.i
Acknowledgement --- p.vi
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Overview --- p.1
Chapter 1.2 --- Motivations of Detecting Commercial Intention --- p.4
Chapter 1.3 --- Problem Definition for Commercial Intention Detection --- p.6
Chapter 1.4 --- Contributions --- p.8
Chapter 1.5 --- Thesis Organization --- p.9
Chapter 2 --- Literature Review --- p.12
Chapter 2.1 --- Twitter and Tweets Analysis --- p.13
Chapter 2.2 --- Intention Detection --- p.17
Chapter 2.2.1 --- User Intention Mining --- p.17
Chapter 2.2.2 --- Commercial Intention Mining --- p.18
Chapter 2.3 --- Similar Task: Opinion Mining --- p.18
Chapter 2.4 --- NLP Techniques for Commercial Intention Detection --- p.20
Chapter 2.4.1 --- Words Semantic Similarity --- p.21
Chapter 2.4.2 --- Short Text Similarity --- p.25
Chapter 2.5 --- Hierarchical Classification --- p.26
Chapter 2.5.1 --- Hierarchical Classifiers Overview --- p.26
Chapter 2.5.2 --- Construction of Hierarchy --- p.27
Chapter 2.5.3 --- Taxonomy of Hierarchical Classification --- p.28
Chapter 3 --- System Overview --- p.31
Chapter 3.1 --- Feasibility of Commercial Intention Detection --- p.31
Chapter 3.2 --- System Design and Architecture --- p.33
Chapter 3.3 --- Components of READ-MIND --- p.35
Chapter 3.3.1 --- Preprocessing --- p.35
Chapter 3.3.2 --- Centroid Word Locator --- p.37
Chapter 3.3.3 --- Commercial Intention Detector --- p.38
Chapter 3.3.4 --- Tweet Classifier --- p.40
Chapter 3.3.5 --- Advertisement Mapping --- p.41
Chapter 3.4 --- System Work Flow --- p.42
Chapter 3.4.1 --- System Dataflow and Controlflow --- p.42
Chapter 3.4.2 --- User Interface --- p.42
Chapter 3.5 --- System Speed Up --- p.43
Chapter 3.6 --- Summary --- p.45
Chapter 4 --- Natural Language Processing on Tweets --- p.46
Chapter 4.1 --- NLP Techniques in READ-MIND --- p.46
Chapter 4.2 --- Centroid Word Locator --- p.47
Chapter 4.2.1 --- Centroid Word --- p.47
Chapter 4.2.2 --- Locating Centroid Word --- p.48
Chapter 4.2.3 --- Centroid Word Pair --- p.50
Chapter 4.2.4 --- Locating Centroid Word Pair --- p.54
Chapter 4.3 --- Semantic Relatedness Between Tweets --- p.59
Chapter 4.3.1 --- Relatedness with a Words Set --- p.60
Chapter 4.3.2 --- Relatedness between Tweets --- p.62
Chapter 4.3.3 --- Words Similarity --- p.63
Chapter 4.4 --- Summary --- p.65
Chapter 5 --- Tweets Classification --- p.66
Chapter 5.1 --- Two Stages of Tweets Classification --- p.66
Chapter 5.2 --- Commercial Intention Detector --- p.68
Chapter 5.2.1 --- Intuitive Method --- p.68
Chapter 5.2.2 --- Binary Classification --- p.70
Chapter 5.3 --- Tweet Categorization --- p.72
Chapter 5.3.1 --- Build Hierarchical Classifier --- p.73
Chapter 5.3.2 --- Hierarchical Classification --- p.81
Chapter 5.4 --- Summary --- p.83
Chapter 6 --- Empirical Study --- p.84
Chapter 6.1 --- Objective of Empirical Study --- p.84
Chapter 6.2 --- Experiment Setup and Evaluation Methodology --- p.85
Chapter 6.2.1 --- Simulation Environment --- p.85
Chapter 6.2.2 --- Tweets Data Set --- p.86
Chapter 6.2.3 --- Labeling Process --- p.87
Chapter 6.2.4 --- Evaluation Methodology --- p.88
Chapter 6.3 --- Compare Algorithms in Components --- p.90
Chapter 6.3.1 --- Centroid Word VS. Centroid Word Pair --- p.91
Chapter 6.3.2 --- Semantic Similarity Comparison --- p.92
Chapter 6.3.3 --- Methods in Commercial Intention Detector --- p.93
Chapter 6.3.4 --- Structure of Hierarchy --- p.94
Chapter 6.3.5 --- Training Source of Tweets Classifier --- p.95
Chapter 6.3.6 --- Summary --- p.96
Chapter 6.4 --- Parameter Settings Comparison --- p.97
Chapter 6.4.1 --- Impact of Varying Parameters --- p.97
Chapter 6.4.2 --- Discussion on Parameter Setting --- p.98
Chapter 6.5 --- Comparison of READ-MIND and Baseline Method --- p.100
Chapter 6.6 --- Time Cost Analysis --- p.101
Chapter 6.6.1 --- Time Cost to Process Tweets --- p.101
Chapter 6.6.2 --- Comparison with Baseline --- p.102
Chapter 6.6.3 --- Analysis on Real-Time Property --- p.103
Chapter 6.7 --- TCI Categories Comparison --- p.106
Chapter 6.7.1 --- Results for Different TCIs --- p.106
Chapter 6.7.2 --- Comparison of Different TCIs --- p.107
Chapter 6.8 --- Summary --- p.108
Chapter 7 --- Conclusion --- p.109
Chapter 7.1 --- Conclusion --- p.109
Chapter 7.2 --- Future Work --- p.111
Chapter A --- List of Abbreviations --- p.112
Chapter B --- List of Symbols --- p.114
Chapter C --- Proof --- p.117
Chapter D --- System Work Flow --- p.120
Chapter E --- Algorithms --- p.123
Chapter F --- Detailed Experimental Results --- p.129
Bibliography --- p.136

Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_327393 |
Date | January 2011 |
Contributors | Zhu, Yi., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, xv, 148 p. : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |