Ip Chun Wah Timmy.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.
Includes bibliographical references (leaves 113-120).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Topic Detection and Tracking --- p.2
Chapter 1.1.1 --- What is a Topic? --- p.3
Chapter 1.1.2 --- What is Topic Tracking? --- p.4
Chapter 1.2 --- Research Contributions --- p.4
Chapter 1.2.1 --- Named Entity Tagging --- p.5
Chapter 1.2.2 --- Handling Unknown Words --- p.6
Chapter 1.2.3 --- Named-Entity Approach in Topic Tracking --- p.7
Chapter 1.3 --- Organization of Thesis --- p.7
Chapter 2 --- Background --- p.9
Chapter 2.1 --- Previous Developments in Topic Tracking --- p.10
Chapter 2.1.1 --- BBN's Tracking System --- p.10
Chapter 2.1.2 --- CMU's Tracking System --- p.11
Chapter 2.1.3 --- Dragon's Tracking System --- p.12
Chapter 2.1.4 --- UPenn's Tracking System --- p.13
Chapter 2.2 --- Topic Tracking in Chinese --- p.13
Chapter 2.3 --- Part-of-Speech Tagging --- p.15
Chapter 2.3.1 --- A Brief Overview of POS Tagging --- p.15
Chapter 2.3.2 --- Transformation-based Error-Driven Learning --- p.18
Chapter 2.4 --- Unknown Word Identification --- p.20
Chapter 2.4.1 --- Rule-based approaches --- p.21
Chapter 2.4.2 --- Statistical approaches --- p.23
Chapter 2.4.3 --- Hybrid approaches --- p.24
Chapter 2.5 --- Information Retrieval Models --- p.25
Chapter 2.5.1 --- Vector-Space Model --- p.26
Chapter 2.5.2 --- Probabilistic Model --- p.27
Chapter 2.6 --- Chapter Summary --- p.28
Chapter 3 --- System Overview --- p.29
Chapter 3.1 --- Segmenter --- p.30
Chapter 3.2 --- TEL Tagger --- p.31
Chapter 3.3 --- Unknown Words Identifier --- p.32
Chapter 3.4 --- Topic Tracker --- p.33
Chapter 3.5 --- Chapter Summary --- p.34
Chapter 4 --- Named Entity Tagging --- p.36
Chapter 4.1 --- Experimental Data --- p.37
Chapter 4.2 --- Transformational Tagging --- p.41
Chapter 4.2.1 --- Notations --- p.41
Chapter 4.2.2 --- Corpus Utilization --- p.42
Chapter 4.2.3 --- Lexical Rules --- p.42
Chapter 4.2.4 --- Contextual Rules --- p.47
Chapter 4.3 --- Experiment and Result --- p.49
Chapter 4.3.1 --- Lexical Tag Initialization --- p.50
Chapter 4.3.2 --- Contribution of Lexical and Contextual Rules --- p.52
Chapter 4.3.3 --- Performance on Unknown Words --- p.56
Chapter 4.3.4 --- A Possible Benchmark --- p.57
Chapter 4.3.5 --- Comparison between TEL Approach and the Stochastic Approach --- p.58
Chapter 4.4 --- Chapter Summary --- p.59
Chapter 5 --- Handling Unknown Words in Topic Tracking --- p.62
Chapter 5.1 --- Overview --- p.63
Chapter 5.2 --- Person Names --- p.64
Chapter 5.2.1 --- Forming possible named entities from OOV by grouping n-grams --- p.66
Chapter 5.2.2 --- Overlapping --- p.69
Chapter 5.3 --- Organization Names --- p.71
Chapter 5.4 --- Location Names --- p.73
Chapter 5.5 --- Dates and Times --- p.74
Chapter 5.6 --- Chapter Summary --- p.75
Chapter 6 --- Topic Tracking in Chinese --- p.77
Chapter 6.1 --- Introduction of Topic Tracking --- p.78
Chapter 6.2 --- Experimental Data --- p.79
Chapter 6.3 --- Evaluation Methodology --- p.81
Chapter 6.3.1 --- Cost Function --- p.82
Chapter 6.3.2 --- DET Curve --- p.83
Chapter 6.4 --- The Named Entity Approach --- p.85
Chapter 6.4.1 --- Designing the Named Entities Set for Topic Tracking --- p.85
Chapter 6.4.2 --- Feature Selection --- p.86
Chapter 6.4.3 --- Integrated with Vector-Space Model --- p.87
Chapter 6.5 --- Experimental Results and Analysis --- p.91
Chapter 6.5.1 --- Notations --- p.92
Chapter 6.5.2 --- Stopword Elimination --- p.92
Chapter 6.5.3 --- TEL Tagging --- p.95
Chapter 6.5.4 --- Unknown Word Identifier --- p.100
Chapter 6.5.5 --- Error Analysis --- p.106
Chapter 6.6 --- Chapter Summary --- p.108
Chapter 7 --- Conclusions and Future Work --- p.110
Chapter 7.1 --- Conclusions --- p.110
Chapter 7.2 --- Future Work --- p.111
Bibliography --- p.113
Chapter A --- The POS Tags --- p.121
Chapter B --- Surnames and transliterated characters --- p.123
Chapter C --- Stopword List for Person Name --- p.126
Chapter D --- Organization suffixes --- p.127
Chapter E --- Location suffixes --- p.128
Chapter F --- Examples of Feature Table (Train set with condition D410) --- p.129
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_323266 |
Date | January 2000 |
Contributors | Ip, Chun Wah Timmy., Chinese University of Hong Kong Graduate School. Division of Systems Engineering and Engineering Management. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, xii, 136 leaves : ill. (some col.) ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |