81

A new approach for extracting inter-word semantic relationship from a contemporary Chinese thesaurus.

January 1995
by Lam Sze-sing. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 119-123). / Chapter CHAPTER 1 --- INTRODUCTION --- p.1 / Chapter 1.1 --- Introduction --- p.1 / Chapter 1.2 --- Statement of Thesis --- p.5 / Chapter 1.3 --- Organization of this Thesis --- p.6 / Chapter CHAPTER 2 --- RELATED WORK --- p.8 / Chapter 2.1 --- Overview --- p.8 / Chapter 2.2 --- Corpus-Based Knowledge Acquisition --- p.12 / Chapter 2.3 --- Linguistic-Based Knowledge Acquisition --- p.18 / Chapter 2.3.1 --- Knowledge Acquisition from Standard Dictionaries --- p.18 / Chapter 2.3.2 --- Knowledge Acquisition from Standard Thesauri --- p.23 / Chapter 2.4 --- Remarks --- p.24 / Chapter CHAPTER 3 --- A METHOD TO EXTRACT THE INTER-WORD SEMANTIC RELATIONSHIP FROM《同義詞詞林》 --- p.25 / Chapter 3.1 --- Background --- p.25 / Chapter 3.1.1 --- Structure of《同義詞詞林》 --- p.26 / Chapter 3.1.2 --- Knowledge Representation of a Machine Tractable Thesaurus --- p.28 / Chapter 3.1.3 --- Extracting the Semantic Knowledge by Simple Co-occurrence --- p.28 / Chapter 3.2 --- Association Network --- p.31 / Chapter 3.3 --- Semantic Association Model --- p.33 / Chapter 3.3.1 --- Problems with the Simple Co-occurrence Method --- p.34 / Chapter 3.3.2 --- Methodology of Semantic Association Model --- p.39 / Chapter 3.4 --- Inter-word Semantic Function --- p.51 / Chapter CHAPTER 4 --- NOUN-VERB-NOUN COMPOUND WORD DETECTION : AN EXPERIMENT --- p.55 / Chapter 4.1 --- Overview --- p.56 / Chapter 4.2 --- N-V-N Compound Word Detection Model --- p.61 / Chapter 4.3 --- Experimental Results of N-V-N Compound Word Detection --- p.63 / Chapter CHAPTER 5 --- WORD SENSE DISAMBIGUATION : AN APPLICATION --- p.66 / Chapter 5.1 --- Overview --- p.67 / Chapter 5.2 --- Word-Sense Disambiguation Model --- p.72 / Chapter 5.2.1 --- Linguistic Resource --- p.72 / Chapter 5.2.2 --- The LSD-C Algorithm --- p.73 / Chapter 5.2.3 --- LSD-C in Action --- p.78 / Chapter 5.3 --- Experimental Results of Word Sense Disambiguation --- p.83 / Chapter CHAPTER 6 --- CONCLUSIONS & FURTHER RESEARCH --- p.93 / Chapter 6.1 --- Conclusions --- p.93 / Chapter 6.2 --- Further Research --- p.96 / Chapter 6.2.1 --- Enriching the Knowledge --- p.96 / Chapter 6.2.2 --- Enhancing the N-V-N Compound Word Detection Model --- p.98 / Chapter 6.2.3 --- Enhancing the LSD-C Algorithm --- p.99 / APPENDICES --- p.101 / Appendix A - Dependency Grammar --- p.101 / Appendix B - Sample Articles from a Local Chinese Newspaper --- p.104 / Appendix C - Ambiguous Words with the Senses Given by《現代漢語詞典》 --- p.108 / Appendix D - List of Stop Words for the Testing Samples --- p.117 / REFERENCES --- p.119
82

Semi-automatic acquisition of domain-specific semantic structures.

January 2000
Siu, Kai-Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 99-106). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Thesis Outline --- p.5 / Chapter 2 --- Background --- p.6 / Chapter 2.1 --- Natural Language Understanding --- p.6 / Chapter 2.1.1 --- Rule-based Approaches --- p.7 / Chapter 2.1.2 --- Stochastic Approaches --- p.8 / Chapter 2.1.3 --- Phrase-Spotting Approaches --- p.9 / Chapter 2.2 --- Grammar Induction --- p.10 / Chapter 2.2.1 --- Semantic Classification Trees --- p.11 / Chapter 2.2.2 --- Simulated Annealing --- p.12 / Chapter 2.2.3 --- Bayesian Grammar Induction --- p.12 / Chapter 2.2.4 --- Statistical Grammar Induction --- p.13 / Chapter 2.3 --- Machine Translation --- p.14 / Chapter 2.3.1 --- Rule-based Approach --- p.15 / Chapter 2.3.2 --- Statistical Approach --- p.15 / Chapter 2.3.3 --- Example-based Approach --- p.16 / Chapter 2.3.4 --- Knowledge-based Approach --- p.16 / Chapter 2.3.5 --- Evaluation Method --- p.19 / Chapter 3 --- Semi-Automatic Grammar Induction --- p.20 / Chapter 3.1 --- Agglomerative Clustering --- p.20 / Chapter 3.1.1 --- Spatial Clustering --- p.21 / Chapter 3.1.2 --- Temporal Clustering --- p.24 / Chapter 3.1.3 --- Free Parameters --- p.26 / Chapter 3.2 --- Post-processing --- p.27 / Chapter 3.3 --- Chapter Summary --- p.29 / Chapter 4 --- Application to the ATIS Domain --- p.30 / Chapter 4.1 --- The ATIS Domain --- p.30 / Chapter 4.2 --- Parameters Selection --- p.32 / Chapter 4.3 --- Unsupervised Grammar Induction --- p.35 / Chapter 4.4 --- Prior Knowledge Injection --- p.40 / Chapter 4.5 --- Evaluation --- p.43 / Chapter 4.5.1 --- Parse Coverage in Understanding --- p.45 / Chapter 4.5.2 --- Parse Errors --- p.46 / Chapter 4.5.3 --- Analysis --- p.47 / Chapter 4.6 --- Chapter Summary --- p.49 / Chapter 5 --- Portability to Chinese --- p.50 / Chapter 5.1 --- Corpus Preparation --- p.50 / Chapter 5.1.1 --- Tokenization --- p.51 / Chapter 5.2 --- Experiments --- p.52 / Chapter 5.2.1 --- Unsupervised Grammar Induction --- p.52 / Chapter 5.2.2 --- Prior Knowledge Injection --- p.56 / Chapter 5.3 --- Evaluation --- p.58 / Chapter 5.3.1 --- Parse Coverage in Understanding --- p.59 / Chapter 5.3.2 --- Parse Errors --- p.60 / Chapter 5.4 --- Grammar Comparison Across Languages --- p.60 / Chapter 5.5 --- Chapter Summary --- p.64 / Chapter 6 --- Bi-directional Machine Translation --- p.65 / Chapter 6.1 --- Bilingual Dictionary --- p.67 / Chapter 6.2 --- Concept Alignments --- p.68 / Chapter 6.3 --- Translation Procedures --- p.73 / Chapter 6.3.1 --- The Matching Process --- p.74 / Chapter 6.3.2 --- The Searching Process --- p.76 / Chapter 6.3.3 --- Heuristics to Aid Translation --- p.81 / Chapter 6.4 --- Evaluation --- p.82 / Chapter 6.4.1 --- Coverage --- p.83 / Chapter 6.4.2 --- Performance --- p.86 / Chapter 6.5 --- Chapter Summary --- p.89 / Chapter 7 --- Conclusions --- p.90 / Chapter 7.1 --- Summary --- p.90 / Chapter 7.2 --- Future Work --- p.92 / Chapter 7.2.1 --- Suggested Improvements on Grammar Induction Process --- p.92 / Chapter 7.2.2 --- Suggested Improvements on Bi-directional Machine Translation --- p.96 / Chapter 7.2.3 --- Domain Portability --- p.97 / Chapter 7.3 --- Contributions --- p.97 / Bibliography --- p.99 / Chapter A --- Original SQL Queries --- p.107 / Chapter B --- Induced Grammar --- p.109 / Chapter C --- Seeded Categories --- p.111
83

Automatic construction of wrappers for semi-structured documents.

January 2001
Lin Wai-yip. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (leaves 114-123). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Information Extraction --- p.1 / Chapter 1.2 --- IE from Semi-structured Documents --- p.3 / Chapter 1.3 --- Thesis Contributions --- p.7 / Chapter 1.4 --- Thesis Organization --- p.9 / Chapter 2 --- Related Work --- p.11 / Chapter 2.1 --- Existing Approaches --- p.11 / Chapter 2.2 --- Limitations of Existing Approaches --- p.18 / Chapter 2.3 --- Our HISER Approach --- p.20 / Chapter 3 --- System Overview --- p.23 / Chapter 3.1 --- Hierarchical record Structure and Extraction Rule learning (HISER) --- p.23 / Chapter 3.2 --- Hierarchical Record Structure --- p.29 / Chapter 3.3 --- Extraction Rule --- p.29 / Chapter 3.4 --- Wrapper Adaptation --- p.32 / Chapter 4 --- Automatic Hierarchical Record Structure Construction --- p.34 / Chapter 4.1 --- Motivation --- p.34 / Chapter 4.2 --- Hierarchical Record Structure Representation --- p.36 / Chapter 4.3 --- Constructing Hierarchical Record Structure --- p.38 / Chapter 5 --- Extraction Rule Induction --- p.43 / Chapter 5.1 --- Rule Representation --- p.43 / Chapter 5.2 --- Extraction Rule Induction Algorithm --- p.47 / Chapter 6 --- Experimental Results of Wrapper Learning --- p.54 / Chapter 6.1 --- Experimental Methodology --- p.54 / Chapter 6.2 --- Results on Electronic Appliance Catalogs --- p.56 / Chapter 6.3 --- Results on Book Catalogs --- p.60 / Chapter 6.4 --- Results on Seminar Announcements --- p.62 / Chapter 7 --- Adapting Wrappers to Unseen Information Sources --- p.69 / Chapter 7.1 --- Motivation --- p.69 / Chapter 7.2 --- Support Vector Machines --- p.72 / Chapter 7.3 --- Feature Selection --- p.76 / Chapter 7.4 --- Automatic Annotation of Training Examples --- p.80 / Chapter 7.4.1 --- Building SVM Models --- p.81 / Chapter 7.4.2 --- Seeking Potential Training Example Candidates --- p.82 / Chapter 7.4.3 --- Classifying Potential Training Examples --- p.84 / Chapter 8 --- Experimental Results of Wrapper Adaptation --- p.86 / Chapter 8.1 --- Experimental Methodology --- p.86 / Chapter 8.2 --- Results on Electronic Appliance Catalogs --- p.89 / Chapter 8.3 --- Results on Book Catalogs --- p.93 / Chapter 9 --- Conclusions and Future Work --- p.97 / Chapter 9.1 --- Conclusions --- p.97 / Chapter 9.2 --- Future Work --- p.100 / Chapter A --- Sample Experimental Pages --- p.101 / Chapter B --- Detailed Experimental Results of Wrapper Adaptation of HISER --- p.109 / Bibliography --- p.114
84

A computational framework for mixed-initiative dialog modeling.

January 2002
Chan, Shuk Fong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 114-122). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview --- p.1 / Chapter 1.2 --- Thesis Contributions --- p.5 / Chapter 1.3 --- Thesis Outline --- p.9 / Chapter 2 --- Background --- p.10 / Chapter 2.1 --- Mixed-Initiative Interactions --- p.11 / Chapter 2.2 --- Mixed-Initiative Spoken Dialog Systems --- p.14 / Chapter 2.2.1 --- Finite-state Networks --- p.16 / Chapter 2.2.2 --- Form-based Approaches --- p.17 / Chapter 2.2.3 --- Sequential Decision Approaches --- p.18 / Chapter 2.2.4 --- Machine Learning Approaches --- p.20 / Chapter 2.3 --- Understanding Mixed-Initiative Dialogs --- p.24 / Chapter 2.4 --- Cooperative Response Generation --- p.26 / Chapter 2.4.1 --- Plan-based Approach --- p.27 / Chapter 2.4.2 --- Constraint-based Approach --- p.28 / Chapter 2.5 --- Chapter Summary --- p.29 / Chapter 3 --- Mixed-Initiative Dialog Management in the ISIS system --- p.30 / Chapter 3.1 --- The ISIS Domain --- p.31 / Chapter 3.1.1 --- System Overview --- p.31 / Chapter 3.1.2 --- Domain-Specific Constraints --- p.33 / Chapter 3.2 --- Discourse and Dialog --- p.34 / Chapter 3.2.1 --- Discourse Inheritance --- p.37 / Chapter 3.2.2 --- Mixed-Initiative Dialogs --- p.41 / Chapter 3.3 --- Challenges and New Directions --- p.45 / Chapter 3.3.1 --- A Learning System --- p.46 / Chapter 3.3.2 --- Combining Interaction and Delegation Subdialogs --- p.49 / Chapter 3.4 --- Chapter Summary --- p.57 / Chapter 4 --- Understanding Mixed-Initiative Human-Human Dialogs --- p.59 / Chapter 4.1 --- The CU Restaurants Domain --- p.60 / Chapter 4.2 --- "Task Goals, Dialog Acts, Categories and Annotation" --- p.61 / Chapter 4.2.1 --- Task Goals and Dialog Acts --- p.61 / Chapter 4.2.2 --- Semantic and Syntactic Categories --- p.64 / Chapter 4.2.3 --- Annotating the Training Sentences --- p.65 / Chapter 4.3 --- Selective Inheritance Strategy --- p.67 / Chapter 4.3.1 --- Category Inheritance Rules --- p.67 / Chapter 4.3.2 --- Category Refresh Rules --- p.73 / Chapter 4.4 --- Task Goal and Dialog Act Identification --- p.78 / Chapter 4.4.1 --- Belief Networks Development --- p.78 / Chapter 4.4.2 --- Varying the Input Dimensionality --- p.80 / Chapter 4.4.3 --- Evaluation --- p.80 / Chapter 4.5 --- Procedure for Discourse Inheritance --- p.83 / Chapter 4.6 --- Chapter Summary --- p.86 / Chapter 5 --- Cooperative Response Generation in Mixed-Initiative Dialog Modeling --- p.88 / Chapter 5.1 --- System Overview --- p.89 / Chapter 5.1.1 --- State Space Generation --- p.89 / Chapter 5.1.2 --- Task Goal and Dialog Act Generation for System Response --- p.92 / Chapter 5.1.3 --- Response Frame Generation --- p.93 / Chapter 5.1.4 --- Text Generation --- p.100 / Chapter 5.2 --- Experiments and Results --- p.100 / Chapter 5.2.1 --- Subjective Results --- p.103 / Chapter 5.2.2 --- Objective Results --- p.105 / Chapter 5.3 --- Chapter Summary --- p.105 / Chapter 6 --- Conclusions --- p.108 / Chapter 6.1 --- Summary --- p.108 / Chapter 6.2 --- Contributions --- p.110 / Chapter 6.3 --- Future Work --- p.111 / Bibliography --- p.113 / Chapter A --- Domain-Specific Task Goals in CU Restaurants Domain --- p.123 / Chapter B --- Full list of VERBMOBIL-2 Dialog Acts --- p.124 / Chapter C --- Dialog Acts for Customer Requests and Waiter Responses in CU Restaurants Domain --- p.125 / Chapter D --- The Two Grammars for Task Goal and Dialog Act Identification --- p.130 /
Chapter E --- Category Inheritance Rules --- p.143 / Chapter F --- Category Refresh Rules --- p.149 / Chapter G --- Full list of Response Trigger Words --- p.154 / Chapter H --- Evaluation Test Questionnaire for Dialog System in CU Restaurants Domain --- p.159 / Chapter I --- Details of the statistical testing Regarding Grice's Maxims and User Satisfaction --- p.161
85

Semi-automatic grammar induction for bidirectional machine translation.

January 2002
Wong, Chin Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 137-143). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Objectives --- p.3 / Chapter 1.2 --- Thesis Outline --- p.5 / Chapter 2 --- Background in Natural Language Understanding --- p.6 / Chapter 2.1 --- Rule-based Approaches --- p.7 / Chapter 2.2 --- Corpus-based Approaches --- p.8 / Chapter 2.2.1 --- Stochastic Approaches --- p.8 / Chapter 2.2.2 --- Phrase-spotting Approaches --- p.9 / Chapter 2.3 --- The ATIS Domain --- p.10 / Chapter 2.3.1 --- Chinese Corpus Preparation --- p.11 / Chapter 3 --- Semi-automatic Grammar Induction - Baseline Approach --- p.13 / Chapter 3.1 --- Background in Grammar Induction --- p.13 / Chapter 3.1.1 --- Simulated Annealing --- p.14 / Chapter 3.1.2 --- Bayesian Grammar Induction --- p.14 / Chapter 3.1.3 --- Probabilistic Grammar Acquisition --- p.15 / Chapter 3.2 --- Semi-automatic Grammar Induction - Baseline Approach --- p.16 / Chapter 3.2.1 --- Spatial Clustering --- p.16 / Chapter 3.2.2 --- Temporal Clustering --- p.18 / Chapter 3.2.3 --- Post-processing --- p.19 / Chapter 3.2.4 --- Four Aspects for Enhancements --- p.20 / Chapter 3.3 --- Chapter Summary --- p.22 / Chapter 4 --- Semi-automatic Grammar Induction - Enhanced Approach --- p.23 / Chapter 4.1 --- Evaluating Induced Grammars --- p.24 / Chapter 4.2 --- Stopping Criterion --- p.26 / Chapter 4.2.1 --- Cross-checking with Recall Values --- p.29 / Chapter 4.3 --- Improvements on Temporal Clustering --- p.32 / Chapter 4.3.1 --- Evaluation --- p.39 / Chapter 4.4 --- Improvements on Spatial Clustering --- p.46 / Chapter 4.4.1 --- Distance Measures --- p.48 / Chapter 4.4.2 --- Evaluation --- p.57 / Chapter 4.5 --- Enhancements based on Intelligent Selection --- p.62 / Chapter 4.5.1 --- Informed Selection between Spatial Clustering and Temporal Clustering --- p.62 / Chapter 4.5.2 --- Selecting the Number of Clusters Per Iteration --- p.64 / Chapter 4.5.3 --- An Example for Intelligent Selection --- p.64 / Chapter 4.5.4 --- Evaluation --- p.68 / Chapter 4.6 --- Chapter Summary --- p.71 / Chapter 5 --- Bidirectional Machine Translation using Induced Grammars - Baseline Approach --- p.73 / Chapter 5.1 --- Background in Machine Translation --- p.75 / Chapter 5.1.1 --- Rule-based Machine Translation --- p.75 / Chapter 5.1.2 --- Statistical Machine Translation --- p.76 / Chapter 5.1.3 --- Knowledge-based Machine Translation --- p.77 / Chapter 5.1.4 --- Example-based Machine Translation --- p.78 / Chapter 5.1.5 --- Evaluation --- p.79 / Chapter 5.2 --- Baseline Configuration on Bidirectional Machine Translation System --- p.84 / Chapter 5.2.1 --- Bilingual Dictionary --- p.84 / Chapter 5.2.2 --- Concept Alignments --- p.85 / Chapter 5.2.3 --- Translation Process --- p.89 / Chapter 5.2.4 --- Two Aspects for Enhancements --- p.90 / Chapter 5.3 --- Chapter Summary --- p.91 / Chapter 6 --- Bidirectional Machine Translation - Enhanced Approach --- p.92 / Chapter 6.1 --- Concept Alignments --- p.93 / Chapter 6.1.1 --- Enhanced Alignment Scheme --- p.95 / Chapter 6.1.2 --- Experiment --- p.97 / Chapter 6.2 --- Grammar Checker --- p.100 / Chapter 6.2.1 --- Components for Grammar Checking --- p.101 / Chapter 6.3 --- Evaluation --- p.117 / Chapter 6.3.1 --- Bleu Score Performance --- p.118 / Chapter 6.3.2 --- Modified Bleu Score --- p.122 / Chapter 6.4 --- Chapter Summary --- p.130 / Chapter 7 --- Conclusions --- p.131 / Chapter 7.1 --- Summary ---
p.131 / Chapter 7.2 --- Contributions --- p.134 / Chapter 7.3 --- Future work --- p.136 / Bibliography --- p.137 / Chapter A --- Original SQL Queries --- p.144 / Chapter B --- Seeded Categories --- p.146 / Chapter C --- 3 Alignment Categories --- p.147 / Chapter D --- Labels of Syntactic Structures in Grammar Checker --- p.148
86

Extracting causation knowledge from natural language texts.

January 2002
Chan Ki, Cecia. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 95-99). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Our Contributions --- p.4 / Chapter 1.2 --- Thesis Organization --- p.5 / Chapter 2 --- Related Work --- p.6 / Chapter 2.1 --- Using Knowledge-based Inferences --- p.7 / Chapter 2.2 --- Using Linguistic Techniques --- p.8 / Chapter 2.2.1 --- Using Linguistic Clues --- p.8 / Chapter 2.2.2 --- Using Graphical Patterns --- p.9 / Chapter 2.2.3 --- Using Lexicon-syntactic Patterns of Causative Verbs --- p.10 / Chapter 2.2.4 --- Comparisons with Our Approach --- p.10 / Chapter 2.3 --- Discovery of Extraction Patterns for Extracting Relations --- p.11 / Chapter 2.3.1 --- Snowball system --- p.12 / Chapter 2.3.2 --- DIRT system --- p.12 / Chapter 2.3.3 --- Comparisons with Our Approach --- p.13 / Chapter 3 --- Semantic Expectation-based Knowledge Extraction --- p.14 / Chapter 3.1 --- Semantic Expectations --- p.14 / Chapter 3.2 --- Semantic Template --- p.16 / Chapter 3.2.1 --- Causation Semantic Template --- p.16 / Chapter 3.3 --- Sentence Templates --- p.17 / Chapter 3.4 --- Consequence and Reason Templates --- p.22 / Chapter 3.5 --- Causation Knowledge Extraction Framework --- p.25 / Chapter 3.5.1 --- Template Design --- p.25 / Chapter 3.5.2 --- Sentence Screening --- p.27 / Chapter 3.5.3 --- Semantic Processing --- p.28 / Chapter 4 --- Using Thesaurus and Pattern Discovery for SEKE --- p.33 / Chapter 4.1 --- Using a Thesaurus --- p.34 / Chapter 4.2 --- Pattern Discovery --- p.37 / Chapter 4.2.1 --- Use of Semantic Expectation-based Knowledge Extraction --- p.37 / Chapter 4.2.2 --- Use of Part of Speech Information --- p.39 / Chapter 4.2.3 --- Pattern Representation --- p.39 / Chapter 4.2.4 --- Constructing the Patterns --- p.40 / Chapter 4.2.5 --- Merging the Patterns --- p.43 / Chapter 4.3 --- Pattern Matching --- p.44 / Chapter 4.3.1 --- Matching Score --- p.46 / Chapter 4.3.2 --- Support of Patterns --- p.48 / Chapter 4.3.3 --- Relevancy of Sentence Templates --- p.48 / Chapter 4.4 --- Applying the Newly Discovered Patterns --- p.49 / Chapter 5 --- Applying SEKE on Hong Kong Stock Market Domain --- p.52 / Chapter 5.1 --- Template Design --- p.53 / Chapter 5.1.1 --- Semantic Templates --- p.53 / Chapter 5.1.2 --- Sentence Templates --- p.53 / Chapter 5.1.3 --- Consequence and Reason Templates: --- p.55 / Chapter 5.2 --- Pattern Discovery --- p.58 / Chapter 5.2.1 --- Support of Patterns --- p.58 / Chapter 5.2.2 --- Relevancy of Sentence Templates --- p.58 / Chapter 5.3 --- Causation Knowledge Extraction Result --- p.58 / Chapter 5.3.1 --- Evaluation Approach --- p.61 / Chapter 5.3.2 --- Parameter Investigations --- p.61 / Chapter 5.3.3 --- Experimental Results --- p.65 / Chapter 5.3.4 --- Knowledge Discovered --- p.68 / Chapter 5.3.5 --- Parameter Effect --- p.75 / Chapter 6 --- Applying SEKE on Global Warming Domain --- p.80 / Chapter 6.1 --- Template Design --- p.80 / Chapter 6.1.1 --- Semantic Templates --- p.81 / Chapter 6.1.2 --- Sentence Templates --- p.81 / Chapter 6.1.3 --- Consequence and Reason Templates --- p.83 / Chapter 6.2 --- Pattern Discovery --- p.85 / Chapter 6.2.1 --- Support of Patterns --- p.85 / Chapter 6.2.2 --- Relevancy of Sentence Templates --- p.85 / Chapter 6.3 --- Global Warming Domain Result --- p.85 / Chapter 6.3.1 --- Evaluation Approach --- p.85 / Chapter 6.3.2 --- Experimental Results --- p.88 / Chapter 6.3.3 --- Knowledge Discovered --- p.89 
/ Chapter 7 --- Conclusions and Future Directions --- p.92 / Chapter 7.1 --- Conclusions --- p.92 / Chapter 7.2 --- Future Directions --- p.93 / Bibliography --- p.95 / Chapter A --- Penn Treebank Part of Speech Tags --- p.100
87

Automatic text categorization for information filtering.

January 1998
Ho Chao Yang. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 157-163). / Abstract also in Chinese. / Abstract --- p.i / Acknowledgment --- p.iii / List of Figures --- p.viii / List of Tables --- p.xiv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Automatic Document Categorization --- p.1 / Chapter 1.2 --- Information Filtering --- p.3 / Chapter 1.3 --- Contributions --- p.6 / Chapter 1.4 --- Organization of the Thesis --- p.7 / Chapter 2 --- Related Work --- p.9 / Chapter 2.1 --- Existing Automatic Document Categorization Approaches --- p.9 / Chapter 2.1.1 --- Rule-Based Approach --- p.10 / Chapter 2.1.2 --- Similarity-Based Approach --- p.13 / Chapter 2.2 --- Existing Information Filtering Approaches --- p.19 / Chapter 2.2.1 --- Information Filtering Systems --- p.19 / Chapter 2.2.2 --- Filtering in TREC --- p.21 / Chapter 3 --- Document Pre-Processing --- p.23 / Chapter 3.1 --- Document Representation --- p.23 / Chapter 3.2 --- Classification Scheme Learning Strategy --- p.26 / Chapter 4 --- A New Approach - IBRI --- p.31 / Chapter 4.1 --- Overview of Our New IBRI Approach --- p.31 / Chapter 4.2 --- The IBRI Representation and Definitions --- p.34 / Chapter 4.3 --- The IBRI Learning Algorithm --- p.37 / Chapter 5 --- IBRI Experiments --- p.43 / Chapter 5.1 --- Experimental Setup --- p.43 / Chapter 5.2 --- Evaluation Metric --- p.45 / Chapter 5.3 --- Results --- p.46 / Chapter 6 --- A New Approach - GIS --- p.50 / Chapter 6.1 --- Motivation of GIS --- p.50 / Chapter 6.2 --- Similarity-Based Learning --- p.51 / Chapter 6.3 --- The Generalized Instance Set Algorithm (GIS) --- p.58 / Chapter 6.4 --- Using GIS Classifiers for Classification --- p.63 / Chapter 6.5 --- Time Complexity --- p.64 / Chapter 7 --- GIS Experiments --- p.68 / Chapter 7.1 --- Experimental Setup --- p.68 / Chapter 7.2 --- Results --- p.73 / Chapter 8 --- A New Information Filtering Approach Based on GIS --- p.87 / Chapter 8.1 --- Information Filtering Systems --- p.87 / Chapter 8.2 --- GIS-Based Information Filtering --- p.90 / Chapter 9 --- Experiments on GIS-based Information Filtering --- p.95 / Chapter 9.1 --- Experimental Setup --- p.95 / Chapter 9.2 --- Results --- p.100 / Chapter 10 --- Conclusions and Future Work --- p.108 / Chapter 10.1 --- Conclusions --- p.108 / Chapter 10.2 --- Future Work --- p.110 / Chapter A --- Sample Documents in the corpora --- p.111 / Chapter B --- Details of Experimental Results of GIS --- p.120 / Chapter C --- Computational Time of Reuters-21578 Experiments --- p.141
88

Realization of automatic concept extraction for Chinese conceptual information retrieval = 中文槪念訊息檢索中自動槪念抽取的實踐 (Zhong wen gai nian xun xi jian suo zhong zi dong gai nian chou qu de shi jian).

January 1998
Wai Ip Lam. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 84-87). / Text in English; abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Background --- p.5 / Chapter 2.1 --- Information Retrieval --- p.5 / Chapter 2.1.1 --- Index Extraction --- p.6 / Chapter 2.1.2 --- Other Approaches to Extracting Indexes --- p.7 / Chapter 2.1.3 --- Conceptual Information Retrieval --- p.8 / Chapter 2.1.4 --- Information Extraction --- p.9 / Chapter 2.2 --- Natural Language Parsing --- p.9 / Chapter 2.2.1 --- Linguistics-based --- p.10 / Chapter 2.2.2 --- Corpus-based --- p.11 / Chapter 3 --- Concept Extraction --- p.13 / Chapter 3.1 --- Concepts in Sentences --- p.13 / Chapter 3.1.1 --- Semantic Structures and Thematic Roles --- p.13 / Chapter 3.1.2 --- Syntactic Functions --- p.14 / Chapter 3.2 --- Representing Concepts --- p.15 / Chapter 3.3 --- Application to Conceptual Information Retrieval --- p.18 / Chapter 3.4 --- Overview of Our Concept Extraction Model --- p.20 / Chapter 3.4.1 --- Corpus Training --- p.21 / Chapter 3.4.2 --- Sentence Analyzing --- p.22 / Chapter 4 --- Noun Phrase Detection --- p.23 / Chapter 4.1 --- Significance of Noun Phrase Detection --- p.23 / Chapter 4.1.1 --- Noun Phrases versus Terminals in Parse Trees --- p.23 / Chapter 4.1.2 --- Quantitative Analysis of Applying Noun Phrase Detection --- p.26 / Chapter 4.2 --- An Algorithm for Chinese Noun Phrase Partial Parsing --- p.28 / Chapter 4.2.1 --- The Hybrid Approach --- p.28 / Chapter 4.2.2 --- CNP3 - The Chinese NP Partial Parser --- p.30 / Chapter 5 --- Rule Extraction and SVO Parsing --- p.35 / Chapter 5.1 --- Annotation of Corpora --- p.36 / Chapter 5.1.1 --- Components of Chinese Sentence Patterns --- p.36 / Chapter 5.1.2 --- Annotating Sentence Structures --- p.37 / Chapter 5.1.3 --- Illustrative Examples --- p.38 / Chapter 5.2 --- Parsing with Rules Obtained Directly from Corpora --- p.43 / Chapter 5.2.1 --- Extracting Rules --- p.43 / Chapter 5.2.2 --- Parsing --- p.44 / Chapter 5.3 --- Using Word Specific Information --- p.45 / Chapter 6 --- Generalization of Rules --- p.48 / Chapter 6.1 --- Essence of Chinese Linguistics on Generalization --- p.49 / Chapter 6.1.1 --- Classification of Chinese Sentence Patterns --- p.50 / Chapter 6.1.2 --- Revision of Chinese Verb Phrase Classification --- p.52 / Chapter 6.2 --- Initial Generalization --- p.53 / Chapter 6.2.1 --- Generalizing Rules --- p.55 / Chapter 6.2.2 --- Dealing with Alternative Results --- p.58 / Chapter 6.2.3 --- Parsing --- p.58 / Chapter 6.2.4 --- An Illustrative Example --- p.59 / Chapter 6.3 --- Further Generalization --- p.60 / Chapter 7 --- Experiments on SVO Parsing --- p.62 / Chapter 7.1 --- Experimental Setup --- p.63 / Chapter 7.2 --- Effect of Adopting Noun Phrase Detection --- p.65 / Chapter 7.3 --- Results of Generalization --- p.68 / Chapter 7.4 --- Reliability Evaluation --- p.69 / Chapter 7.4.1 --- Convergence Sequence Tests --- p.69 / Chapter 7.4.2 --- Cross Evaluation Tests --- p.72 / Chapter 7.5 --- Overall Performance --- p.75 / Chapter 8 --- Conclusions --- p.79 / Chapter 8.1 --- Summary --- p.79 / Chapter 8.2 --- Contribution --- p.81 / Chapter 8.3 --- Future Directions --- p.81 / Chapter 8.3.1 --- Improvements in Parsing --- p.81 / Chapter 8.3.2 --- Concept Representations --- p.82 / Chapter 8.3.3 --- Non-IR Applications --- p.83 / Bibliography --- p.84 / Appendix --- p.88 / Chapter A --- The Extended Part of Speech Tag Set --- p.88
89

Pivot-based Statistical Machine Translation for Morphologically Rich Languages

Kholy, Ahmed El. January 2016
This thesis describes the research efforts on pivot-based statistical machine translation (SMT) for morphologically rich languages (MRL). We provide a framework to translate to and from morphologically rich languages especially in the context of having little or no parallel corpora between the source and the target languages. We basically address three main challenges. The first one is the sparsity of data as a result of morphological richness. The second one is maximizing the precision and recall of the pivoting process itself. And the last one is making use of any parallel data between the source and the target languages.

To address the challenge of data sparsity, we explored a space of tokenization schemes and normalization options. We also examined a set of six detokenization techniques to evaluate detokenized and orthographically corrected (enriched) output. We provide a recipe of the best settings to translate to one of the most challenging languages, namely Arabic. Our best model improves the translation quality over the baseline by 1.3 BLEU points. We also investigated the idea of separation between translation and morphology generation. We compared three methods of modeling morphological features. Features can be modeled as part of the core translation. Alternatively these features can be generated using target monolingual context. Finally, the features can be predicted using both source and target information. In our experimental results, we outperform the vanilla factored translation model. In order to decide on which features to translate, generate or predict, a detailed error analysis should be provided on the system output. As a result, we present AMEANA, an open-source tool for error analysis of natural language processing tasks, targeting morphologically rich languages.

The second challenge we are concerned with is the pivoting process itself. We discuss several techniques to improve the precision and recall of the pivot matching. One technique to improve the recall works on the level of the word alignment as an optimization process for pivoting driven by generating phrase pairs between source and target languages. Despite the fact that improving the recall of the pivot matching improves the overall translation quality, we also need to increase the precision of the pivot quality. To achieve this, we introduce quality constraints scores to determine the quality of the pivot phrase pairs between source and target languages. We show positive results for different language pairs which shows the consistency of our approaches. In one of our best models we reach an improvement of 1.2 BLEU points.

The third challenge we are concerned with is how to make use of any parallel data between the source and the target languages. We build on the approach of improving the precision of the pivoting process and the methods of combination between the pivot system and the direct system built from the parallel data. In one of the approaches, we introduce morphology constraint scores which are added to the log linear space of features in order to determine the quality of the pivot phrase pairs. We compare two methods of generating the morphology constraints. One method is based on hand-crafted rules relying on our knowledge of the source and target languages; while in the other method, the morphology constraints are induced from available parallel data between the source and target languages which we also use to build a direct translation model. We then combine both the pivot and direct models to achieve better coverage and overall translation quality. Using induced morphology constraints outperformed the handcrafted rules and improved over our best model from all previous approaches by 0.6 BLEU points (7.2/6.7 BLEU points from the direct and pivot baselines respectively).

Finally, we introduce applying smart techniques to combine pivot and direct models. We show that smart selective combination can lead to a large reduction of the pivot model without affecting the performance and in some cases improving it.
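As a concrete illustration of the pivoting idea described in this abstract, the following Python sketch triangulates a source-pivot and a pivot-target phrase table into source-target phrase pairs, with a simple score cutoff standing in for the quality constraints mentioned above. It is a minimal sketch under stated assumptions: the phrase tables, the example phrases, and the 0.05 threshold are invented toy values, not the thesis's actual models or settings.

```python
# Illustrative sketch of pivot-based phrase-table triangulation.
# All table contents and the quality threshold are made-up toy values,
# not data or settings from the thesis.
from collections import defaultdict

def triangulate(src2piv, piv2tgt, min_score=0.05):
    """Induce source->target phrase pairs through a shared pivot phrase.

    src2piv, piv2tgt: dicts mapping (phrase, phrase) -> translation probability.
    min_score: a simple cutoff on the combined score, loosely mimicking a
               quality constraint on pivot phrase pairs.
    """
    # Index pivot->target entries by pivot phrase for fast lookup.
    by_pivot = defaultdict(list)
    for (piv, tgt), p in piv2tgt.items():
        by_pivot[piv].append((tgt, p))

    src2tgt = defaultdict(float)
    for (src, piv), p_sp in src2piv.items():
        for tgt, p_pt in by_pivot.get(piv, []):
            # Marginalize over the pivot phrase: sum of probability products.
            src2tgt[(src, tgt)] += p_sp * p_pt

    # Keep only phrase pairs above the cutoff.
    return {pair: s for pair, s in src2tgt.items() if s >= min_score}

if __name__ == "__main__":
    src2piv = {("kitab", "book"): 0.7, ("kitab", "letter"): 0.1}
    piv2tgt = {("book", "libro"): 0.8, ("letter", "carta"): 0.9}
    for (src, tgt), score in sorted(triangulate(src2piv, piv2tgt).items()):
        print(f"{src} -> {tgt}: {score:.3f}")
```

Summing products of probabilities over shared pivot phrases is the standard triangulation heuristic; the thesis's contributions lie in the alignment-level recall improvements and the quality and morphology constraint scores that this toy cutoff only gestures at.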
90

Social Network Extraction from Text

Agarwal, Apoorv. January 2016
In the pre-digital age, when electronically stored information was non-existent, the only ways of creating representations of social networks were by hand through surveys, interviews, and observations. In this digital age of the internet, numerous indications of social interactions and associations are available electronically in an easy-to-access manner as structured meta-data. This lessens our dependence on manual surveys and interviews for creating and studying social networks. However, there are sources of networks that remain untouched simply because they are not associated with any meta-data. Primary examples of such sources include the vast amounts of literary texts, news articles, content of emails, and other forms of unstructured and semi-structured texts. The main contribution of this thesis is the introduction of natural language processing and applied machine learning techniques for uncovering social networks in such sources of unstructured and semi-structured texts. Specifically, we propose three novel techniques for mining social networks from three types of texts: unstructured texts (such as literary texts), emails, and movie screenplays. For each of these types of texts, we demonstrate the utility of the extracted networks on three applications (one for each type of text).
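To make the notion of mining a social network from unstructured text concrete, here is a minimal Python sketch that links candidate person names whenever they co-occur in a sentence. The capitalized-token heuristic, the sample sentences, and the resulting edge weights are all invented stand-ins; the thesis's actual techniques rely on proper named-entity recognition and interaction detection rather than simple co-occurrence.

```python
# Minimal sketch: build a co-occurrence "social network" from raw text.
# The capitalized-token heuristic is a crude stand-in for a real
# named-entity recognizer, and the sample text is invented.
import re
from collections import Counter
from itertools import combinations

def naive_person_mentions(sentence):
    # Treat runs of capitalized words as candidate person names (a toy NER).
    return set(re.findall(r"\b(?:[A-Z][a-z]+)(?:\s+[A-Z][a-z]+)*\b", sentence))

def cooccurrence_network(text):
    """Count how often two candidate names appear in the same sentence."""
    edges = Counter()
    for sentence in re.split(r"[.!?]+", text):
        names = sorted(naive_person_mentions(sentence))
        for a, b in combinations(names, 2):
            edges[(a, b)] += 1
    return edges

if __name__ == "__main__":
    sample = ("Elizabeth met Darcy at the ball. "
              "Darcy later wrote to Elizabeth. "
              "Jane spoke with Bingley.")
    for (a, b), weight in cooccurrence_network(sample).most_common():
        print(f"{a} -- {b}  (weight {weight})")
```

In practice the co-occurrence counts would be replaced by the entity and interaction extraction models the thesis proposes, but the output, a weighted graph over people, is the same kind of object.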
