by Leung Chi Hong.
Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 163-171).
Chapter 1. --- Introduction --- p.1
Chapter 2. --- Background Study of Natural Language Processing --- p.9
Chapter 2.1. --- Knowledge-based approach --- p.9
Chapter 2.1.1. --- Morphological analysis --- p.10
Chapter 2.1.2. --- Syntactic parsing --- p.11
Chapter 2.1.3. --- Semantic parsing --- p.16
Chapter 2.1.3.1. --- Semantic grammar --- p.19
Chapter 2.1.3.2. --- Case grammar --- p.20
Chapter 2.1.4. --- Problems of knowledge acquisition in knowledge-based approach --- p.22
Chapter 2.2. --- Corpus-based approach --- p.23
Chapter 2.2.1. --- Beginning of corpus-based approach --- p.23
Chapter 2.2.2. --- An example of corpus-based application: word tagging --- p.25
Chapter 2.2.3. --- Annotated corpus --- p.26
Chapter 2.2.4. --- State of the art in the corpus-based approach --- p.26
Chapter 2.3. --- Knowledge-based approach versus corpus-based approach --- p.28
Chapter 2.4. --- Co-operation between two different approaches --- p.32
Chapter 3. --- Induction Learning applied to Corpus-based Approach --- p.35
Chapter 3.1. --- General model of traditional corpus-based approach --- p.36
Chapter 3.1.1. --- Division of a problem into a number of sub-problems --- p.36
Chapter 3.1.2. --- Solution selected from a set of predefined choices --- p.36
Chapter 3.1.3. --- Solution selection based on a particular kind of linguistic entity --- p.37
Chapter 3.1.4. --- Statistical correlations between solutions and linguistic entities --- p.37
Chapter 3.1.5. --- Prediction of the best solution based on statistical correlations --- p.38
Chapter 3.2. --- First problem in the corpus-based approach: Irrelevance in the corpus --- p.39
Chapter 3.3. --- Induction learning --- p.41
Chapter 3.3.1. --- General issues about induction learning --- p.41
Chapter 3.3.2. --- Reasons of using induction learning in the corpus-based approach --- p.43
Chapter 3.3.3. --- General model of corpus-based induction learning approach --- p.45
Chapter 3.3.3.1. --- Preparation of positive corpus and negative corpus --- p.45
Chapter 3.3.3.2. --- Statistical correlations between solutions and linguistic entities --- p.46
Chapter 3.3.3.3. --- Combination of the statistical correlations obtained from the positive and negative corpora --- p.48
Chapter 3.4. --- Second problem in the corpus-based approach: Modification of initial probabilistic approximations --- p.50
Chapter 3.5. --- Learning feedback modification --- p.52
Chapter 3.5.1. --- Determination of which correlation scores to be modified --- p.52
Chapter 3.5.2. --- Determination of the magnitude of modification --- p.53
Chapter 3.5.3. --- A general algorithm of learning feedback modification --- p.56
Chapter 4. --- Identification of Phrases and Templates in Domain-specific Chinese Texts --- p.59
Chapter 4.1. --- Analysis of the problem solved by the traditional corpus-based approach --- p.61
Chapter 4.2. --- Phrase identification based on positive and negative corpora --- p.63
Chapter 4.3. --- Phrase identification procedure --- p.64
Chapter 4.3.1. --- Step 1: Phrase seed identification --- p.65
Chapter 4.3.2. --- Step 2: Phrase construction from phrase seeds --- p.65
Chapter 4.4. --- Template identification procedure --- p.67
Chapter 4.5. --- Experiment and result --- p.70
Chapter 4.5.1. --- Testing data --- p.70
Chapter 4.5.2. --- Details of experiments --- p.71
Chapter 4.5.3. --- Experimental results --- p.72
Chapter 4.5.3.1. --- Phrases and templates identified in financial news articles --- p.72
Chapter 4.5.3.2. --- Phrases and templates identified in political news articles --- p.73
Chapter 4.6. --- Conclusion --- p.74
Chapter 5. --- A Corpus-based Induction Learning Approach to Improving the Accuracy of Chinese Word Segmentation --- p.76
Chapter 5.1. --- Background of Chinese word segmentation --- p.77
Chapter 5.2. --- Typical methods of Chinese word segmentation --- p.78
Chapter 5.2.1. --- Syntactic and semantic approach --- p.78
Chapter 5.2.2. --- Statistical approach --- p.79
Chapter 5.2.3. --- Heuristic approach --- p.81
Chapter 5.3. --- Problems in word segmentation --- p.82
Chapter 5.3.1. --- Chinese word definition --- p.82
Chapter 5.3.2. --- Word dictionary --- p.83
Chapter 5.3.3. --- Word segmentation ambiguity --- p.84
Chapter 5.4. --- Corpus-based induction learning approach to improving word segmentation accuracy --- p.86
Chapter 5.4.1. --- Rationale of approach --- p.87
Chapter 5.4.2. --- Method of constructing modification rules --- p.89
Chapter 5.5. --- Experiment and results --- p.94
Chapter 5.6. --- Characteristics of modification rules constructed in experiment --- p.96
Chapter 5.7. --- Experiment constructing rules for compound words with suffixes --- p.98
Chapter 5.8. --- Relationship between modification frequency and Zipf's first law --- p.99
Chapter 5.9. --- Problems in the approach --- p.100
Chapter 5.10. --- Conclusion --- p.101
Chapter 6. --- Corpus-based Induction Learning Approach to Automatic Indexing of Controlled Index Terms --- p.103
Chapter 6.1. --- Background of automatic indexing --- p.103
Chapter 6.1.1. --- Definition of index term and indexing --- p.103
Chapter 6.1.2. --- Manual indexing versus automatic indexing --- p.105
Chapter 6.1.3. --- Different approaches to automatic indexing --- p.107
Chapter 6.2. --- Corpus-based induction learning approach to automatic indexing --- p.109
Chapter 6.2.1. --- Fundamental concept about corpus-based automatic indexing --- p.110
Chapter 6.2.2. --- Procedure of automatic indexing --- p.111
Chapter 6.2.2.1. --- Learning process --- p.112
Chapter 6.2.2.2. --- Indexing process --- p.118
Chapter 6.3. --- Experiments of corpus-based induction learning approach to automatic indexing --- p.118
Chapter 6.3.1. --- An experiment evaluating the complete procedures --- p.119
Chapter 6.3.1.1. --- Testing data used in the experiment --- p.119
Chapter 6.3.1.2. --- Details of the experiment --- p.119
Chapter 6.3.1.3. --- Experimental result --- p.121
Chapter 6.3.2. --- An experiment comparing with the traditional approach --- p.122
Chapter 6.3.3. --- An experiment determining the optimal indexing score threshold --- p.124
Chapter 6.3.4. --- An experiment measuring the precision and recall of indexing performance --- p.127
Chapter 6.4. --- Learning feedback modification --- p.128
Chapter 6.4.1. --- Positive feedback --- p.129
Chapter 6.4.2. --- Negative feedback --- p.131
Chapter 6.4.3. --- Change of indexed proportions of positive/negative training corpus in feedback iterations --- p.132
Chapter 6.4.4. --- An experiment evaluating the learning feedback modification --- p.134
Chapter 6.4.5. --- An experiment testing the significance factor in merging process --- p.136
Chapter 6.5. --- Conclusion --- p.138
Chapter 7. --- Conclusion --- p.140
Appendix A: Some examples of identified phrases in financial news articles --- p.149
Appendix B: Some examples of identified templates in financial news articles --- p.150
Appendix C: Some examples of texts containing the templates in financial news articles --- p.151
Appendix D: Some examples of identified phrases in political news articles --- p.152
Appendix E: Some examples of identified templates in political news articles --- p.153
Appendix F: Some examples of texts containing the templates in political news articles --- p.154
Appendix G: Syntactic tags used in word segmentation modification rule experiment --- p.155
Appendix H: An example of semantic approach to automatic indexing --- p.156
Appendix I: An example of syntactic approach to automatic indexing --- p.158
Appendix J: Samples of INSPEC and MEDLINE Records --- p.161
Appendix K: Examples of Promoting and Demoting Words --- p.162
References --- p.163
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_321659 |
Date | January 1996 |
Contributors | Leung, Chi Hong., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering. |
Publisher | Chinese University of Hong Kong |
Source Sets | The Chinese University of Hong Kong |
Language | English |
Detected Language | English |
Type | Text, bibliography |
Format | print, vii, 171 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |