Lau Tak Pang. / Thesis submitted in: October 2006. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (leaves 110-122). / Abstracts in English and Chinese. / Abstract --- p.i / Acknowledgement --- p.v / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation and Major Contributions --- p.1 / Chapter 1.1.1 --- Chinese Readability Analysis --- p.1 / Chapter 1.1.2 --- Web Readability Analysis --- p.3 / Chapter 1.2 --- Thesis Chapter Organization --- p.6 / Chapter 2 --- Related Work --- p.7 / Chapter 2.1 --- Readability Assessment --- p.7 / Chapter 2.1.1 --- Assessment for Text Document --- p.8 / Chapter 2.1.2 --- Assessment for Web Page --- p.13 / Chapter 2.2 --- Support Vector Machine --- p.14 / Chapter 2.2.1 --- Characteristics and Advantages --- p.14 / Chapter 2.2.2 --- Applications --- p.16 / Chapter 2.3 --- Chinese Word Segmentation --- p.16 / Chapter 2.3.1 --- Difficulty in Chinese Word Segmentation --- p.16 / Chapter 2.3.2 --- Approaches for Chinese Word Segmentation --- p.17 / Chapter 3 --- Chinese Readability Analysis --- p.20 / Chapter 3.1 --- Chinese Readability Factor Analysis --- p.20 / Chapter 3.1.1 --- Systematic Analysis --- p.20 / Chapter 3.1.2 --- Feature Extraction --- p.30 / Chapter 3.1.3 --- Limitation of Our Analysis and Possible Extension --- p.32 / Chapter 3.2 --- Research Methodology --- p.33 / Chapter 3.2.1 --- Definition of Readability --- p.33 / Chapter 3.2.2 --- Data Acquisition and Sampling --- p.34 / Chapter 3.2.3 --- Text Processing and Feature Extraction --- p.35 / Chapter 3.2.4 --- Regression Analysis using Support Vector Regression --- p.36 / Chapter 3.2.5 --- Evaluation --- p.36 / Chapter 3.3 --- Introduction to Support Vector Regression --- p.38 / Chapter 3.3.1 --- Basic Concept --- p.38 / Chapter 3.3.2 --- Non-Linear Extension using Kernel Technique --- p.41 / Chapter 3.4 --- Implementation Details --- p.42 / Chapter 3.4.1 --- Chinese Word Segmentation --- p.42 / Chapter 3.4.2 --- Building Basic Chinese Character / Word Lists --- p.47 / Chapter 3.4.3 --- Pull Sentence Detection --- p.49 / Chapter 3.4.4 --- Feature Selection Using Genetic Algorithm --- p.50 / Chapter 3.5 --- Experiments --- p.55 / Chapter 3.5.1 --- Experiment 1: Evaluation on Chinese Word Segmentation using the LMR-RC Tagging Scheme --- p.56 / Chapter 3.5.2 --- Experiment 2: Initial SVR Parameters Searching with Different Kernel Functions --- p.61 / Chapter 3.5.3 --- Experiment 3: Feature Selection Using Genetic Algorithm --- p.63 / Chapter 3.5.4 --- Experiment 4: Training and Cross-validation Performance using the Selected Feature Subset --- p.67 / Chapter 3.5.5 --- Experiment 5: Comparison with Linear Regression --- p.74 / Chapter 3.6 --- Summary and Future Work --- p.76 / Chapter 4 --- Web Readability Analysis --- p.78 / Chapter 4.1 --- Web Page Readability --- p.79 / Chapter 4.1.1 --- Readability as Comprehension Difficulty --- p.79 / Chapter 4.1.2 --- Readability as Grade Level --- p.81 / Chapter 4.2 --- Web Site Readability --- p.83 / Chapter 4.3 --- Experiments --- p.85 / Chapter 4.3.1 --- Experiment 1: Web Page Readability Analysis - Comprehension Difficulty --- p.87 / Chapter 4.3.2 --- Experiment 2: Web Page Readability Analysis - Grade Level --- p.92 / Chapter 4.3.3 --- Experiment 3: Web Site Readability Analysis --- p.98 / Chapter 4.4 --- Summary and Future Work --- p.101 / Chapter 5 --- Conclusion --- p.104 / Chapter A --- List of Symbols and Notations --- p.107 / Chapter B --- List of Publications --- p.110 / Bibliography --- p.113
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_325858 |
Date | January 2007 |
Contributors | Lau, Tak Pang., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, xiii, 122 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |