  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Text compression for Chinese documents.

January 1995
by Chi-kwun Kan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 133-137). / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Importance of Text Compression --- p.1 / Chapter 1.2 --- Historical Background of Data Compression --- p.2 / Chapter 1.3 --- The Essences of Data Compression --- p.4 / Chapter 1.4 --- Motivation and Objectives of the Project --- p.5 / Chapter 1.5 --- Definition of Important Terms --- p.6 / Chapter 1.5.1 --- Data Models --- p.6 / Chapter 1.5.2 --- Entropy --- p.10 / Chapter 1.5.3 --- Statistical and Dictionary-based Compression --- p.12 / Chapter 1.5.4 --- Static and Adaptive Modelling --- p.12 / Chapter 1.5.5 --- One-Pass and Two-Pass Modelling --- p.13 / Chapter 1.6 --- Benchmarks and Measurements of Results --- p.15 / Chapter 1.7 --- Sources of Testing Data --- p.16 / Chapter 1.8 --- Outline of the Thesis --- p.16 / Chapter 2 --- Literature Survey --- p.18 / Chapter 2.1 --- Data compression Algorithms --- p.18 / Chapter 2.1.1 --- Statistical Compression Methods --- p.18 / Chapter 2.1.2 --- Dictionary-based Compression Methods (Ziv-Lempel Family) --- p.23 / Chapter 2.2 --- Cascading of Algorithms --- p.33 / Chapter 2.3 --- Problems of Current Compression Programs on Chinese --- p.34 / Chapter 2.4 --- Previous Chinese Data Compression Literatures --- p.37 / Chapter 3 --- Chinese-related Issues --- p.38 / Chapter 3.1 --- Characteristics in Chinese Data Compression --- p.38 / Chapter 3.1.1 --- Large and Not Fixed Size Character Set --- p.38 / Chapter 3.1.2 --- Lack of Word Segmentation --- p.40 / Chapter 3.1.3 --- Rich Semantic Meaning of Chinese Characters --- p.40 / Chapter 3.1.4 --- Grammatical Variance of Chinese Language --- p.41 / Chapter 3.2 --- Definition of Different Coding Schemes --- p.41 / Chapter 3.2.1 --- Big5 Code --- p.42 / Chapter 3.2.2 --- GB (Guo Biao) Code --- p.43 / Chapter 3.2.3 --- Unicode --- p.44 / Chapter 
3.2.4 --- HZ (Hanzi) Code --- p.45 / Chapter 3.3 --- Entropy of Chinese and Other Languages --- p.45 / Chapter 4 --- Huffman Coding on Chinese Text --- p.49 / Chapter 4.1 --- The use of the Chinese Character Identification Routine --- p.50 / Chapter 4.2 --- Result --- p.51 / Chapter 4.3 --- Justification of the Result --- p.53 / Chapter 4.4 --- Time and Memory Resources Analysis --- p.58 / Chapter 4.5 --- The Heuristic Order-n Huffman Coding for Chinese Text Compression --- p.61 / Chapter 4.5.1 --- The Algorithm --- p.62 / Chapter 4.5.2 --- Result --- p.63 / Chapter 4.5.3 --- Justification of the Result --- p.64 / Chapter 4.6 --- Chapter Conclusion --- p.66 / Chapter 5 --- The Ziv-Lempel Compression on Chinese Text --- p.67 / Chapter 5.1 --- The Chinese LZSS Compression --- p.68 / Chapter 5.1.1 --- The Algorithm --- p.69 / Chapter 5.1.2 --- Result --- p.73 / Chapter 5.1.3 --- Justification of the Result --- p.74 / Chapter 5.1.4 --- Time and Memory Resources Analysis --- p.80 / Chapter 5.1.5 --- Effects in Controlling the Parameters --- p.81 / Chapter 5.2 --- The Chinese LZW Compression --- p.92 / Chapter 5.2.1 --- The Algorithm --- p.92 / Chapter 5.2.2 --- Result --- p.94 / Chapter 5.2.3 --- Justification of the Result --- p.95 / Chapter 5.2.4 --- Time and Memory Resources Analysis --- p.97 / Chapter 5.2.5 --- Effects in Controlling the Parameters --- p.98 / Chapter 5.3 --- A Comparison of the performance of the LZSS and the LZW --- p.100 / Chapter 5.4 --- Chapter Conclusion --- p.101 / Chapter 6 --- Chinese Dictionary-based Huffman coding --- p.103 / Chapter 6.1 --- The Algorithm --- p.104 / Chapter 6.2 --- Result --- p.107 / Chapter 6.3 --- Justification of the Result --- p.108 / Chapter 6.4 --- Effects of Changing the Size of the Dictionary --- p.111 / Chapter 6.5 --- Chapter Conclusion --- p.114 / Chapter 7 --- Cascading of Huffman coding and LZW compression --- p.116 / Chapter 7.1 --- Static Cascading Model --- p.117 / Chapter 7.1.1 --- The Algorithm --- 
p.117 / Chapter 7.1.2 --- Result --- p.120 / Chapter 7.1.3 --- Explanation and Analysis of the Result --- p.121 / Chapter 7.2 --- Adaptive (Dynamic) Cascading Model --- p.125 / Chapter 7.2.1 --- The Algorithm --- p.125 / Chapter 7.2.2 --- Result --- p.126 / Chapter 7.2.3 --- Explanation and Analysis of the Result --- p.127 / Chapter 7.3 --- Chapter Conclusion --- p.128 / Chapter 8 --- Concluding Remarks --- p.129 / Chapter 8.1 --- Conclusion --- p.129 / Chapter 8.2 --- Future Work Direction --- p.130 / Chapter 8.2.1 --- Improvement in Efficiency and Resources Consumption --- p.130 / Chapter 8.2.2 --- The Compressibility of Chinese and Other Languages --- p.131 / Chapter 8.2.3 --- Use of Grammar Model --- p.131 / Chapter 8.2.4 --- Lossy Compression --- p.131 / Chapter 8.3 --- Epilogue --- p.132 / Bibliography --- p.133
12

A comprehensive Chinese thesaurus system.

January 1995
by Chen Hong Yi. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 62-65). / Abstract --- p.ii / Acknowledgement --- p.iv / List of Tables --- p.viii / List of Figures --- p.ix / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Background Information And Thesis Scope --- p.6 / Chapter 2.1 --- Basic Concepts and Terminologies --- p.6 / Chapter 2.1.1 --- Semantic Classification Of A Word --- p.6 / Chapter 2.1.2 --- Relationship Link And Relationship Type --- p.7 / Chapter 2.1.3 --- "Semantic Closeness, Link Weight And Semantic Distance" --- p.8 / Chapter 2.1.4 --- Thesaurus Model And Semantic Net --- p.9 / Chapter 2.1.5 --- Thesaurus Building And Maintaining Tool --- p.9 / Chapter 2.2 --- Chinese Information Processing --- p.9 / Chapter 2.2.1 --- The Segmentation of Chinese Words --- p.10 / Chapter 2.2.2 --- The Ambiguity of Chinese Words --- p.10 / Chapter 2.2.3 --- Multiple Chinese Character Code Set Standards --- p.11 / Chapter 2.3 --- Related Work --- p.11 / Chapter 2.4 --- Thesis Scope --- p.13 / Chapter 3 --- System Design Principles --- p.15 / Chapter 3.1 --- Application Context Of TheSys --- p.15 / Chapter 3.2 --- Overall System Architecture --- p.16 / Chapter 3.3 --- Entry-Term Construct And Thesaurus Frame --- p.19 / Chapter 3.3.1 --- "Words, Entry Terms And Entry Term Construct" --- p.21 / Chapter 3.3.2 --- "Semanteme, Relationship And Thesaurus Frame" --- p.23 / Chapter 3.3.3 --- Dealing With Term Ambiguity --- p.28 / Chapter 3.4 --- Weighting Scheme --- p.33 / Chapter 3.4.1 --- Assumption --- p.33 / Chapter 3.4.2 --- Quantify The Relevancy Between Two Directly Linked Concepts --- p.34 / Chapter 3.4.3 --- Quantify The Relevancy Between Two Indirectly Linked Concepts --- p.35 / Chapter 3.5 --- Term Ranking --- p.38 / Chapter 3.6 --- Thesaurus Module and Maintenance Module --- p.39 / Chapter 3.6.1 --- The Procedure Of Building A Thesaurus --- p.40 / Chapter 3.6.2 --- Thesaurus Nomination --- p.41 / 
Chapter 3.6.3 --- Semantic Classification Tree Construction --- p.41 / Chapter 3.6.4 --- Relation Type Definition --- p.42 / Chapter 3.6.5 --- Entry Term Construct Construction --- p.42 / Chapter 3.6.6 --- Thesaurus Frame Construction --- p.43 / Chapter 3.6.7 --- Thesaurus Query --- p.44 / Chapter 4 --- System Implementation --- p.45 / Chapter 4.1 --- Data Structure --- p.45 / Chapter 4.1.1 --- Entry Term Construct --- p.45 / Chapter 4.1.2 --- Thesaurus Frame --- p.49 / Chapter 4.2 --- API --- p.50 / Chapter 4.3 --- User Interface --- p.54 / Chapter 4.3.1 --- Widget And Its Callback --- p.54 / Chapter 4.3.2 --- Bilingual User Interface --- p.55 / Chapter 4.3.3 --- Chinese Character Input Method --- p.57 / Chapter 5 --- Conclusion And Future Work --- p.60 / Chapter A --- System Installation --- p.66 / Chapter A.1 --- Files In TheSys --- p.67 / Chapter A.2 --- Employ TheSys As Application Package --- p.70 / Chapter A.3 --- Set Up TheSys With UI --- p.71 / Chapter A.4 --- Verify The Word Using External Dictionary --- p.74 / Chapter B --- API Description --- p.77 / Chapter B.1 --- thesys.h File --- p.77 / Chapter B.2 --- API Reference --- p.82 / Chapter C --- User Interface Reference --- p.108
13

Hybrid tag-set for natural language processing.

January 1999
Leung Wai Kwong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 90-95). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation --- p.1 / Chapter 1.2 --- Objective --- p.3 / Chapter 1.3 --- Organization of thesis --- p.3 / Chapter 2 --- Background --- p.5 / Chapter 2.1 --- Chinese Noun Phrases Parsing --- p.5 / Chapter 2.2 --- Chinese Noun Phrases --- p.6 / Chapter 2.3 --- Problems with Syntactic Parsing --- p.11 / Chapter 2.3.1 --- Conjunctive Noun Phrases --- p.11 / Chapter 2.3.2 --- De-de Noun Phrases --- p.12 / Chapter 2.3.3 --- Compound Noun Phrases --- p.13 / Chapter 2.4 --- Observations --- p.15 / Chapter 2.4.1 --- Inadequacy in Part-of-Speech Categorization for Chinese NLP --- p.16 / Chapter 2.4.2 --- The Need of Semantic in Noun Phrase Parsing --- p.17 / Chapter 2.5 --- Summary --- p.17 / Chapter 3 --- Hybrid Tag-set --- p.19 / Chapter 3.1 --- Objectives --- p.19 / Chapter 3.1.1 --- Resolving Parsing Ambiguities --- p.19 / Chapter 3.1.2 --- Investigation of Nominal Compound Noun Phrases --- p.20 / Chapter 3.2 --- Definition of Hybrid Tag-set --- p.20 / Chapter 3.3 --- Introduction to Cilin --- p.21 / Chapter 3.4 --- Problems with Cilin --- p.23 / Chapter 3.4.1 --- Unknown words --- p.23 / Chapter 3.4.2 --- Multiple Semantic Classes --- p.25 / Chapter 3.5 --- Introduction to Chinese Word Formation --- p.26 / Chapter 3.5.1 --- Disyllabic Word Formation --- p.26 / Chapter 3.5.2 --- Polysyllabic Word Formation --- p.28 / Chapter 3.5.3 --- Observation --- p.29 / Chapter 3.6 --- Automatic Assignment of Hybrid Tag to Chinese Word --- p.31 / Chapter 3.7 --- Summary --- p.34 / Chapter 4 --- Automatic Semantic Assignment --- p.35 / Chapter 4.1 --- Previous Researches on Semantic Tagging --- p.36 / Chapter 4.2 --- SAUW - Automatic Semantic Assignment of Unknown Words --- p.37 / Chapter 4.2.1 --- POS-to-SC Association (Process 1) --- p.38 / Chapter 4.2.2 --- 
Morphology-based Deduction (Process 2) --- p.39 / Chapter 4.2.3 --- Di-syllabic Word Analysis (Process 3 and 4) --- p.41 / Chapter 4.2.4 --- Poly-syllabic Word Analysis (Process 5) --- p.47 / Chapter 4.3 --- Illustrative Examples --- p.47 / Chapter 4.4 --- Evaluation and Analysis --- p.49 / Chapter 4.4.1 --- Experiments --- p.49 / Chapter 4.4.2 --- Error Analysis --- p.51 / Chapter 4.5 --- Summary --- p.52 / Chapter 5 --- Word Sense Disambiguation --- p.53 / Chapter 5.1 --- Introduction to Word Sense Disambiguation --- p.54 / Chapter 5.2 --- Previous Works on Word Sense Disambiguation --- p.55 / Chapter 5.2.1 --- Linguistic-based Approaches --- p.56 / Chapter 5.2.2 --- Corpus-based Approaches --- p.58 / Chapter 5.3 --- Our Approach --- p.60 / Chapter 5.3.1 --- Bi-gram Co-occurrence Probabilities --- p.62 / Chapter 5.3.2 --- Tri-gram Co-occurrence Probabilities --- p.63 / Chapter 5.3.3 --- Design consideration --- p.65 / Chapter 5.3.4 --- Error Analysis --- p.67 / Chapter 5.4 --- Summary --- p.68 / Chapter 6 --- Hybrid Tag-set for Chinese Noun Phrase Parsing --- p.69 / Chapter 6.1 --- Resolving Ambiguous Noun Phrases --- p.70 / Chapter 6.1.1 --- Experiment --- p.70 / Chapter 6.1.2 --- Results --- p.72 / Chapter 6.2 --- Summary --- p.78 / Chapter 7 --- Conclusion --- p.80 / Chapter 7.1 --- Summary --- p.80 / Chapter 7.2 --- Difficulties Encountered --- p.83 / Chapter 7.2.1 --- Lack of Training Corpus --- p.83 / Chapter 7.2.2 --- Features of Chinese word formation --- p.84 / Chapter 7.2.3 --- Problems with linguistic sources --- p.85 / Chapter 7.3 --- Contributions --- p.86 / Chapter 7.3.1 --- Enrichment to the Cilin --- p.86 / Chapter 7.3.2 --- Enhancement in syntactic parsing --- p.87 / Chapter 7.4 --- Further Researches --- p.88 / Chapter 7.4.1 --- Investigation into words that undergo semantic changes --- p.88 / Chapter 7.4.2 --- Incorporation of more information into the hybrid tag-set --- p.89 / Chapter A --- POS Tag-set by Tsinghua University (清華大學) --- p.96 / 
Chapter B --- Morphological Rules --- p.100 / Chapter C --- Syntactic Rules for Di-syllabic Words Formation --- p.104
14

Linguistic constraints for large vocabulary speech recognition.

January 1999
by Roger H.Y. Leung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 79-84). / Abstracts in English and Chinese. / ABSTRACT --- p.I / Keywords: --- p.I / ACKNOWLEDGEMENTS --- p.III / TABLE OF CONTENTS: --- p.IV / Table of Figures: --- p.VI / Table of Tables: --- p.VII / Chapter CHAPTER 1 --- INTRODUCTION --- p.1 / Chapter 1.1 --- Languages in the World --- p.2 / Chapter 1.2 --- Problems of Chinese Speech Recognition --- p.3 / Chapter 1.2.1 --- Unlimited word size: --- p.3 / Chapter 1.2.2 --- Too many Homophones: --- p.3 / Chapter 1.2.3 --- Difference between spoken and written Chinese: --- p.3 / Chapter 1.2.4 --- Word Segmentation Problem: --- p.4 / Chapter 1.3 --- Different types of knowledge --- p.5 / Chapter 1.4 --- Chapter Conclusion --- p.6 / Chapter CHAPTER 2 --- FOUNDATIONS --- p.7 / Chapter 2.1 --- Chinese Phonology and Language Properties --- p.7 / Chapter 2.1.1 --- Basic Syllable Structure --- p.7 / Chapter 2.2 --- Acoustic Models --- p.9 / Chapter 2.2.1 --- Acoustic Unit --- p.9 / Chapter 2.2.2 --- Hidden Markov Model (HMM) --- p.9 / Chapter 2.3 --- Search Algorithm --- p.11 / Chapter 2.4 --- Statistical Language Models --- p.12 / Chapter 2.4.1 --- Context-Independent Language Model --- p.12 / Chapter 2.4.2 --- Word-Pair Language Model --- p.13 / Chapter 2.4.3 --- N-gram Language Model --- p.13 / Chapter 2.4.4 --- Backoff n-gram --- p.14 / Chapter 2.5 --- Smoothing for Language Model --- p.16 / Chapter CHAPTER 3 --- LEXICAL ACCESS --- p.18 / Chapter 3.1 --- Introduction --- p.18 / Chapter 3.2 --- Motivation: Phonological and lexical constraints --- p.20 / Chapter 3.3 --- Broad Classes Representation --- p.22 / Chapter 3.4 --- Broad Classes Statistic Measures --- p.25 / Chapter 3.5 --- Broad Classes Frequency Normalization --- p.26 / Chapter 3.6 --- Broad Classes Analysis --- p.27 / Chapter 3.7 --- Isolated Word Speech Recognizer using Broad Classes --- p.33 / Chapter 3.8 --- Chapter Conclusion --- 
p.34 / Chapter CHAPTER 4 --- CHARACTER AND WORD LANGUAGE MODEL --- p.35 / Chapter 4.1 --- Introduction --- p.35 / Chapter 4.2 --- Motivation --- p.36 / Chapter 4.2.1 --- Perplexity --- p.36 / Chapter 4.3 --- Call Home Mandarin corpus --- p.38 / Chapter 4.3.1 --- Acoustic Data --- p.38 / Chapter 4.3.2 --- Transcription Texts --- p.39 / Chapter 4.4 --- Methodology: Building Language Model --- p.41 / Chapter 4.5 --- Character Level Language Model --- p.45 / Chapter 4.6 --- Word Level Language Model --- p.48 / Chapter 4.7 --- Comparison of Character level and Word level Language Model --- p.50 / Chapter 4.8 --- Interpolated Language Model --- p.54 / Chapter 4.8.1 --- Methodology --- p.54 / Chapter 4.8.2 --- Experiment Results --- p.55 / Chapter 4.9 --- Chapter Conclusion --- p.56 / Chapter CHAPTER 5 --- N-GRAM SMOOTHING --- p.57 / Chapter 5.1 --- Introduction --- p.57 / Chapter 5.2 --- Motivation --- p.58 / Chapter 5.3 --- Mathematical Representation --- p.59 / Chapter 5.4 --- Methodology: Smoothing techniques --- p.61 / Chapter 5.4.1 --- Add-one Smoothing --- p.62 / Chapter 5.4.2 --- Witten-Bell Discounting --- p.64 / Chapter 5.4.3 --- Good Turing Discounting --- p.66 / Chapter 5.4.4 --- Absolute and Linear Discounting --- p.68 / Chapter 5.5 --- Comparison of Different Discount Methods --- p.70 / Chapter 5.6 --- Continuous Word Speech Recognizer --- p.71 / Chapter 5.6.1 --- Experiment Setup --- p.71 / Chapter 5.6.2 --- Experiment Results: --- p.72 / Chapter 5.7 --- Chapter Conclusion --- p.74 / Chapter CHAPTER 6 --- SUMMARY AND CONCLUSIONS --- p.75 / Chapter 6.1 --- Summary --- p.75 / Chapter 6.2 --- Further Work --- p.77 / Chapter 6.3 --- Conclusion --- p.78 / REFERENCE --- p.79
15

Domain-optimized Chinese speech generation.

January 2001
Fung Tien Ying. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (leaves 119-128). / Abstracts in English and Chinese. / Abstract --- p.1 / Acknowledgement --- p.1 / List of Figures --- p.7 / List of Tables --- p.11 / Chapter 1 --- Introduction --- p.14 / Chapter 1.1 --- General Trends on Speech Generation --- p.15 / Chapter 1.2 --- Domain-Optimized Speech Generation in Chinese --- p.16 / Chapter 1.3 --- Thesis Organization --- p.17 / Chapter 2 --- Background --- p.19 / Chapter 2.1 --- Linguistic and Phonological Properties of Chinese --- p.19 / Chapter 2.1.1 --- Articulation --- p.20 / Chapter 2.1.2 --- Tones --- p.21 / Chapter 2.2 --- Previous Development in Speech Generation --- p.22 / Chapter 2.2.1 --- Articulatory Synthesis --- p.23 / Chapter 2.2.2 --- Formant Synthesis --- p.24 / Chapter 2.2.3 --- Concatenative Synthesis --- p.25 / Chapter 2.2.4 --- Existing Systems --- p.31 / Chapter 2.3 --- Our Speech Generation Approach --- p.35 / Chapter 3 --- Corpus-based Syllable Concatenation: A Feasibility Test --- p.37 / Chapter 3.1 --- Capturing Syllable Coarticulation with Distinctive Features --- p.39 / Chapter 3.2 --- Creating a Domain-Optimized Wavebank --- p.41 / Chapter 3.2.1 --- Generate-and-Filter --- p.44 / Chapter 3.2.2 --- Waveform Segmentation --- p.47 / Chapter 3.3 --- The Use of Multi-Syllable Units --- p.49 / Chapter 3.4 --- Unit Selection for Concatenative Speech Output --- p.50 / Chapter 3.5 --- A Listening Test --- p.51 / Chapter 3.6 --- Chapter Summary --- p.52 / Chapter 4 --- Scalability and Portability to the Stocks Domain --- p.55 / Chapter 4.1 --- Complexity of the ISIS Responses --- p.56 / Chapter 4.2 --- XML for input semantic and grammar representation --- p.60 / Chapter 4.3 --- Tree-Based Filtering Algorithm --- p.63 / Chapter 4.4 --- Energy Normalization --- p.67 / Chapter 4.5 --- Chapter Summary --- p.69 / Chapter 5 --- Investigation in Tonal Contexts --- p.71 / Chapter 5.1 --- The Nature 
of Tones --- p.74 / Chapter 5.1.1 --- Human Perception of Tones --- p.75 / Chapter 5.2 --- Relative Importance of Left and Right Tonal Context --- p.77 / Chapter 5.2.1 --- Tonal Contexts in the Date-Time Subgrammar --- p.77 / Chapter 5.2.2 --- Tonal Contexts in the Numeric Subgrammar --- p.82 / Chapter 5.2.3 --- Conclusion regarding the Relative Importance of Left versus Right Tonal Contexts --- p.86 / Chapter 5.3 --- Selection Scheme for Tonal Variants --- p.86 / Chapter 5.3.1 --- Listening Test for our Tone Backoff Scheme --- p.90 / Chapter 5.3.2 --- Error Analysis --- p.92 / Chapter 5.4 --- Chapter Summary --- p.94 / Chapter 6 --- Summary and Future Work --- p.95 / Chapter 6.1 --- Contributions --- p.97 / Chapter 6.2 --- Future Directions --- p.98 / Chapter A --- Listening Test Questionnaire for FOREX Response Generation --- p.100 / Chapter B --- Major Response Types For ISIS --- p.102 / Chapter C --- Recording Corpus for Tone Investigation in Date-time Subgrammar --- p.105 / Chapter D --- Statistical Test for Left Tonal Context --- p.109 / Chapter E --- Statistical Test for Right Tonal Context --- p.112 / Chapter F --- Listening Test Questionnaire for Backoff Unit Selection Scheme --- p.115 / Chapter G --- Statistical Test for the Backoff Unit Selection Scheme --- p.117 / Chapter H --- Statistical Test for the Backoff Unit Selection Scheme --- p.118 / Bibliography --- p.119
16

An investigation on Chinese noun phrase extraction.

January 2000
Chan Kun-Chung Timothy. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 79-83). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation --- p.1 / Chapter 1.2 --- Outline of Thesis --- p.3 / Chapter 2 --- Background --- p.5 / Chapter 2.1 --- Chinese Noun Phrase Structure --- p.5 / Chapter 2.2 --- Literature Review --- p.6 / Chapter 2.3 --- Observations --- p.10 / Chapter 2.4 --- Chapter Summary --- p.11 / Chapter 3 --- Maximal Chinese Noun Phrase Extraction System --- p.13 / Chapter 3.1 --- Background --- p.13 / Chapter 3.1.1 --- Part-of-speech Tagset --- p.13 / Chapter 3.1.2 --- The Tagging System --- p.14 / Chapter 3.1.3 --- Chinese Corpus --- p.16 / Chapter 3.1.4 --- Grammar Rules and Boundary Information --- p.17 / Chapter 3.1.5 --- Feature Selection --- p.19 / Chapter 3.2 --- Overview of Our Chinese Noun Phrase Extraction System --- p.19 / Chapter 3.2.1 --- Training --- p.19 / Chapter 3.2.2 --- Testing --- p.21 / Chapter 3.3 --- Chapter Summary --- p.21 / Chapter 4 --- Preliminary Noun Phrase Extraction --- p.23 / Chapter 4.1 --- Framework --- p.23 / Chapter 4.2 --- Boundary Information Acquisition --- p.24 / Chapter 4.3 --- Candidate Boundary Insertion --- p.26 / Chapter 4.4 --- Pairing of Candidate Boundaries --- p.27 / Chapter 4.4.1 --- Conditional Probability-based Model --- p.28 / Chapter 4.4.2 --- Heuristic-based Model --- p.29 / Chapter 4.4.3 --- Dynamic Programming-based Model --- p.30 / Chapter 4.4.4 --- Model Selection --- p.31 / Chapter 4.4.5 --- Revised Dynamic Programming Model --- p.32 / Chapter 4.4.6 --- Analysis of the Impact of the Revised DP Model --- p.35 / Chapter 4.4.7 --- Experiments of Dynamic Programming-based Model --- p.38 / Chapter 4.4.8 --- Result Analysis --- p.42 / Chapter 4.5 --- Concluding Remarks on DP-Based Model --- p.47 / Chapter 4.6 --- Chapter Summary --- p.49 / Chapter 5 --- Automatic Error Correction --- p.50 / Chapter 
5.1 --- Introduction --- p.50 / Chapter 5.1.1 --- Statistical Properties of TEL --- p.54 / Chapter 5.1.2 --- Related Applications --- p.55 / Chapter 5.2 --- Settings of Main Components --- p.57 / Chapter 5.2.1 --- Initial State --- p.58 / Chapter 5.2.2 --- Transformation Actions --- p.58 / Chapter 5.2.3 --- Triggering Features of Transformation Templates --- p.58 / Chapter 5.2.4 --- Evaluation of Rule --- p.62 / Chapter 5.2.5 --- Stopping Threshold --- p.62 / Chapter 5.3 --- Experiments and Results --- p.63 / Chapter 5.3.1 --- Setup and Procedure --- p.63 / Chapter 5.3.2 --- Overall Performance --- p.63 / Chapter 5.3.3 --- Contribution of Rules --- p.67 / Chapter 5.3.4 --- Remarks on Rules Learning --- p.69 / Chapter 5.3.5 --- Discussion on Recall Performance --- p.70 / Chapter 5.4 --- Chapter Summary --- p.73 / Chapter 6 --- Conclusion --- p.74 / Chapter 6.1 --- Summary --- p.74 / Chapter 6.2 --- Contributions --- p.76 / Chapter 6.3 --- Future Work --- p.76 / Bibliography --- p.79 / Chapter A --- Chinese POS Tag Set --- p.84 / Chapter B --- Algorithms of Boundary Pairing Models --- p.88 / Chapter B.1 --- Heuristic based Model --- p.88 / Chapter B.2 --- Dynamic Programming based Model --- p.89 / Chapter C --- Triggering Environments of Transformation Templates --- p.91
17

A generic Chinese PAT tree data structure for Chinese documents clustering.

January 2002
Kwok Chi Leong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 122-127). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgment --- p.vi / Table of Contents --- p.vii / List of Tables --- p.x / List of Figures --- p.xi / Chapter Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Contributions --- p.2 / Chapter 1.2 --- Thesis Overview --- p.3 / Chapter Chapter 2 --- Background Information --- p.5 / Chapter 2.1 --- Documents Clustering --- p.5 / Chapter 2.1.1 --- Review of Clustering Techniques --- p.5 / Chapter 2.1.2 --- Suffix Tree Clustering --- p.7 / Chapter 2.2 --- Chinese Information Processing --- p.8 / Chapter 2.2.1 --- Sentence Segmentation --- p.8 / Chapter 2.2.2 --- Keyword Extraction --- p.10 / Chapter Chapter 3 --- The Generic Chinese PAT Tree --- p.12 / Chapter 3.1 --- PAT Tree --- p.13 / Chapter 3.1.1 --- Patricia Tree --- p.13 / Chapter 3.1.2 --- Semi-Infinite String --- p.14 / Chapter 3.1.3 --- Structure of Tree Nodes --- p.17 / Chapter 3.1.4 --- Some Examples of PAT Tree --- p.22 / Chapter 3.1.5 --- Storage Complexity --- p.24 / Chapter 3.2 --- The Chinese PAT Tree --- p.26 / Chapter 3.2.1 --- The Chinese PAT Tree Structure --- p.26 / Chapter 3.2.2 --- Some Examples of Chinese PAT Tree --- p.30 / Chapter 3.2.3 --- Storage Complexity --- p.33 / Chapter 3.3 --- The Generic Chinese PAT Tree --- p.34 / Chapter 3.3.1 --- Structure Overview --- p.34 / Chapter 3.3.2 --- Structure of Tree Nodes --- p.35 / Chapter 3.3.3 --- Essential Node --- p.37 / Chapter 3.3.4 --- Some Examples of the Generic Chinese PAT Tree --- p.41 / Chapter 3.3.5 --- Storage Complexity --- p.45 / Chapter 3.4 --- Problems of Embedded Nodes --- p.46 / Chapter 3.4.1 --- The Reduced Structure --- p.47 / Chapter 3.4.2 --- Disadvantages of Reduced Structure --- p.48 / Chapter 3.4.3 --- A Case Study of Reduced Design --- p.50 / Chapter 3.4.4 --- Experiments on Frequency Mismatch --- p.51 / Chapter 3.5 --- Strengths 
of the Generic Chinese PAT Tree --- p.55 / Chapter Chapter 4 --- Performance Analysis on the Generic Chinese PAT Tree --- p.58 / Chapter 4.1 --- The Construction of the Generic Chinese PAT Tree --- p.59 / Chapter 4.2 --- Counting the Essential Nodes --- p.61 / Chapter 4.3 --- Performance of Various PAT Trees --- p.62 / Chapter 4.4 --- The Implementation Analysis --- p.64 / Chapter 4.4.1 --- Pure Dynamic Memory Allocation --- p.64 / Chapter 4.4.2 --- Node Production Factory Approach --- p.66 / Chapter 4.4.3 --- Experiment Result of the Factory Approach --- p.68 / Chapter Chapter 5 --- The Chinese Documents Clustering --- p.70 / Chapter 5.1 --- The Clustering Framework --- p.70 / Chapter 5.1.1 --- Documents Cleaning --- p.73 / Chapter 5.1.2 --- PAT Tree Construction --- p.76 / Chapter 5.1.3 --- Essential Node Extraction --- p.77 / Chapter 5.1.4 --- Base Clusters Detection --- p.80 / Chapter 5.1.5 --- Base Clusters Filtering --- p.86 / Chapter 5.1.6 --- Base Clusters Combining --- p.94 / Chapter 5.1.7 --- Documents Assigning --- p.95 / Chapter 5.1.8 --- Result Presentation --- p.96 / Chapter 5.2 --- Discussion --- p.96 / Chapter 5.2.1 --- Flexibility of Our Framework --- p.96 / Chapter 5.2.2 --- Our Clustering Model --- p.97 / Chapter 5.2.3 --- More About Clusters Detection --- p.98 / Chapter 5.2.4 --- Analysis and Complexity --- p.100 / Chapter Chapter 6 --- Evaluations on the Chinese Documents Clustering --- p.101 / Chapter 6.1 --- Details of Experiment --- p.101 / Chapter 6.1.1 --- Parameter of Weighted Frequency --- p.105 / Chapter 6.1.2 --- Effect of CLP Analysis --- p.105 / Chapter 6.1.3 --- Result of Clustering --- p.108 / Chapter 6.2 --- Clustering on Larger Collection --- p.109 / Chapter 6.2.1 --- Comparing the Base Clusters --- p.109 / Chapter 6.2.2 --- Result of Clustering --- p.111 / Chapter 6.2.3 --- Discussion --- p.112 / Chapter 6.3 --- Clustering with Part of Documents --- p.113 / Chapter 6.3.1 --- Clustering with News Headlines --- p.114 / Chapter 
6.3.2 --- Clustering with News Abstract --- p.117 / Chapter Chapter 7 --- Conclusion --- p.119 / Bibliography --- p.122
18

Automatic noun phrase extraction from full Chinese text. / CUHK electronic theses & dissertations collection

January 1997
by Li Wenjie. / Thesis (Ph.D.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (p. 209-226). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web.
19

A natural language based indexing technique for Chinese information retrieval.

January 1997
Pang Chun Kiu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 101-107). / Chapter 1 --- Introduction --- p.2 / Chapter 1.1 --- Chinese Indexing using Noun Phrases --- p.6 / Chapter 1.2 --- Objectives --- p.8 / Chapter 1.3 --- An Overview of the Thesis --- p.8 / Chapter 2 --- Background --- p.10 / Chapter 2.1 --- Technology Influences on Information Retrieval --- p.10 / Chapter 2.2 --- Related Work --- p.13 / Chapter 2.2.1 --- Statistical/Keyword Approaches --- p.13 / Chapter 2.2.2 --- Syntactical approaches --- p.15 / Chapter 2.2.3 --- Semantic approaches --- p.17 / Chapter 2.2.4 --- Noun Phrases Approach --- p.18 / Chapter 2.2.5 --- Chinese Information Retrieval --- p.20 / Chapter 2.3 --- Our Approach --- p.21 / Chapter 3 --- Chinese Noun Phrases --- p.23 / Chapter 3.1 --- Different types of Chinese Noun Phrases --- p.23 / Chapter 3.2 --- Ambiguous noun phrases --- p.27 / Chapter 3.2.1 --- Ambiguous English Noun Phrases --- p.27 / Chapter 3.2.2 --- Ambiguous Chinese Noun Phrases --- p.28 / Chapter 3.2.3 --- Statistical data on the three NPs --- p.33 / Chapter 4 --- Index Extraction from De-de Conj. 
NP --- p.35 / Chapter 4.1 --- Word Segmentation --- p.36 / Chapter 4.2 --- Part-of-speech tagging --- p.37 / Chapter 4.3 --- Noun Phrase Extraction --- p.37 / Chapter 4.4 --- The Chinese noun phrase partial parser --- p.38 / Chapter 4.5 --- Handling Parsing Ambiguity --- p.40 / Chapter 4.6 --- Index Building Strategy --- p.41 / Chapter 4.7 --- The cross-set generation rules --- p.44 / Chapter 4.8 --- Example 1: Indexing De-de NP --- p.46 / Chapter 4.9 --- Example 2: Indexing Conjunctive NP --- p.48 / Chapter 4.10 --- Experimental results and Discussion --- p.49 / Chapter 5 --- Indexing Compound Nouns --- p.52 / Chapter 5.1 --- Previous Researches on Compound Nouns --- p.53 / Chapter 5.2 --- Indexing two-term Compound Nouns --- p.55 / Chapter 5.2.1 --- About the thesaurus《同義詞詞林》 --- p.56 / Chapter 5.3 --- Indexing Compound Nouns of three or more terms --- p.58 / Chapter 5.4 --- Corpus learning approach --- p.59 / Chapter 5.4.1 --- An Example --- p.60 / Chapter 5.4.2 --- Experimental Setup --- p.63 / Chapter 5.4.3 --- An Experiment using the third level of the Cilin --- p.65 / Chapter 5.4.4 --- An Experiment using the second level of the Cilin --- p.66 / Chapter 5.5 --- Contextual Approach --- p.68 / Chapter 5.5.1 --- The algorithm --- p.69 / Chapter 5.5.2 --- An Illustrative Example --- p.71 / Chapter 5.5.3 --- Experiments on compound nouns --- p.72 / Chapter 5.5.4 --- Experiment I: Word Distance Based Extraction --- p.73 / Chapter 5.5.5 --- Experiment II: Semantic Class Based Extraction --- p.75 / Chapter 5.5.6 --- Experiments III: On different boundaries --- p.76 / Chapter 5.5.7 --- The Final Algorithm --- p.79 / Chapter 5.5.8 --- Experiments on other compounds --- p.82 / Chapter 5.5.9 --- Discussion --- p.83 / Chapter 6 --- Overall Effectiveness --- p.85 / Chapter 6.1 --- Illustrative Example for the Integrated Algorithm --- p.86 / Chapter 6.2 --- Experimental Setup --- p.90 / Chapter 6.3 --- Experimental Results & Discussion --- p.91 / Chapter 7 --- Conclusion 
--- p.95 / Chapter 7.1 --- Summary --- p.95 / Chapter 7.2 --- Contributions --- p.97 / Chapter 7.3 --- Future Directions --- p.98 / Chapter 7.3.1 --- Word-sense determination --- p.98 / Chapter 7.3.2 --- Hybrid approach for compound noun indexing --- p.99 / Chapter A --- Cross-set Generation Rules --- p.108 / Chapter B --- Tag set by Tsinghua University --- p.110 / Chapter C --- Noun Phrases Test Set --- p.113 / Chapter D --- Compound Nouns Test Set --- p.124 / Chapter D.1 --- Three-term Compound Nouns --- p.125 / Chapter D.1.1 --- NVN --- p.125 / Chapter D.1.2 --- Other three-term compound nouns --- p.129 / Chapter D.2 --- Four-term Compound Nouns --- p.133 / Chapter D.3 --- Five-term and six-term Compound Nouns --- p.134
20

Chinese information access through internet on X-open system.

January 1997
by Yao Jian. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 109-112). / Abstract --- p.i / Acknowledgments --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Basic Concepts And Related Work --- p.6 / Chapter 2.1 --- Codeset and Codeset Conversion --- p.7 / Chapter 2.2 --- HTML Language --- p.10 / Chapter 2.3 --- HTTP Protocol --- p.13 / Chapter 2.4 --- I18N And L10N --- p.18 / Chapter 2.5 --- Proxy Server --- p.19 / Chapter 2.6 --- Related Work --- p.20 / Chapter 3 --- Design Principles And System Architecture --- p.23 / Chapter 3.1 --- Use of Existing Web System --- p.23 / Chapter 3.1.1 --- Protocol --- p.23 / Chapter 3.1.2 --- Avoid Duplication of Documents for Different Codesets --- p.25 / Chapter 3.1.3 --- Support On-line Codeset Conversion Facility --- p.27 / Chapter 3.1.4 --- Provide Internationalized Interface of Web Browser --- p.28 / Chapter 3.2 --- Our Approach --- p.29 / Chapter 3.2.1 --- Enhancing the Existing Browsers and Servers --- p.30 / Chapter 3.2.2 --- Incorporating Proxies in Our Scheme --- p.32 / Chapter 3.2.3 --- Automatic Codeset Conversion --- p.34 / Chapter 3.3 --- Overall System Architecture --- p.38 / Chapter 3.3.1 --- Architecture of Our Web System --- p.38 / Chapter 3.3.2 --- Flexibility of Our Design --- p.40 / Chapter 3.3.3 --- Which side do the codeset conversion? 
--- p.42 / Chapter 3.3.4 --- Caching --- p.42 / Chapter 4 --- Design Details of An Enhanced Server --- p.44 / Chapter 4.1 --- Architecture of The Enhanced Server --- p.44 / Chapter 4.2 --- Procedure on Processing Client's Request --- p.45 / Chapter 4.3 --- Modifications of The Enhanced Server --- p.48 / Chapter 4.3.1 --- Interpretation of Client's Codeset Announcement --- p.48 / Chapter 4.3.2 --- Codeset Identification of Web Documents on the Server --- p.49 / Chapter 4.3.3 --- Codeset Notification to the Web Client --- p.52 / Chapter 4.3.4 --- Codeset Conversion --- p.54 / Chapter 4.4 --- Experiment Results --- p.54 / Chapter 5 --- Design Details of An Enhanced Browser --- p.58 / Chapter 5.1 --- Architecture of The Enhanced Browser --- p.58 / Chapter 5.2 --- Procedure on Processing Users' Requests --- p.61 / Chapter 5.3 --- Event Management and Handling --- p.63 / Chapter 5.3.1 --- Basic Control Flow of the Browser --- p.63 / Chapter 5.3.2 --- Event Handlers --- p.64 / Chapter 5.4 --- Internationalization of Browser Interface --- p.75 / Chapter 5.4.1 --- Locale --- p.76 / Chapter 5.4.2 --- Resource File --- p.77 / Chapter 5.4.3 --- Message Catalog System --- p.79 / Chapter 5.5 --- Experiment Result --- p.85 / Chapter 6 --- Another Scheme - CGI --- p.89 / Chapter 6.1 --- Form and CGI --- p.90 / Chapter 6.2 --- CGI Control Flow --- p.96 / Chapter 6.3 --- Automatic Codeset Detection --- p.96 / Chapter 6.3.1 --- Analysis of code range for GB and Big5 --- p.98 / Chapter 6.3.2 --- Control Flow of Automatic Codeset Detection --- p.99 / Chapter 6.4 --- Experiment Results --- p.101 / Chapter 7 --- Conclusions and Future Work --- p.104 / Chapter 7.1 --- Current Status --- p.105 / Chapter 7.2 --- System Efficiency --- p.106 / Chapter 7.3 --- Future Work --- p.107 / Bibliography --- p.109 / Chapter A --- Programmer's Guide --- p.113 / Chapter A.1 --- Data Structure --- p.113 / Chapter A.2 --- Calling Sequence of Functions --- p.114 / Chapter A.3 --- Modification of Source Code 
--- p.116 / Chapter A.4 --- Modification of Resources --- p.133 / Chapter B --- User Manual --- p.135
