by Chi-kwun Kan.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 133-137).
Abstract --- p.i
Acknowledgement --- p.iii
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Importance of Text Compression --- p.1
Chapter 1.2 --- Historical Background of Data Compression --- p.2
Chapter 1.3 --- The Essences of Data Compression --- p.4
Chapter 1.4 --- Motivation and Objectives of the Project --- p.5
Chapter 1.5 --- Definition of Important Terms --- p.6
Chapter 1.5.1 --- Data Models --- p.6
Chapter 1.5.2 --- Entropy --- p.10
Chapter 1.5.3 --- Statistical and Dictionary-based Compression --- p.12
Chapter 1.5.4 --- Static and Adaptive Modelling --- p.12
Chapter 1.5.5 --- One-Pass and Two-Pass Modelling --- p.13
Chapter 1.6 --- Benchmarks and Measurements of Results --- p.15
Chapter 1.7 --- Sources of Testing Data --- p.16
Chapter 1.8 --- Outline of the Thesis --- p.16
Chapter 2 --- Literature Survey --- p.18
Chapter 2.1 --- Data compression Algorithms --- p.18
Chapter 2.1.1 --- Statistical Compression Methods --- p.18
Chapter 2.1.2 --- Dictionary-based Compression Methods (Ziv-Lempel Family) --- p.23
Chapter 2.2 --- Cascading of Algorithms --- p.33
Chapter 2.3 --- Problems of Current Compression Programs on Chinese --- p.34
Chapter 2.4 --- Previous Chinese Data Compression Literatures --- p.37
Chapter 3 --- Chinese-related Issues --- p.38
Chapter 3.1 --- Characteristics in Chinese Data Compression --- p.38
Chapter 3.1.1 --- Large and Not Fixed Size Character Set --- p.38
Chapter 3.1.2 --- Lack of Word Segmentation --- p.40
Chapter 3.1.3 --- Rich Semantic Meaning of Chinese Characters --- p.40
Chapter 3.1.4 --- Grammatical Variance of Chinese Language --- p.41
Chapter 3.2 --- Definition of Different Coding Schemes --- p.41
Chapter 3.2.1 --- Big5 Code --- p.42
Chapter 3.2.2 --- GB (Guo Biao) Code --- p.43
Chapter 3.2.3 --- Unicode --- p.44
Chapter 3.2.4 --- HZ (Hanzi) Code --- p.45
Chapter 3.3 --- Entropy of Chinese and Other Languages --- p.45
Chapter 4 --- Huffman Coding on Chinese Text --- p.49
Chapter 4.1 --- The use of the Chinese Character Identification Routine --- p.50
Chapter 4.2 --- Result --- p.51
Chapter 4.3 --- Justification of the Result --- p.53
Chapter 4.4 --- Time and Memory Resources Analysis --- p.58
Chapter 4.5 --- The Heuristic Order-n Huffman Coding for Chinese Text Compression --- p.61
Chapter 4.5.1 --- The Algorithm --- p.62
Chapter 4.5.2 --- Result --- p.63
Chapter 4.5.3 --- Justification of the Result --- p.64
Chapter 4.6 --- Chapter Conclusion --- p.66
Chapter 5 --- The Ziv-Lempel Compression on Chinese Text --- p.67
Chapter 5.1 --- The Chinese LZSS Compression --- p.68
Chapter 5.1.1 --- The Algorithm --- p.69
Chapter 5.1.2 --- Result --- p.73
Chapter 5.1.3 --- Justification of the Result --- p.74
Chapter 5.1.4 --- Time and Memory Resources Analysis --- p.80
Chapter 5.1.5 --- Effects in Controlling the Parameters --- p.81
Chapter 5.2 --- The Chinese LZW Compression --- p.92
Chapter 5.2.1 --- The Algorithm --- p.92
Chapter 5.2.2 --- Result --- p.94
Chapter 5.2.3 --- Justification of the Result --- p.95
Chapter 5.2.4 --- Time and Memory Resources Analysis --- p.97
Chapter 5.2.5 --- Effects in Controlling the Parameters --- p.98
Chapter 5.3 --- A Comparison of the performance of the LZSS and the LZW --- p.100
Chapter 5.4 --- Chapter Conclusion --- p.101
Chapter 6 --- Chinese Dictionary-based Huffman coding --- p.103
Chapter 6.1 --- The Algorithm --- p.104
Chapter 6.2 --- Result --- p.107
Chapter 6.3 --- Justification of the Result --- p.108
Chapter 6.4 --- Effects of Changing the Size of the Dictionary --- p.111
Chapter 6.5 --- Chapter Conclusion --- p.114
Chapter 7 --- Cascading of Huffman coding and LZW compression --- p.116
Chapter 7.1 --- Static Cascading Model --- p.117
Chapter 7.1.1 --- The Algorithm --- p.117
Chapter 7.1.2 --- Result --- p.120
Chapter 7.1.3 --- Explanation and Analysis of the Result --- p.121
Chapter 7.2 --- Adaptive (Dynamic) Cascading Model --- p.125
Chapter 7.2.1 --- The Algorithm --- p.125
Chapter 7.2.2 --- Result --- p.126
Chapter 7.2.3 --- Explanation and Analysis of the Result --- p.127
Chapter 7.3 --- Chapter Conclusion --- p.128
Chapter 8 --- Concluding Remarks --- p.129
Chapter 8.1 --- Conclusion --- p.129
Chapter 8.2 --- Future Work Direction --- p.130
Chapter 8.2.1 --- Improvement in Efficiency and Resources Consumption --- p.130
Chapter 8.2.2 --- The Compressibility of Chinese and Other Languages --- p.131
Chapter 8.2.3 --- Use of Grammar Model --- p.131
Chapter 8.2.4 --- Lossy Compression --- p.131
Chapter 8.3 --- Epilogue --- p.132
Bibliography --- p.133
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_320605 |
Date | January 1995 |
Contributors | Kan, Chi-kwun., Chinese University of Hong Kong Graduate School. Division of Computer Science. |
Publisher | Chinese University of Hong Kong |
Source Sets | The Chinese University of Hong Kong |
Language | English |
Detected Language | English |
Type | Text, bibliography |
Format | print, xv, 137 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |