Wong Chi Yin. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 107-114). / Abstract also in Chinese. / Abstract --- p.ii / Acknowledgements --- p.vi / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Introduction to Chinese IR --- p.1 / Chapter 1.2 --- Contributions --- p.3 / Chapter 1.3 --- Organization of this Thesis --- p.5 / Chapter 2 --- Background --- p.6 / Chapter 2.1 --- Indexing methods --- p.6 / Chapter 2.1.1 --- Full-text scanning --- p.7 / Chapter 2.1.2 --- Inverted files --- p.7 / Chapter 2.1.3 --- Signature files --- p.9 / Chapter 2.1.4 --- Clustering --- p.10 / Chapter 2.2 --- Information Retrieval Models --- p.10 / Chapter 2.2.1 --- Boolean model --- p.11 / Chapter 2.2.2 --- Vector space model --- p.11 / Chapter 2.2.3 --- Probabilistic model --- p.13 / Chapter 2.2.4 --- Logical model --- p.14 / Chapter 3 --- Investigation of Segmentation on the Vector Space Retrieval Model --- p.15 / Chapter 3.1 --- Segmentation of Chinese Texts --- p.16 / Chapter 3.1.1 --- Character-based segmentation --- p.16 / Chapter 3.1.2 --- Word-based segmentation --- p.18 / Chapter 3.1.3 --- N-Gram segmentation --- p.21 / Chapter 3.2 --- Performance Evaluation of Three Segmentation Approaches --- p.23 / Chapter 3.2.1 --- Experimental Setup --- p.23 / Chapter 3.2.2 --- Experimental Results --- p.24 / Chapter 3.2.3 --- Discussion --- p.29 / Chapter 4 --- Signature File Background --- p.32 / Chapter 4.1 --- Superimposed coding --- p.34 / Chapter 4.2 --- False drop probability --- p.36 / Chapter 5 --- Partitioned Signature File Based On Chinese Word Length --- p.39 / Chapter 5.1 --- Fixed Weight Block (FWB) Signature File --- p.41 / Chapter 5.2 --- Overview of PSFC --- p.45 / Chapter 5.3 --- Design Considerations --- p.50 / Chapter 6 --- New Hashing Techniques for Partitioned Signature Files --- p.59 / Chapter 6.1 --- Direct Division Method --- p.61 / Chapter 6.2 --- Random Number Assisted Division Method --- p.62 / 
Chapter 6.3 --- Frequency-based hashing method --- p.64 / Chapter 6.4 --- Chinese character-based hashing method --- p.68 / Chapter 7 --- Experiments and Results --- p.72 / Chapter 7.1 --- Performance evaluation of partitioned signature file based on Chinese word length --- p.74 / Chapter 7.1.1 --- Retrieval Performance --- p.75 / Chapter 7.1.2 --- Signature Reduction Ratio --- p.77 / Chapter 7.1.3 --- Storage Requirement --- p.79 / Chapter 7.1.4 --- Discussion --- p.81 / Chapter 7.2 --- Performance evaluation of different dynamic signature generation methods --- p.82 / Chapter 7.2.1 --- Collision --- p.84 / Chapter 7.2.2 --- Retrieval Performance --- p.86 / Chapter 7.2.3 --- Discussion --- p.89 / Chapter 8 --- Conclusions and Future Work --- p.91 / Chapter 8.1 --- Conclusions --- p.91 / Chapter 8.2 --- Future work --- p.95 / Chapter A --- Notations of Signature Files --- p.96 / Chapter B --- False Drop Probability --- p.98 / Chapter C --- Experimental Results --- p.103 / Bibliography --- p.107
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_322281 |
Date | January 1998 |
Contributors | Wong, Chi Yin., Chinese University of Hong Kong Graduate School. Division of Systems Engineering and Engineering Management. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, x, 114 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |