Global ETD Search

11	Concept-based biomedical text retrieval / Zhong, Ming. January 2007 (has links) Thesis (M.Sc.)--York University, 2007. Graduate Programme in Computer Science. / Typescript. Includes bibliographical references (leaves 96-101). Also available on the Internet. MODE OF ACCESS via web browser by entering the following URL: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&res_dat=xri:pqdiss&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:MR29634
12	Learning to classify text using support vector machines / Joachims, Thorsten. January 2002 (has links) Univ., Diss.--Dortmund, 2001. / Includes bibliographical references (p. [181] - 196) and index.
13	Active learning with committees : an approach to efficient learning in text categorization using linear threshold algorithms / Liere, Ray. January 1900 (has links) Thesis (Ph. D.)--Oregon State University, 2000. / Typescript (photocopy). Includes bibliographical references (leaves 282-294). Also available on the World Wide Web.
14	Topic representations for natural language applications / Lacatusu, Valeriu Finley. January 2007 (has links) Thesis (Ph.D.)--University of Texas at Dallas, 2007. / Includes vita. Includes bibliographical references (leaves 113-123)
15	Automated psychological categorization via linguistic processing system / Sutter, Christopher M., Eramo, Mark D. 09 1900 (has links) Approved for public release; distribution is unlimited / Influencing one's adversary has always been an objective in warfare. However, to date the majority of influence operations have been geared toward the masses or to very small numbers of individuals. Although marginally effective, this approach is inadequate with respect to larger numbers of high value targets and to specific subsets of the population. Limited human resources have prevented a more tailored approach, which would focus on segmentation, because individual targeting demands significant time from psychological analysts. This research examined whether or not Information Technology (IT) tools, specializing in text mining, are robust enough to automate the categorization/segmentation of individual profiles for the purpose of psychological operations (PSYOP). Research indicated that only a handful of software applications claimed to provide adequate functionality to perform these tasks. Text mining via neural networks was determined to be the best approach given the constraints of the profile data and the desired output. Five software applications were tested and evaluated for their ability to reproduce the results of a social psychologist. Through statistical analysis, it was concluded that the tested applications are not currently mature enough to produce accurate results that would enable automated segmentation of individual profiles based on supervised linguistic processing. / Captain, United States Marine Corps / Lieutenant, United States Navy / Social psychology; Psychological warfare; Text processing (Computer science); Computational linguistics
16	A method for finding common attributes in hetrogenous DoD databases / Zobair, Hamza A. 06 1900 (has links) Approved for public release; distribution is unlimited. / Traditional database development has been done for a specific, self-contained purpose with no plan to share or merge the data with other databases in the future. As these systems have matured, users have realized a requirement exists to share their data. Finding common attributes among databases is a time consuming task. However, it is one that is necessary as more and more corporations and agencies consolidate operations. In terms of DoD, the requirement to consolidate systems has come about, as the various data systems used by DoD agencies and our allies need to communicate with each other for a well-coordinated operation. One alternative for achieving the desired interconnectivity is to specify the requirement for interoperability in new systems. A more practical, less costly process is to merge existing systems and consolidate the common components. This paper proposes a process for consolidating portions of data dictionaries of two existing databases. The proposed method uses commercial-off-the-shelf software in finding common attributes between multiple databases and represents an improvement in accuracy and time over previous methods. / Database management; Electronic data processing; Text processing (Computer science)
17	Text compression for Chinese documents. January 1995 (has links) by Chi-kwun Kan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 133-137). / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Importance of Text Compression --- p.1 / Chapter 1.2 --- Historical Background of Data Compression --- p.2 / Chapter 1.3 --- The Essences of Data Compression --- p.4 / Chapter 1.4 --- Motivation and Objectives of the Project --- p.5 / Chapter 1.5 --- Definition of Important Terms --- p.6 / Chapter 1.5.1 --- Data Models --- p.6 / Chapter 1.5.2 --- Entropy --- p.10 / Chapter 1.5.3 --- Statistical and Dictionary-based Compression --- p.12 / Chapter 1.5.4 --- Static and Adaptive Modelling --- p.12 / Chapter 1.5.5 --- One-Pass and Two-Pass Modelling --- p.13 / Chapter 1.6 --- Benchmarks and Measurements of Results --- p.15 / Chapter 1.7 --- Sources of Testing Data --- p.16 / Chapter 1.8 --- Outline of the Thesis --- p.16 / Chapter 2 --- Literature Survey --- p.18 / Chapter 2.1 --- Data compression Algorithms --- p.18 / Chapter 2.1.1 --- Statistical Compression Methods --- p.18 / Chapter 2.1.2 --- Dictionary-based Compression Methods (Ziv-Lempel Family) --- p.23 / Chapter 2.2 --- Cascading of Algorithms --- p.33 / Chapter 2.3 --- Problems of Current Compression Programs on Chinese --- p.34 / Chapter 2.4 --- Previous Chinese Data Compression Literatures --- p.37 / Chapter 3 --- Chinese-related Issues --- p.38 / Chapter 3.1 --- Characteristics in Chinese Data Compression --- p.38 / Chapter 3.1.1 --- Large and Not Fixed Size Character Set --- p.38 / Chapter 3.1.2 --- Lack of Word Segmentation --- p.40 / Chapter 3.1.3 --- Rich Semantic Meaning of Chinese Characters --- p.40 / Chapter 3.1.4 --- Grammatical Variance of Chinese Language --- p.41 / Chapter 3.2 --- Definition of Different Coding Schemes --- p.41 / Chapter 3.2.1 --- Big5 Code --- p.42 / Chapter 3.2.2 --- GB (Guo Biao) Code --- p.43 / Chapter 3.2.3 --- Unicode --- p.44 / Chapter 3.2.4 --- HZ (Hanzi) Code --- p.45 / Chapter 3.3 --- Entropy of Chinese and Other Languages --- p.45 / Chapter 4 --- Huffman Coding on Chinese Text --- p.49 / Chapter 4.1 --- The use of the Chinese Character Identification Routine --- p.50 / Chapter 4.2 --- Result --- p.51 / Chapter 4.3 --- Justification of the Result --- p.53 / Chapter 4.4 --- Time and Memory Resources Analysis --- p.58 / Chapter 4.5 --- The Heuristic Order-n Huffman Coding for Chinese Text Compression --- p.61 / Chapter 4.5.1 --- The Algorithm --- p.62 / Chapter 4.5.2 --- Result --- p.63 / Chapter 4.5.3 --- Justification of the Result --- p.64 / Chapter 4.6 --- Chapter Conclusion --- p.66 / Chapter 5 --- The Ziv-Lempel Compression on Chinese Text --- p.67 / Chapter 5.1 --- The Chinese LZSS Compression --- p.68 / Chapter 5.1.1 --- The Algorithm --- p.69 / Chapter 5.1.2 --- Result --- p.73 / Chapter 5.1.3 --- Justification of the Result --- p.74 / Chapter 5.1.4 --- Time and Memory Resources Analysis --- p.80 / Chapter 5.1.5 --- Effects in Controlling the Parameters --- p.81 / Chapter 5.2 --- The Chinese LZW Compression --- p.92 / Chapter 5.2.1 --- The Algorithm --- p.92 / Chapter 5.2.2 --- Result --- p.94 / Chapter 5.2.3 --- Justification of the Result --- p.95 / Chapter 5.2.4 --- Time and Memory Resources Analysis --- p.97 / Chapter 5.2.5 --- Effects in Controlling the Parameters --- p.98 / Chapter 5.3 --- A Comparison of the performance of the LZSS and the LZW --- p.100 / Chapter 5.4 --- Chapter Conclusion --- p.101 / Chapter 6 --- Chinese Dictionary-based Huffman coding --- p.103 / Chapter 6.1 --- The Algorithm --- p.104 / Chapter 6.2 --- Result --- p.107 / Chapter 6.3 --- Justification of the Result --- p.108 / Chapter 6.4 --- Effects of Changing the Size of the Dictionary --- p.111 / Chapter 6.5 --- Chapter Conclusion --- p.114 / Chapter 7 --- Cascading of Huffman coding and LZW compression --- p.116 / Chapter 7.1 --- Static Cascading Model --- p.117 / Chapter 7.1.1 --- The Algorithm --- p.117 / Chapter 7.1.2 --- Result --- p.120 / Chapter 7.1.3 --- Explanation and Analysis of the Result --- p.121 / Chapter 7.2 --- Adaptive (Dynamic) Cascading Model --- p.125 / Chapter 7.2.1 --- The Algorithm --- p.125 / Chapter 7.2.2 --- Result --- p.126 / Chapter 7.2.3 --- Explanation and Analysis of the Result --- p.127 / Chapter 7.3 --- Chapter Conclusion --- p.128 / Chapter 8 --- Concluding Remarks --- p.129 / Chapter 8.1 --- Conclusion --- p.129 / Chapter 8.2 --- Future Work Direction --- p.130 / Chapter 8.2.1 --- Improvement in Efficiency and Resources Consumption --- p.130 / Chapter 8.2.2 --- The Compressibility of Chinese and Other Languages --- p.131 / Chapter 8.2.3 --- Use of Grammar Model --- p.131 / Chapter 8.2.4 --- Lossy Compression --- p.131 / Chapter 8.3 --- Epilogue --- p.132 / Bibliography --- p.133 / Text processing (Computer science); Chinese language--Data processing
18	On-line learning for adaptive text filtering. January 1999 (has links) Yu Kwok Leung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 91-96). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- The Problem --- p.1 / Chapter 1.2 --- Information Filtering --- p.2 / Chapter 1.3 --- Contributions --- p.7 / Chapter 1.4 --- Organization Of The Thesis --- p.10 / Chapter 2 --- Related Work --- p.12 / Chapter 3 --- Adaptive Text Filtering --- p.22 / Chapter 3.1 --- Representation --- p.22 / Chapter 3.1.1 --- Textual Document --- p.23 / Chapter 3.1.2 --- Filtering Profile --- p.28 / Chapter 3.2 --- On-line Learning Algorithms For Adaptive Text Filtering --- p.29 / Chapter 3.2.1 --- The Sleeping Experts Algorithm --- p.29 / Chapter 3.2.2 --- The EG-based Algorithms --- p.32 / Chapter 4 --- The REPGER Algorithm --- p.37 / Chapter 4.1 --- A New Approach --- p.37 / Chapter 4.2 --- Relevance Prediction By RElevant feature Pool --- p.42 / Chapter 4.3 --- Retrieving Good Training Examples --- p.45 / Chapter 4.4 --- Learning Dissemination Threshold Dynamically --- p.49 / Chapter 5 --- The Threshold Learning Algorithm --- p.50 / Chapter 5.1 --- Learning Dissemination Threshold Dynamically --- p.50 / Chapter 5.2 --- Existing Threshold Learning Techniques --- p.51 / Chapter 5.3 --- A New Threshold Learning Algorithm --- p.53 / Chapter 6 --- Empirical Evaluations --- p.55 / Chapter 6.1 --- Experimental Methodology --- p.55 / Chapter 6.2 --- Experimental Settings --- p.59 / Chapter 6.3 --- Experimental Results --- p.62 / Chapter 7 --- Integrating With Feature Clustering --- p.76 / Chapter 7.1 --- Distributional Clustering Algorithm --- p.79 / Chapter 7.2 --- Integrating With Our REPGER Algorithm --- p.82 / Chapter 7.3 --- Empirical Evaluation --- p.84 / Chapter 8 --- Conclusions --- p.87 / Chapter 8.1 --- Summary --- p.87 / Chapter 8.2 --- Future Work --- p.88 / Bibliography --- p.91 / Chapter A --- Experimental Results On The AP Corpus --- p.97 / Chapter A.1 --- The EG Algorithm --- p.97 / Chapter A.2 --- The EG-C Algorithm --- p.98 / Chapter A.3 --- The REPGER Algorithm --- p.100 / Chapter B --- Experimental Results On The FBIS Corpus --- p.102 / Chapter B.1 --- The EG Algorithm --- p.102 / Chapter B.2 --- The EG-C Algorithm --- p.103 / Chapter B.3 --- The REPGER Algorithm --- p.105 / Chapter C --- Experimental Results On The WSJ Corpus --- p.107 / Chapter C.1 --- The EG Algorithm --- p.107 / Chapter C.2 --- The EG-C Algorithm --- p.108 / Chapter C.3 --- The REPGER Algorithm --- p.110 / Text processing (Computer science); Information retrieval; Hypertext systems
19	A probabilistic approach for automatic text filtering. January 1998 (has links) Low Kon Fan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 165-168). / Abstract also in Chinese. / Abstract --- p.i / Acknowledgment --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview of Information Filtering --- p.1 / Chapter 1.2 --- Contributions --- p.4 / Chapter 1.3 --- Organization of this thesis --- p.6 / Chapter 2 --- Existing Approaches --- p.7 / Chapter 2.1 --- Representational issues --- p.7 / Chapter 2.1.1 --- Document Representation --- p.7 / Chapter 2.1.2 --- Feature Selection --- p.11 / Chapter 2.2 --- Traditional Approaches --- p.15 / Chapter 2.2.1 --- NewsWeeder --- p.15 / Chapter 2.2.2 --- NewT --- p.17 / Chapter 2.2.3 --- SIFT --- p.19 / Chapter 2.2.4 --- InRoute --- p.20 / Chapter 2.2.5 --- Motivation of Our Approach --- p.21 / Chapter 2.3 --- Probabilistic Approaches --- p.23 / Chapter 2.3.1 --- The Naive Bayesian Approach --- p.25 / Chapter 2.3.2 --- The Bayesian Independence Classifier Approach --- p.28 / Chapter 2.4 --- Comparison --- p.31 / Chapter 3 --- Our Bayesian Network Approach --- p.33 / Chapter 3.1 --- Backgrounds of Bayesian Networks --- p.34 / Chapter 3.2 --- Bayesian Network Induction Approach --- p.36 / Chapter 3.3 --- Automatic Construction of Bayesian Networks --- p.38 / Chapter 4 --- Automatic Feature Discretization --- p.50 / Chapter 4.1 --- Predefined Level Discretization --- p.52 / Chapter 4.2 --- Lloyd's algorithm --- p.53 / Chapter 4.3 --- Class Dependence Discretization --- p.55 / Chapter 5 --- Experiments and Results --- p.59 / Chapter 5.1 --- Document Collections --- p.60 / Chapter 5.2 --- Batch Filtering Experiments --- p.63 / Chapter 5.3 --- Batch Filtering Results --- p.65 / Chapter 5.4 --- Incremental Session Filtering Experiments --- p.87 / Chapter 5.5 --- Incremental Session Filtering Results --- p.88 / Chapter 6 --- Conclusions and Future Work --- p.105 / Appendix A --- p.107 / Appendix B --- p.116 / Appendix C --- p.126 / Appendix D --- p.131 / Appendix E --- p.145 / Text processing (Computer science); Bayesian statistical decision theory; Information retrieval
20	Multi-lingual text retrieval and mining. January 2003 (has links) Law Yin Yee. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaves 130-134). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cross-Lingual Information Retrieval (CLIR) --- p.2 / Chapter 1.2 --- Bilingual Term Association Mining --- p.5 / Chapter 1.3 --- Our Contributions --- p.6 / Chapter 1.3.1 --- CLIR --- p.6 / Chapter 1.3.2 --- Bilingual Term Association Mining --- p.7 / Chapter 1.4 --- Thesis Organization --- p.8 / Chapter 2 --- Related Work --- p.9 / Chapter 2.1 --- CLIR Techniques --- p.9 / Chapter 2.1.1 --- Existing Approaches --- p.9 / Chapter 2.1.2 --- Difference Between Our Model and Existing Approaches --- p.13 / Chapter 2.2 --- Bilingual Term Association Mining Techniques --- p.13 / Chapter 2.2.1 --- Existing Approaches --- p.13 / Chapter 2.2.2 --- Difference Between Our Model and Existing Approaches --- p.17 / Chapter 3 --- Cross-Lingual Information Retrieval (CLIR) --- p.18 / Chapter 3.1 --- Cross-Lingual Query Processing and Translation --- p.18 / Chapter 3.1.1 --- Query Context and Document Context Generation --- p.20 / Chapter 3.1.2 --- Context-Based Query Translation --- p.23 / Chapter 3.1.3 --- Query Term Weighting --- p.28 / Chapter 3.1.4 --- Final Weight Calculation --- p.30 / Chapter 3.2 --- Retrieval on Documents and Automated Summaries --- p.32 / Chapter 4 --- Experiments on Cross-Lingual Information Retrieval --- p.38 / Chapter 4.1 --- Experimental Setup --- p.38 / Chapter 4.2 --- Results of English-to-Chinese Retrieval --- p.45 / Chapter 4.2.1 --- Using Mono-Lingual Retrieval as the Gold Standard --- p.45 / Chapter 4.2.2 --- Using Human Relevance Judgments as the Gold Standard --- p.49 / Chapter 4.3 --- Results of Chinese-to-English Retrieval --- p.53 / Chapter 4.3.1 --- Using Mono-lingual Retrieval as the Gold Standard --- p.53 / Chapter 4.3.2 --- Using Human Relevance Judgments as the Gold Standard --- p.57 / Chapter 5 --- Discovering Comparable Multi-lingual Online News for Text Mining --- p.61 / Chapter 5.1 --- Story Representation --- p.62 / Chapter 5.2 --- Gloss Translation --- p.64 / Chapter 5.3 --- Comparable News Discovery --- p.67 / Chapter 6 --- Mining Bilingual Term Association Based on Co-occurrence --- p.75 / Chapter 6.1 --- Bilingual Term Cognate Generation --- p.75 / Chapter 6.2 --- Term Mining Algorithm --- p.77 / Chapter 7 --- Phonetic Matching --- p.87 / Chapter 7.1 --- Algorithm Design --- p.87 / Chapter 7.2 --- Discovering Associations of English Terms and Chinese Terms --- p.93 / Chapter 7.2.1 --- Converting English Terms into Phonetic Representation --- p.93 / Chapter 7.2.2 --- Discovering Associations of English Terms and Mandarin Chinese Terms --- p.100 / Chapter 7.2.3 --- Discovering Associations of English Terms and Cantonese Chinese Terms --- p.104 / Chapter 8 --- Experiments on Bilingual Term Association Mining --- p.111 / Chapter 8.1 --- Experimental Setup --- p.111 / Chapter 8.2 --- Result and Discussion of Bilingual Term Association Mining Based on Co-occurrence --- p.114 / Chapter 8.3 --- Result and Discussion of Phonetic Matching --- p.121 / Chapter 9 --- Conclusions and Future Work --- p.126 / Chapter 9.1 --- Conclusions --- p.126 / Chapter 9.1.1 --- CLIR --- p.126 / Chapter 9.1.2 --- Bilingual Term Association Mining --- p.127 / Chapter 9.2 --- Future Work --- p.128 / Bibliography --- p.134 / Chapter A --- Original English Queries --- p.135 / Chapter B --- Manual translated Chinese Queries --- p.137 / Chapter C --- Pronunciation symbols used by the PRONLEX Lexicon --- p.139 / Chapter D --- Initial Letter-to-Phoneme Tags --- p.141 / Chapter E --- English Sounds with their Chinese Equivalents --- p.143 / Text processing (Computer science); Data mining; Cross-language information retrieval
