Global ETD Search

91	Indexing Compressed Text He, Meng January 2003 (has links) As a result of the rapid growth of the volume of electronic data, text compression and indexing techniques are receiving more and more attention. These two issues are usually treated as independent problems, but approaches that combine them have recently attracted the attention of researchers. In this thesis, we review and test some of the more effective and some of the more theoretically interesting techniques. Various compression and indexing techniques are presented, and we also present two compressed text indices. Based on these techniques, we implement a compressed full-text index, so that compressed texts can be indexed to support fast queries without decompressing the whole text. The experiments show that our index is compact and supports fast search. Computer Science text compression text indexing
93	Algorithms for Scaled String Indexing and LCS Variants Peng, Yung-Hsing 20 July 2010 (has links) Related problems of string indexing and sequence analysis have been widely studied for a long time. Recently, researchers have turned to extended versions of these problems, which provide more realistic applications. In this dissertation, we focus on three problems of recent interest: (1) the indexing problem for scaled strings, (2) the merged longest common subsequence problem and its variant with blocks, and (3) the sequence alignment problem with weighted constraints. The indexing problem for scaled strings asks one to preprocess a text string T, so that the matched positions of a pattern string P in T, with some scale α applied to P, can be reported efficiently. In this dissertation, we propose efficient algorithms for indexing real scaled strings, discretely scaled strings, and proportionally scaled strings. Our indexing algorithms either significantly improve on previous results or achieve the best known results. The merged longest common subsequence (merged LCS) problem aims to detect the interleaving relationship between sequences, which has important applications to genomic and signal comparison. In this dissertation, we propose improved algorithms for finding the merged LCS, which are more efficient than previous results, especially for large alphabets. Finally, the sequence alignment problem with weighted constraints is newly proposed in this dissertation. For this new problem, we first propose an efficient solution, and then show that the concept of weighted constraints can be further used to solve many constraint-related problems on sequences. Our results therefore make significant contributions to the field of string indexing and sequence analysis. longest common subsequence scale string indexing algorithm
94	Summary-based document categorization with LSI Liu, Hsiao-Wen 14 February 2007 (has links) Text categorization, which automatically assigns documents to the appropriate predefined category or categories, is essential to retrieving desired documents efficiently and effectively from a huge text depository, e.g., the World Wide Web. Most techniques, however, suffer from the feature selection problem and the vocabulary mismatch problem. A few research works have addressed text categorization via text summarization to reduce the size of documents, and consequently the number of features to consider, while some have proposed using latent semantic indexing (LSI) to reveal the true meaning of a term via its association with other terms. Few works, however, have studied the joint effect of text summarization and this semantic dimension reduction technique. The objective of this research is thus to propose a practical approach, SBDR, to deal with the above difficulties in text categorization tasks. Two experiments are conducted to validate our proposed approach. In the first experiment, the results show that text summarization does improve categorization performance. In addition, to construct important sentences, the association terms of both noun-noun and noun-verb pairs should be considered. Results of the second experiment indicate slightly better performance when LSI is adopted exclusively (i.e., no summarization) than with SBDR (i.e., with summarization). Nonetheless, the minor reduction in accuracy is largely compensated for by the computation time saved when LSI is applied to summarized text. The feasibility of the SBDR approach is thus justified. Document Categorization Latent Semantic Indexing Text Summarization
95	Latent semantic web service directory and composition framework a thesis / Yick, (Winnie) Yuki B. Haungs, Michael L. January 1900 (has links) Thesis (M.S.)--California Polytechnic State University, 2009. / Mode of access: Internet. Title from PDF title page; viewed on Jan. 6, 2010. Major professor: Dr. Michael Haungs. "Presented to the faculty of California Polytechnic State University, San Luis Obispo." "In partial fulfillment of the requirements for the degree [of] Master of Science in Computer Science." "Aug 2009." Includes bibliographical references (p. 76-78).
96	Improved indexes for next generation bioinformatics applications Wu, Man-kit, Edward. January 2009 (has links) Thesis (M. Phil.)--University of Hong Kong, 2010. / Includes bibliographical references (leaves 69-72). Also available in print.
97	Application of the historic preservation index strategy to historic vernacular landscape Parmar, Sonal D. January 2008 (has links) Thesis (M.L.A.) -- University of Texas at Arlington, 2008.
98	User-defined classification on the online photo sharing site Flickr ... Or, How I learned to stop worrying and love the million typing monkeys Winget, Megan January 2006 (has links) This paper addresses the concerns related to authority and control through focused exploration and description of one of the more popular social tagging sites, Flickr (http://www.flickr.com). After providing a brief background and introduction to Flickr's social and practical functionalities, this paper focuses on describing the site's various tagging utilities and related exploration tools, addressing the tripartite concerns regarding the lack of vocabulary control, hierarchical organization, and the policies and procedures that allow for successful classification. Classification World Wide Web Indexing Knowledge Organization
99	Social Tagging and the Next Steps for Indexing Tennis, Joseph T. January 2006 (has links) Social tagging, as a particular type of indexing, has thrown into question the nature of indexing. Is it a democratic process? Can we all benefit from user-created tags? What about the value added by professionals? Employing an evolving framework analysis, this paper addresses the question: what is next for indexing? Comparing social tagging and subject cataloguing, this paper identifies the points of similarity and difference that obtain between these two kinds of information organization frameworks. The subsequent comparative analysis of the parts of these frameworks points to the nature of indexing as an authored, personal, situational, and referential act, where differences in discursive placement divide these two species. Furthermore, this act is contingent on implicit and explicit understanding of purpose and tools available. This analysis allows us to outline desiderata for the next steps in indexing. Indexing Knowledge Structures Knowledge Organization Information Analysis
100	The Notion of the "Concept Instance": Problems in Modeling Concept Change in SKOS (Draft Discussion Paper) Tennis, Joseph T., Sutton, Stuart, Hillmann, Diane January 2006 (has links) The U.S. National Science Foundation metadata registry under development for the National Science Digital Library (NSDL) is a repertory intended to manage both metadata schemes and schemas. The focus of this draft discussion paper is on the scheme side of the development work. In particular, the discussion paper is concerned with issues around the creation of historical snapshots of concept changes and their encoding in SKOS. By framing the problem as we see it, we hope to find an optimal solution to our need for a SKOS encoding of these snapshots. Since what we are seeking to model is concept change, it is necessary at the outset to make it clear that we are not talking about changes to a concept of such a nature that they would require the declaration of a new concept with its own URI. Classification World Wide Web Indexing Information Analysis
