1 |
Arabic information retrieval system based on morphological analysis (AIRSMA) : a comparative study of word, stem, root and morpho-semantic methods
Al Tayyar, Musaid Seleh January 2000
No description available.
|
2 |
A study in distributed document retrieval
Estall, Craig January 1985
No description available.
|
3 |
Text knowledge : the Quirk Experiments
Holmes-Higgin, Paul January 1995
Our research examines text knowledge: the knowledge encoded in text and the knowledge about a text. We approach text knowledge from different perspectives, describing the theories and techniques that have been applied to extracting, representing and deploying this knowledge, and propose some novel techniques that may enhance the understanding of text knowledge. These techniques include the concept of virtual corpus hierarchies, hybrid symbolic and connectionist representation and reasoning, text analysis and self-organising corpora. We present these techniques in a framework that embraces the different facets of text knowledge as a whole, be it corpus organisation and text identification, text analysis, or knowledge representation and reasoning. This framework comprises three phases, that of organisation, analysis and evaluation of text, where a single text might be a complete work, a technical term, or even a single letter. The techniques proposed are demonstrated by implementations of computer systems and some experiments based on these implementations: the Quirk Experiments. Through these experiments we show how the highly interconnected nature of text knowledge can be reduced or abstracted for specific purposes, from a range of techniques based on explicit symbolic representations and self-organising connectionist schemes.
|
4 |
Integrating Knowledge Maps From Distributed Document Repositories
Yan, Ming-De 14 July 2003
In this thesis, we propose a knowledge map integration system that merges distributed knowledge maps into a global knowledge map, based on the concept mapping methodology. The system performs two functions: knowledge map integration and knowledge map maintenance. The integration function combines the local knowledge maps specified by distributed organizations into a global knowledge map, so that knowledge seekers can access the overall knowledge structure of the domain. The local knowledge maps in different organizations also change dynamically as information accumulates, which increases the demand for maintenance to keep the global knowledge map up to date. The maintenance function propagates the changes in every local knowledge map and updates the global structure at the same time. The knowledge map integration system is evaluated on the master's thesis repository at the National Central Library, with good results.
|
5 |
Federated Text Retrieval from Independent Collections
Shokouhi, Milad, milads@microsoft.com January 2008
Federated information retrieval is a technique for searching multiple text collections simultaneously. Queries are submitted to a subset of collections that are most likely to return relevant answers. The results returned by selected collections are integrated and merged into a single list. Federated search is preferred over centralized search alternatives in many environments. For example, commercial search engines such as Google cannot index uncrawlable hidden web collections; federated information retrieval systems can search the contents of hidden web collections without crawling. In enterprise environments, where each organization maintains an independent search engine, federated search techniques can provide parallel search over multiple collections. There are three major challenges in federated search. For each query, a subset of collections that are most likely to return relevant documents is selected. This creates the collection selection problem. To be able to select suitable collections, federated information retrieval systems acquire some knowledge about the contents of each collection, creating the collection representation problem. The results returned from the selected collections are merged before the final presentation to the user. This final step is the result merging problem. In this thesis, we propose new approaches for each of these problems. Our suggested methods for collection representation, collection selection, and result merging outperform state-of-the-art techniques in most cases. We also propose novel methods for estimating the number of documents in collections, and for pruning unnecessary information from collection representation sets. Although management of document duplication has been cited as one of the major problems in federated search, prior research in this area often assumes that collections are free of overlap. We investigate the effectiveness of federated search on overlapped collections, and propose new methods for maximizing the number of distinct relevant documents in the final merged results. In summary, this thesis introduces several new contributions to the field of federated information retrieval, including practical solutions to some historically unsolved problems in federated search, such as document duplication management. We test our techniques on multiple testbeds that simulate both hidden web and enterprise search environments.
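The collection selection and result merging steps described in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: the function names, the term-count representation of each collection, and the toy data are all hypothetical, and the merge step assumes that scores from different collections are already comparable, which a real federated system cannot take for granted.

```python
from collections import Counter


def select_collections(query, representations, k=2):
    """Rank collections by how often the query terms occur in their
    sampled representation (a hypothetical term-count summary)."""
    terms = query.lower().split()
    scores = {
        name: sum(rep[t] for t in terms)
        for name, rep in representations.items()
    }
    # Highest score first; ties broken by name so the order is stable.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


def merge_results(result_lists):
    """Merge per-collection (doc_id, score) lists into one ranking,
    keeping each document once with its best score (duplicate handling)."""
    best = {}
    for results in result_lists:
        for doc_id, score in results:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Toy data, invented for illustration only.
representations = {
    "news": Counter({"election": 5, "poll": 3}),
    "sports": Counter({"match": 7, "poll": 1}),
    "science": Counter({"protein": 4}),
}
selected = select_collections("election poll", representations, k=2)
merged = merge_results([
    [("d1", 0.9), ("d2", 0.4)],  # results from one selected collection
    [("d2", 0.7), ("d3", 0.5)],  # overlapping document d2 appears again
])
print(selected)
print(merged)
```

Keeping only the best-scoring copy of a duplicated document is one simple way to favour distinct documents in the merged list; the abstract does not say how the thesis itself handles overlap.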
|
6 |
Knowledge Map Discovery in Virtual Communities of Practice
Hsueh, Chih-Ming 17 July 2001
This thesis proposes a knowledge management technique, called a knowledge map management system, which consists of three components: knowledge navigation, knowledge seeking, and learning history advisory. The knowledge navigation function helps knowledge seekers to understand the knowledge structure. Similarly, the knowledge-seeking component helps knowledge seekers to retrieve knowledge together with its structure. The learning adviser generalizes the knowledge access patterns found in the problem-solving history and recommends knowledge during the problem-solving process. In this study, we prototype the knowledge map management system to facilitate knowledge sharing activities. To keep the knowledge map up to date, we also propose an incremental approach to maintaining the knowledge structure as knowledge artifacts are added. The knowledge map creation and maintenance functions are evaluated using documents collected from SCTNet and the master's thesis repository at the National Central Library, and yield encouraging results.
|
7 |
Implications of Punctuation Mark Normalization on Text Retrieval
Kim, Eungi 08 1900
This research investigated issues related to normalizing punctuation marks from a text retrieval perspective. A punctuation-centric approach was undertaken by exploring changes in meaning, whitespace, word retrievability, and other issues related to normalizing punctuation marks. To investigate punctuation normalization issues, various frequency counts of punctuation marks and punctuation patterns were conducted using text drawn from the Project Gutenberg archive and the Usenet Newsgroup archive. A number of useful punctuation mark types that could aid in analyzing punctuation marks were discovered. This study identified two types of punctuation normalization procedures: (1) lexical independent (LI) punctuation normalization and (2) lexical oriented (LO) punctuation normalization. Using these two types of procedures, this study discovered various effects of punctuation normalization in terms of different search query types. Analyzing the punctuation normalization problem in this manner revealed a wide range of issues, such as the need to define different types of searching, to disambiguate the role of punctuation marks, to normalize whitespace, and to index punctuated terms. This study concluded that, to achieve the most positive effect in a text retrieval environment, normalizing punctuation marks should be based on an extensive systematic analysis of punctuation marks, punctuation patterns, and their related factors. The results of this study indicate that there were many challenges due to the complexity of language. Further, this study recommends avoiding a simplistic approach to punctuation normalization.
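The abstract names the LI and LO procedures without defining them, so the following is only a sketch under one possible reading: LI applies the same rule to every punctuation mark regardless of the word it occurs in, while LO consults the lexical item first and leaves certain terms untouched. The function names, the `protected` lexicon, and the sample sentence are hypothetical and do not come from the study.

```python
import re

# Anything that is neither a word character nor whitespace counts as punctuation here.
PUNCT = re.compile(r"[^\w\s]")


def normalize_li(text):
    """Lexical-independent reading (assumed): replace every punctuation
    mark with a space, then collapse runs of whitespace."""
    return " ".join(PUNCT.sub(" ", text).split())


def normalize_lo(text, protected):
    """Lexical-oriented reading (assumed): keep tokens listed in a
    protected lexicon intact and normalize all other tokens."""
    out = []
    for token in text.split():
        if token.lower() in protected:
            out.append(token)
        else:
            out.extend(PUNCT.sub(" ", token).split())
    return " ".join(out)


sample = "C++ and U.S. markets, re-opened!"
li = normalize_li(sample)
lo = normalize_lo(sample, protected={"c++", "u.s."})
print(li)  # C and U S markets re opened
print(lo)  # C++ and U.S. markets re opened
```

Under this reading, a query for "C++" can only match if indexing preserved the term, which illustrates the kind of retrievability effect the abstract attributes to the choice of normalization procedure.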
|
8 |
A planning approach to migrating domain-specific legacy systems into service oriented architecture
Zhang, Zhuo January 2012
The planning work prior to implementing an SOA migration project is very important for its success. Up to now, most of this work has been done manually. An SOA migration planning approach based on intelligent information processing methods is proposed to semi-automate the manual work. This thesis investigates the principal research question: 'How can we obtain SOA migration planning schemas (semi-) automatically instead of by traditional manual work in order to determine if legacy software systems should be migrated to SOA computation environment?'. The controlled experiment research method has been adopted to direct the research throughout the thesis. Data mining methods are used to analyse SOA migration sources and migration targets. The mined information supplements the results of traditional analysis. Text similarity measurement methods are used to measure the matching relationship between migration sources and migration targets, providing a quantitative analysis of matching relationships in place of the usual qualitative analysis. Concretely, an association rule mining algorithm and a sequence pattern mining algorithm are proposed to analyse legacy assets and domain logics for establishing a Service model and a Component model. These two algorithms can mine all motifs with any min-support number without assuming any ordering, and are reported to perform better than existing algorithms for establishing Service models and Component models in SOA migration situations. Two matching strategies, based on the keyword level and the superficial semantic level, are described; they can calculate the degree of similarity between legacy components and domain services effectively. Two decision-making methods, based on a similarity matrix and on hybrid information, are investigated for creating SOA migration planning schemas. Finally, a simple evaluation method is described. Two case studies on migrating e-learning legacy systems to SOA have been explored. The results show that the proposed approach is encouraging and applicable. Therefore, SOA migration planning schemas can be created semi-automatically, instead of by traditional manual work, by using data mining and text similarity measurement methods.
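The keyword-level matching and similarity-matrix decision steps mentioned in this abstract can be sketched as follows. The abstract does not name the similarity measure, so Jaccard overlap of description keywords is swapped in as an assumption; the component and service names, their descriptions, and the `threshold` parameter are hypothetical.

```python
def jaccard(a, b):
    """Keyword-level similarity: overlap of the two descriptions' word sets."""
    a, b = set(a.lower().split()), set(b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(components, services):
    """Similarity of every legacy component to every domain service."""
    return {
        comp: {svc: jaccard(cdesc, sdesc) for svc, sdesc in services.items()}
        for comp, cdesc in components.items()
    }


def best_matches(matrix, threshold=0.2):
    """Map each component to its most similar service, or to None when
    even the best score falls below the (assumed) threshold."""
    plan = {}
    for comp, row in matrix.items():
        service, score = max(row.items(), key=lambda kv: kv[1])
        plan[comp] = service if score >= threshold else None
    return plan


# Toy e-learning data, invented for illustration only.
components = {
    "GradeBook": "store student grades course",
    "Mailer": "send smtp message",
}
services = {
    "AssessmentService": "record student grades course results",
    "EnrolmentService": "register student course",
}
matrix = similarity_matrix(components, services)
plan = best_matches(matrix)
print(plan)
```

A component mapped to `None` has no sufficiently similar target service, which in a planning schema might mark it as a candidate for redesign rather than direct migration.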
|
9 |
Sentiment analysis within and across social media streams
Mejova, Yelena Aleksandrovna 01 May 2012
Social media offers a powerful outlet for people's thoughts and feelings: it is an enormous, ever-growing source of texts ranging from everyday observations to involved discussions. This thesis contributes to the field of sentiment analysis, which aims to extract emotions and opinions from text. A basic goal is to classify text as expressing either positive or negative emotion. Sentiment classifiers have been built for social media text such as product reviews, blog posts, and even Twitter messages. With the increasing complexity of text sources and topics, it is time to re-examine the standard sentiment extraction approaches, and possibly to re-define and enrich the definition of sentiment. Thus, this thesis begins by introducing a rich multi-dimensional model based on Affect Control Theory and showing its usefulness in sentiment classification. Next, unlike sentiment analysis research to date, we examine sentiment expression and polarity classification within and across various social media streams by building topical datasets. When comparing Twitter, reviews, and blogs on consumer product topics, we show that it is possible, and sometimes even beneficial, to train sentiment classifiers on text sources which are different from the target text. This is not the case, however, when we compare political discussion in YouTube comments to Twitter posts, demonstrating the difficulty of political sentiment classification. We further show that neither discussion volume nor sentiment expressed in these streams corresponds well to national polls, calling into question recent research linking the two. The complexity of political discussion also calls for a more specific re-definition of "sentiment" as agreement with the author's political stance. We conclude that sentiment must be defined, and tools for its analysis designed, within a larger framework of human interaction.
|
10 |
SIGNATURE FILES FOR DOCUMENT MANAGEMENT
ABEYSINGHE, RUVINI PRADEEPA 11 October 2001
No description available.
|