31

Online Query Refinement on Information Retrieval Systems: A Process Model of Searched System Interactions

Chen, Hsinchun, Dhar, Vasant January 1990 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / This article reports findings of empirical research that investigated information searchers' online query refinement process. Prior studies have recognized the information specialists' role in helping searchers articulate and refine queries. Using a semantic network and a Problem Behavior Graph to represent the online search, our study revealed that searchers also refined their own queries in an online task environment. The information retrieval system played a passive role in assisting online query refinement, yet one that confirmed Taylor's four-level query formulation model. Based on our empirical findings, we proposed using a process model to facilitate and improve query refinement in an online environment. We believe incorporating this model into retrieval systems can result in the design of more "intelligent" and useful information retrieval systems.
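Taylor's four levels (visceral, conscious, formalized, and compromised need) describe how an information need hardens into a system-ready query. As one speculative reading of the kind of representation the study uses, the sketch below records refinement steps as a small Problem Behavior Graph; the node and operator structure here is an illustrative assumption, not the article's actual encoding.

```python
# A speculative sketch, not the article's encoding: a searcher's
# refinement steps as a Problem Behavior Graph whose nodes are query
# states tagged with a Taylor level and whose edges are refinement
# operators. All names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class QueryState:
    terms: frozenset   # search terms active at this state
    level: str         # Taylor level: visceral/conscious/formalized/compromised

@dataclass
class ProblemBehaviorGraph:
    states: list = field(default_factory=list)
    edges: list = field(default_factory=list)   # (from_idx, to_idx, operator)

    def add_state(self, terms, level):
        self.states.append(QueryState(frozenset(terms), level))
        return len(self.states) - 1

    def refine(self, from_idx, terms, level, operator):
        """Record one refinement step, e.g. 'narrow', 'broaden', 'substitute'."""
        to_idx = self.add_state(terms, level)
        self.edges.append((from_idx, to_idx, operator))
        return to_idx

pbg = ProblemBehaviorGraph()
s0 = pbg.add_state({"retrieval systems"}, "conscious")
s1 = pbg.refine(s0, {"retrieval systems", "query refinement"}, "formalized", "narrow")
print(pbg.edges)   # [(0, 1, 'narrow')]
```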
32

Comparison of Two Approaches to Building a Vertical Search Tool: A Case Study in the Nanotechnology Domain

Chau, Michael, Chen, Hsinchun, Qin, Jialun, Zhou, Yilu, Qin, Yi, Sung, Wai-Ki, McDonald, Daniel M. January 2002 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / As the Web has been growing exponentially, it has become increasingly difficult to search for desired information. In recent years, many domain-specific (vertical) search tools have been developed to serve the information needs of specific fields. This paper describes two approaches to building a domain-specific search tool. We report our experience in building two different tools in the nanotechnology domain - (1) a server-side search engine, and (2) a client-side search agent. The designs of the two search systems are presented and discussed, and their strengths and weaknesses are compared. Some future research directions are also discussed.
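To make the server-side/client-side contrast concrete, here is a hedged sketch of a client-side search agent of the general kind contrasted in the paper: it fetches candidate pages on demand and keeps those matching domain keywords. The keyword list, threshold, and helper names are illustrative assumptions, not details from the paper.

```python
# A minimal client-side search agent sketch: fetch pages at query time
# and filter them on the client by domain-keyword hits. Keywords and the
# min_hits threshold are toy assumptions.
import urllib.request

DOMAIN_KEYWORDS = {"nanotechnology", "nanotube", "nanoscale"}

def fetch(url, timeout=10):
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def looks_relevant(page_text, keywords=DOMAIN_KEYWORDS, min_hits=2):
    text = page_text.lower()
    return sum(text.count(k) for k in keywords) >= min_hits

def search_agent(seed_urls):
    """Filter pages on the client, trading query latency for freshness."""
    return [url for url in seed_urls if looks_relevant(fetch(url))]
```

The trade-off the paper examines falls out of where this loop runs: a server-side engine indexes ahead of time and answers quickly from possibly stale data, while an agent like this searches live pages at query time.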
33

A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community System

Chen, Hsinchun, Martinez, Joanne, Ng, Tobun Dorbin, Schatz, Bruce R. 01 1900 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / This research presents an algorithmic approach to addressing the vocabulary problem in scientific information retrieval and information sharing, using the molecular biology domain as an example. We first present a literature review of cognitive studies related to the vocabulary problem and vocabulary-based search aids (thesauri) and then discuss techniques for building robust and domain-specific thesauri to assist in cross-domain scientific information retrieval. Using a variation of automatic thesaurus generation techniques, which we refer to as the concept space approach, we recently conducted an experiment in the molecular biology domain in which we created a C. elegans worm thesaurus of 7,657 worm-specific terms and a Drosophila fly thesaurus of 15,626 terms. About 30% of these terms overlapped, which created vocabulary paths from one subject domain to the other. Based on a cognitive study of term association involving four biologists, we found that a large percentage (59.6-85.6%) of the terms suggested by the subjects were identified in the conjoined fly-worm thesaurus. However, we found only a small percentage (8.4-18.1%) of the associations suggested by the subjects in the thesaurus. In a follow-up document retrieval study involving eight fly biologists, an actual worm database (Worm Community System), and the conjoined fly-worm thesaurus, subjects were able to find more relevant documents (an increase from about 9 documents to 20) and to improve the document recall level (from 32.41% to 65.28%) when using the thesaurus, although the precision level did not improve significantly. Implications of adopting the concept space approach for addressing the vocabulary problem in Internet and digital library applications are also discussed.
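The concept space approach rests on an automatically computed term association weight. A minimal sketch, assuming a plain conditional co-occurrence ratio over toy documents; the actual system additionally weights terms by frequency and specificity before ranking:

```python
# A simplified stand-in for the concept space cluster function: rank
# candidate thesaurus terms for a given term by the fraction of its
# documents in which the candidate also occurs. Documents are toy sets.
from collections import defaultdict

docs = [
    {"unc-4", "motor neuron", "c. elegans"},
    {"unc-4", "homeobox", "c. elegans"},
    {"homeobox", "drosophila", "segmentation"},
]

def association_weights(docs):
    df = defaultdict(int)   # document frequency per term
    co = defaultdict(int)   # co-document frequency per ordered term pair
    for terms in docs:
        for t in terms:
            df[t] += 1
        for a in terms:
            for b in terms:
                if a != b:
                    co[(a, b)] += 1
    # weight(a -> b): fraction of a's documents that also contain b
    return {(a, b): n / df[a] for (a, b), n in co.items()}

weights = association_weights(docs)
suggestions = sorted(((w, b) for (a, b), w in weights.items() if a == "unc-4"),
                     reverse=True)
print(suggestions)   # [(1.0, 'c. elegans'), (0.5, ...), (0.5, ...)]
```

Because the weight is asymmetric (normalized by the source term's document frequency), a rare term can associate strongly with a common one without the reverse holding, which is what lets the conjoined thesaurus open vocabulary paths from one domain into the other.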
34

Electronic Commerce and Digital Libraries

Houston, Andrea L., Chen, Hsinchun January 2000 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / In this chapter we discuss digital libraries from an electronic commerce perspective, focusing on what the two have in common. The first section is an introduction that discusses some of the impacts that digital libraries and electronic commerce have had on our lives. The second section discusses common driving forces behind the two. The next section discusses common challenges, with an emphasis on the digital library perspective. The fourth section discusses several common issues, in particular the social, legal, quality, security, and economic issues that both digital libraries and electronic commerce must address; the discussion there primarily presents a digital library perspective, although the issues are important to both. Finally, the chapter closes with a conclusion.
35

Automated knowledge extraction from text

Bowden, Paul Richard January 1999 (has links)
No description available.
36

The generation of compound nominals to represent the essence of text : the COMMIX system

Norris, Jennifer Vivien January 1998 (has links)
This thesis concerns the COMMIX system, which automatically extracts information on what a text is about, and generates that information in the highly compacted form of compound nominal expressions. The expressions generated are complex and may include novel terms which do not appear themselves in the input text. From the practical point of view, the work is driven by the need for better representations of content: for representations which are shorter and more concise than would appear in an abstract, yet more informative and representative of the actual aboutness than commonly occurs in indexing expressions and key terms. This additional layer of representation is referred to in this work as pertaining to the essence of a particular text. From a theoretical standpoint, the thesis shows how the compound nominal as a construct can be successfully employed in these highly informative representations. It involves an exploration of the claim that there is sufficient semantic information contained within the standard dictionary glosses for individual words to enable the construction of useful and highly representative novel compound nominal expressions, without recourse to standard syntactic and statistical methods. It shows how a shallow semantic approach to content identification which is based on lexical overlap can produce some very encouraging results. The methodology employed, and described herein, is domain-independent, and does not require the specification of templates with which the input text must comply. In these two respects, the methodology developed in this work avoids two of the most common problems associated with information extraction. As regards the evaluation of this type of work, the thesis introduces and utilises the notion of percentage attainment value, which is used in conjunction with subjects' opinions about the degree to which the aboutness terms succeed in indicating the subject matter of the texts for which they were generated.
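As one way to see the lexical-overlap idea at work, the sketch below scores candidate head terms for a text by the overlap between their dictionary glosses and the text's content words. The glosses and stopword list are toy assumptions; COMMIX works from full dictionary entries and generates multi-word compound nominals rather than single heads.

```python
# A minimal sketch of gloss-overlap scoring, under the toy assumptions
# stated above: the candidate whose dictionary gloss shares the most
# content words with the text is taken to indicate its aboutness.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in", "for", "that", "is"}

def content_words(text):
    return {w.strip(".,;").lower() for w in text.split()} - STOPWORDS

glosses = {   # toy stand-ins for real dictionary glosses
    "extraction": "the act of drawing out or pulling information from a source",
    "compression": "the act of reducing something to a smaller size",
}

def gloss_overlap(candidate, text):
    return len(content_words(glosses[candidate]) & content_words(text))

text = "The system draws out information about what a source text is about."
print(max(glosses, key=lambda c: gloss_overlap(c, text)))   # -> 'extraction'
```

Note what this buys: "extraction" wins even though the word never appears in the input, which is exactly the thesis's point that novel, highly representative terms can be recovered from gloss semantics alone, without syntactic or statistical machinery.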
37

Einsatz von Text Mining zur Prognose kurzfristiger Trends von Aktienkursen nach der Publikation von Unternehmensnachrichten [Use of Text Mining to Forecast Short-Term Stock Price Trends after the Publication of Corporate News]

Mittermayer, Marc-André January 2005 (has links)
Also published as doctoral dissertation, University of Bern, 2005.
38

A Hybrid Approach for Ontology-based Information Extraction

Gutierrez, Fernando 23 February 2016 (has links)
Information extraction (IE) is the process of automatically transforming written natural language (i.e., text) into structured information, such as a knowledge base. Because natural language is inherently ambiguous, this transformation process is highly complex. Moreover, as information extraction moves from the analysis of scientific documents to the analysis of Internet textual content, we cannot rely completely on the assumption that the content of the text is correct. In contrast to scientific documents, which are peer reviewed, Internet content is not verified for quality and correctness. Thus, two main issues affect the IE process: the complexity of the extraction process and the quality of the data. In this dissertation, we propose an improved ontology-based IE (OBIE) approach that provides solutions to these issues of accuracy and content quality. Based on a hybrid strategy that combines aspects of IE usually considered opposites, or usually not considered at all, we intend to improve IE through more accurate extraction and new functionality (semantic error detection). Our approach is based on OBIE, a sub-area of IE that reduces extraction complexity by including domain knowledge, in the form of concepts and relationships of the domain, to guide the extraction process. We address the complexity of extraction by combining information extractors that have different implementations. By integrating different types of implementation into one extraction system, we can produce a more accurate extraction: for each concept or relationship in the ontology, we can select the best implementation, or combine implementations under an ensemble learning schema. In tandem, we address the quality of information by determining its semantic correctness with regard to domain knowledge. We define two methods for semantic error detection: predefining the types of errors expected in the text, or applying logic reasoning to the text. This dissertation includes both published and unpublished coauthored material.
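A hedged sketch of the hybrid combination step follows, with two trivially simple placeholder extractors for a single "date" concept standing in for the dissertation's differently implemented extractors; agreement voting stands in for the ensemble learning schema.

```python
# Two independently implemented extractors for one ontology concept,
# combined by agreement voting. The extractors and the min_votes rule
# are illustrative placeholders, not the dissertation's components.
import re

def rule_based_dates(text):
    """Pattern-based implementation of a 'date' extractor."""
    return set(re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text))

def lexicon_based_dates(text):
    """A second implementation: lookup against a (toy) known-value lexicon."""
    known = {"2016-02-23"}
    return known & {tok.strip(".,;") for tok in text.split()}

def ensemble_extract(text, extractors, min_votes=2):
    votes = {}
    for extractor in extractors:
        for value in extractor(text):
            votes[value] = votes.get(value, 0) + 1
    return {value for value, n in votes.items() if n >= min_votes}

text = "Defended 2016-02-23; a draft circulated 2015-11-01."
print(ensemble_extract(text, [rule_based_dates, lexicon_based_dates]))
# -> {'2016-02-23'}: only the value both implementations agree on survives
```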
39

Automating the Extraction of Domain-Specific Information from the Web-A Case Study for the Genealogical Domain

Walker, Troy L. 23 November 2004 (has links) (PDF)
Current ways of finding genealogical information within the millions of pages on the Web are inadequate. In an effort to help genealogical researchers find desired information more quickly, we have developed GeneTIQS, a Genealogy Target-based Information Query System. GeneTIQS builds on ontology-based methods of data extraction to allow database-style queries on the Web. This thesis makes two main contributions to GeneTIQS. (1) It builds a framework to do generic ontology-based data extraction. (2) It develops a hybrid record separator based on Vector Space Modeling that uses both formatting clues and data clues to split pages into component records. The record separator allows GeneTIQS to extract data from the complex documents common in genealogy. Experiments show that this approach yields 92% recall and 93% precision on documents from the Web.
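The record separator's key move is representing candidate blocks by both formatting clues and data clues in one vector space. A minimal sketch, assuming HTML tag counts as the formatting features and a toy genealogical keyword list as the data features; the thesis's actual feature set and threshold are richer.

```python
# A VSM-style record separator sketch: featurize each block, then start
# a new record wherever a block is cosine-similar to a start-of-record
# profile. Features, threshold, and sample blocks are toy assumptions.
import math
import re
from collections import Counter

DATA_CLUES = {"born", "died", "married", "b.", "d."}

def features(block):
    tags = re.findall(r"</?(\w+)", block.lower())      # formatting clues
    words = re.findall(r"[a-z.]+", block.lower())       # data clues
    vec = Counter("tag:" + t for t in tags)
    vec.update("kw:" + w for w in words if w in DATA_CLUES)
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def split_records(blocks, start_profile, threshold=0.5):
    records, current = [], []
    for block in blocks:
        if current and cosine(features(block), start_profile) >= threshold:
            records.append(current)   # block looks like a record start
            current = []
        current.append(block)
    if current:
        records.append(current)
    return records

blocks = ["<hr><b>John Smith</b> born 1820", "married Mary",
          "<hr><b>Ann Smith</b> born 1851"]
profile = features(blocks[0])
print(len(split_records(blocks, profile)))   # -> 2 records
```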
40

Automated Information Extraction to Support Biomedical Decision Model Construction: A Preliminary Design

Li, Xiaoli, Leong, Tze Yun 01 1900 (has links)
We propose an information extraction framework to support automated construction of decision models in biomedicine. Our proposed technique classifies text-based documents from a large biomedical literature repository, e.g., MEDLINE, into predefined categories, and identifies important keywords for each category based on their discriminative power. Relevant documents for each category are retrieved based on the keywords, and a classification algorithm is developed based on machine learning techniques to build the final classifier. We apply the HITS algorithm to select the authoritative and typical documents within a category, and construct templates in the form of Bayesian networks. Data mining and information extraction techniques are then applied to extract the necessary semantic knowledge to fill in the templates to construct the final decision models. / Singapore-MIT Alliance (SMA)
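HITS itself is standard and worth seeing concretely: hub and authority scores are iterated over a directed graph until they stabilize, and the highest-authority nodes are taken as the category's authoritative documents. How the framework builds its document graph (for example, from citation or term links among MEDLINE records) is left open here; the edge list below is a toy assumption.

```python
# A minimal HITS implementation by power iteration. The toy edge list
# stands in for whatever document graph the framework constructs.
def hits(edges, n_iter=50):
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(n_iter):
        auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
        norm = sum(x * x for x in auth.values()) ** 0.5 or 1.0
        auth = {n: x / norm for n, x in auth.items()}
        hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
        norm = sum(x * x for x in hub.values()) ** 0.5 or 1.0
        hub = {n: x / norm for n, x in hub.items()}
    return hub, auth

edges = [("d1", "d3"), ("d2", "d3"), ("d1", "d4")]
hub, auth = hits(edges)
print(max(auth, key=auth.get))   # -> 'd3', the most pointed-to document
```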
