Global ETD Search

11	Inductive Query by Examples (IQBE): A Machine Learning Approach Chen, Hsinchun, She, Linlin January 1994 Artificial Intelligence Lab, Department of MIS, University of Arizona / This paper presents an incremental, inductive learning approach to query-by-examples for information retrieval (IR) and database management systems (DBMS). After briefly reviewing conventional information retrieval techniques and the prevailing database query paradigms, we introduce the ID5R algorithm, previously developed by Utgoff, for “intelligent” and system-supported query processing. We describe in detail how we adapted the ID5R algorithm for IR/DBMS applications and we present two examples, one for IR applications and the other for DBMS applications, to demonstrate the feasibility of the approach. Using a larger test collection of about 1000 document records from the COMPEN CD-ROM computing literature database and using recall as a performance measure, our experiment showed that the incremental ID5R performed significantly better than a batch inductive learning algorithm (called ID3) which we developed earlier. Both algorithms, however, were robust and efficient in helping users develop abstract queries from examples. We believe this research has shed light on the feasibility and the novel characteristics of a new query paradigm, namely, inductive query-by-examples (IQBE). Directions of our current research are summarized at the end of the paper. Databases Information Extraction
12	Updateable PAT-Tree Approach to Chinese Key Phrase Extraction using Mutual Information: A Linguistic Foundation for Knowledge Management Ong, Thian-Huat, Chen, Hsinchun January 1999 Artificial Intelligence Lab, Department of MIS, University of Arizona / There has been renewed research interest in using the statistical approach to extraction of key phrases from Chinese documents because existing approaches do not allow online frequency updates after phrases have been extracted. This consequently results in inaccurate, partial extraction. In this paper, we present an updateable PAT-tree approach. In our experiment, we compared our approach with that of Lee-Feng Chien; ours showed an improvement in recall from 0.19 to 0.43 and in precision from 0.52 to 0.70. This paper also reviews the requirements for a data structure that facilitates implementation of any statistical approach to key-phrase extraction, including PAT-tree, PAT-array and suffix array with semi-infinite strings. Knowledge Management Information Extraction
13	Multilingual input system for the Web - an open multimedia approach of keyboard and handwritten recognition for Chinese and Japanese Ramsey, Marshall C., Ong, Thian-Huat, Chen, Hsinchun January 1998 Artificial Intelligence Lab, Department of MIS, University of Arizona / The basic building block of a multilingual information retrieval system is the input system. Chinese and Japanese characters pose great challenges for the conventional 101-key alphabet-based keyboard, because they are radical-based and number in the thousands. This paper reviews the development of various approaches and then presents a framework and working demonstrations of Chinese and Japanese input methods implemented in Java, which allow open deployment over the web to any platform. The demo includes both popular keyboard input methods and neural network handwriting recognition using a mouse or pen. This framework is able to accommodate future extension to other input mediums and languages of interest. Artificial Intelligence Information Extraction
14	Concept-based searching and browsing: a geoscience experiment Hauck, Roslin V., Sewell, Robin R., Ng, Tobun Dorbin, Chen, Hsinchun January 2001 Artificial Intelligence Lab, Department of MIS, University of Arizona / In the recent literature, we have seen the expansion of information retrieval techniques to include a variety of different collections of information. Collections can have certain characteristics that can lead to different results for the various classification techniques. In addition, the ways and reasons that users explore each collection can affect the success of the information retrieval technique. The focus of this research was to extend the application of our statistical and neural network techniques to the domain of geological science information retrieval. For this study, a test bed of 22,636 geoscience abstracts was obtained through the NSF/DARPA/NASA funded Alexandria Digital Library Initiative project at the University of California at Santa Barbara. This collection was analyzed using algorithms previously developed by our research group: a concept space algorithm for searching and a Kohonen self-organizing map (SOM) algorithm for browsing. Included in this paper are discussions of our techniques, user evaluations and lessons learned. Digital Libraries Information Extraction
15	DGPort: A Web Portal for Digital Government Yin, C.Q., Nickels, L.D., Chen, C.Z., Ng, Gavin, Chen, Hsinchun January 2003 Artificial Intelligence Lab, Department of MIS, University of Arizona / This paper provides a summary of the initial development of a Web portal for the digital government domain. Information retrieval techniques commonly used to find information on the Internet are discussed along with the problems associated with these techniques that led to the development of the Digital Government Web portal (DGPort). We also discuss the advantages that DGPort could have for researchers in the digital government domain as well as the value-added features that this portal provides. Future evaluation plans for the portal are also described. Internet Information Extraction
16	An Issues Identifier for Online Financial Databases Yen, J., Chen, Hsinchun, Ma, P., Bui, T. January 1995 Artificial Intelligence Lab, Department of MIS, University of Arizona / A major problem that decision makers are facing in an information-rich society is how to absorb, filter and make effective use of available data. The problem caused by information overflow could lead to a loss of competitiveness. This paper presents a knowledge-based approach to building an issues identifier to help investors overcome information overflow problems when dealing with very large on-line financial databases. The proposed software system is able to extract critical issues from the on-line financial databases. The system was developed based on a number of techniques: automatic indexing, concept space generation, and neural network classification. In this paper, we describe how these techniques are used to extract subject descriptors, their semantic relationships, and the related texts (documents or paragraphs) to each descriptor. The proposed system has been tested with the annual reports from thirteen of the largest international banks. Databases Information Extraction Classification
17	Semantic Retrieval for the NCSA Mosaic Chen, Hsinchun, Schatz, Bruce R. January 1994 Artificial Intelligence Lab, Department of MIS, University of Arizona / In this paper we report an automatic and scalable concept space approach to enhancing the deep searching capability of the NCSA Mosaic. The research, which is based on the findings from a previous NSF National Collaboratory project and which will be expanded in a new Illinois NSF/ARPA/NASA Digital Library project, centers around semantic retrieval and user customization. Semantic retrieval supports a higher level of abstraction in user search, which can overcome the vocabulary problem for information retrieval. Rather than searching for words within the object space, the search is for terms within a concept space (graph of terms occurring within objects linked to each other by the frequency with which they occur together). Co-occurrence graphs seem to provide good suggestive power in specialized domains, such as biology. By providing a more understandable, system-generated, semantics-rich concept space as an abstraction of the enormously complex object space plus algorithms and interface to assist in object/concept spaces traversal, we believe we can greatly alleviate both information overload and the vocabulary problem of internet services. These techniques will also be used to provide a form of customized retrieval and automatic information routing. Results from past research, the specific algorithms and techniques, and the research plan for enhancing the NCSA Mosaic's search capability in the NSF/ARPA/NASA Digital Library project will be discussed. Digital Libraries Information Extraction
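The concept space in record 17 is described as a graph of terms linked by how often they occur together. A minimal sketch of that idea follows, assuming each document is already reduced to a list of index terms. The function names, the raw co-occurrence count used as the edge weight, and the sample terms are illustrative assumptions, not the authors' algorithm, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Count, for every unordered term pair, the documents containing both.

    `documents` is an iterable of term lists. The result maps a sorted
    (term_a, term_b) tuple to its co-occurrence count (an assumed weighting).
    """
    edges = Counter()
    for terms in documents:
        # Deduplicate terms so a pair counts at most once per document.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


def related_terms(edges, term, top_n=3):
    """Return up to `top_n` neighbours of `term`, strongest link first."""
    scores = Counter()
    for (a, b), count in edges.items():
        if a == term:
            scores[b] += count
        elif b == term:
            scores[a] += count
    return [neighbour for neighbour, _ in scores.most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical biology-flavoured index terms, for illustration only.
    docs = [
        ["gene", "protein", "cell"],
        ["gene", "protein"],
        ["cell", "membrane"],
    ]
    graph = build_cooccurrence_graph(docs)
    print(related_terms(graph, "gene", top_n=1))
```

A search interface built on such a graph could suggest the strongest neighbours of a user's query term, which is one way to read the "suggestive power" the abstract attributes to co-occurrence graphs.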
18	A Graph Model for E-Commerce Recommender Systems Huang, Zan, Chung, Wingyan, Chen, Hsinchun January 2004 Artificial Intelligence Lab, Department of MIS, University of Arizona / Information overload on the Web has created enormous challenges to customers selecting products for online purchases and to online businesses attempting to identify customers’ preferences efficiently. Various recommender systems employing different data representations and recommendation methods are currently used to address these challenges. In this research, we developed a graph model that provides a generic data representation and can support different recommendation methods. To demonstrate its usefulness and flexibility, we developed three recommendation methods: direct retrieval, association mining, and high-degree association retrieval. We used a data set from an online bookstore as our research test-bed. Evaluation results showed that combining product content information and historical customer transaction information achieved more accurate predictions and relevant recommendations than using only collaborative information. However, comparisons among different methods showed that high-degree association retrieval did not perform significantly better than the association mining method or the direct retrieval method in our test-bed. Data Mining Information Extraction
19	A Path to Concept-based Information Access: From National Collaboratories to Digital Libraries Houston, Andrea L., Chen, Hsinchun January 2000 Artificial Intelligence Lab, Department of MIS, University of Arizona / This research aims to provide a semantic, concept-based retrieval option that could supplement existing information retrieval options. Our proposed approach is based on textual analysis of a large corpus of domain-specific documents in order to generate a large set of subject vocabularies. By adopting cluster analysis techniques to analyze the co-occurrence probabilities of the subject vocabularies, a similarity matrix of vocabularies can be built to represent the important concepts and their weighted “relevance” relationships in the subject domain. To create a network of concepts, which we refer to as the “concept space” for the subject domain, we propose to develop general AI-based graph traversal algorithms and graph matching algorithms to automatically translate a searcher’s preferred vocabularies into a set of the most semantically relevant terms in the database’s underlying subject domain. By providing a more understandable, system-generated, semantics-rich concept space plus algorithms to assist in concept/information spaces traversal, we believe we can greatly alleviate both information overload and the vocabulary problem. In this chapter, we first review our concept space approach and the associated algorithms in Section 2. In Section 3, we describe our experience in using such an approach. In Section 4, we summarize our research findings and our plan for building a semantics-rich Interspace for the Illinois Digital Library project. Digital Libraries Information Extraction
20	Knowledge-Based Document Retrieval: Framework and Design Chen, Hsinchun 06 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / This article presents research on the design of knowledge-based document retrieval systems. We adopted a semantic network structure to represent subject knowledge and classification scheme knowledge and modeled experts' search strategies and user modeling capability as procedural knowledge. These functionalities were incorporated into a prototype knowledge-based retrieval system, Metacat. Our system, the design of which was based on the blackboard architecture, was able to create a user profile, identify task requirements, suggest heuristics-based search strategies, perform semantic-based search assistance, and assist online query refinement. Artificial Intelligence Information Extraction
