41 |
SUPPORTING DOMAIN SPECIFIC WEB-BASED SEARCH USING HEURISTIC KNOWLEDGE EXTRACTION
Gunanathan, Sudharsan 16 January 2010
Modern search engines like Google support domain-independent search over the
vast information contained in web documents. However, domain-specific information
access, such as finding less well-known people, locations, and events, is not performed
efficiently unless users develop sophisticated query strategies. This thesis describes
the design and development of an application to support one such domain-specific
information activity: for insurance (and related) companies to identify weather and
natural disaster damage to better assess when and where personnel will be needed. The
approach presented for supporting this activity combines information extraction with an
interactive presentation of results. Previous domain-specific search engines extract
information about papers, people, and courses using rule-based or learning-based
techniques. However, they present the results of information extraction in a typical
query-and-result-list interface. They fail to address the need for interaction based on
the extracted document features. The domain-specific web-based search application
developed in this project combines information extraction with the interactive display of
results to facilitate rapid information location. A heuristic evaluation was performed to
determine whether the application met the design goals and to improve the design.
The final application therefore has an unconventional but interactive presentation of
the results using a tree-based display. The application also offers options for
user-specific results caching and for modifying the search and caching process. Through a
heuristic-based search process, it extracts information about the place, date, and damages
of a specific disaster using a bank of search heuristics developed in this project.
|
42 |
Semantic Issues for Digital Libraries
Chen, Hsinchun January 2000
Artificial Intelligence Lab, Department of MIS, University of Arizona / As new and emerging classes of information systems applications become more overwhelming, pressing, and diverse, several well-known information retrieval (IR) problems have become even more urgent in this “network-centric” information age. Information overload, a result of the ease of information creation and rendering via the Internet and the World Wide Web, has become more evident in people’s lives. Significant variations of database formats and structures, the richness of information media, and an abundance of multilingual information content also have created severe information interoperability problems: structural interoperability, media interoperability, and multilingual interoperability. The conventional approaches to addressing information overload and information interoperability problems are manual in nature, requiring human experts as information intermediaries to create knowledge structures and/or ontologies. As information content and collections become even larger and more dynamic, we believe a system-aided bottom-up artificial intelligence (AI) approach is needed. By applying scalable techniques developed in various AI subareas such as image segmentation and indexing, voice recognition, natural language processing, neural networks, machine learning, clustering and categorization, and intelligent agents, we can provide an alternative system-aided approach to addressing both information overload and information interoperability.
|
43 |
An intelligent personal spider (agent) for dynamic Internet/Intranet searching
Chen, Hsinchun, Chung, Yi-Ming, Ramsey, Marshall C., Yang, Christopher C. 05 1900
Artificial Intelligence Lab, Department of MIS, University of Arizona / As Internet services based on the World-Wide Web become more popular, information overload has become a pressing research problem. Difficulties with search on the Internet will worsen as the amount of on-line information increases. A scalable approach to Internet search is critical to the success of Internet services and other current and future National Information Infrastructure (NII) applications. As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent personal spider (agent) approach to Internet searching. The approach, which is grounded in automatic textual analysis and general-purpose search algorithms, is expected to be an improvement over the current static and inefficient Internet searches. In this experiment, we implemented Internet personal spiders based on best-first search and genetic algorithm techniques. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A plain, static CGI/HTML-based interface was developed earlier, followed by a recent enhancement of a graphical, dynamic Java-based interface. Preliminary evaluation results and two working prototypes (available for Web access) are presented. Although the examples and evaluations presented are mainly based on Internet applications, the applicability of the proposed techniques to the potentially more rewarding Intranet applications should be obvious. In particular, we believe the proposed agent design can be used to locate organization-wide information, to gather new, time-critical organizational information, and to support team-building and communication in Intranets.
|
44 |
Communication roles that support collaboration during the design process
Sonnenwald, Diane H. 07 1900
It is widely acknowledged that design (and development) teams increasingly include participants from different domains who must explore and integrate their specialized knowledge in order to create innovative and competitive artefacts and reduce design and development costs. Thus communication, integration of specialized knowledge, and negotiation of differences among domain specialists have emerged as fundamental components of the design process. This paper presents thirteen communication roles that emerged during four multi-disciplinary design situations in the USA and Europe. These roles supported knowledge exploration and integration, collaboration, and task and project completion by filtering and providing information and negotiating differences across organizational, task, discipline, and personal boundaries. Implications for design methods, tools and education are discussed.
|
45 |
A Smart Itsy Bitsy Spider for the Web
Chen, Hsinchun, Chung, Yi-Ming, Ramsey, Marshall C., Yang, Christopher C. January 1998
Artificial Intelligence Lab, Department of MIS, University of Arizona / As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent agent approach to Web searching. In this experiment, we developed two Web personal spiders based on best-first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-based spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best-first search spider, we found both results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained a significantly higher recall value than the best-first search spider. However, their precision values were not statistically different. The mutation process introduced in the genetic algorithm allows users to find other potentially relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component for the design of a truly interactive and dynamic Web agent.
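The best-first strategy mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory link graph `LINKS`, the keyword index `KEYWORDS`, the function names, and the use of Jaccard keyword overlap as the relevance score are all assumptions made for illustration only.

```python
import heapq

def jaccard(a, b):
    """Keyword overlap between two keyword collections (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def best_first_spider(start_pages, links, keywords, max_pages=10):
    """Visit linked pages in order of similarity to the starting pages.

    links:    dict mapping page -> list of pages it links to (hypothetical data)
    keywords: dict mapping page -> iterable of index keywords (hypothetical data)
    Returns a list of (page, score) in visit order, excluding the start pages.
    """
    # Pool the keywords of the user's starting pages into one profile.
    profile = set()
    for page in start_pages:
        profile |= set(keywords.get(page, ()))

    seen = set(start_pages)
    frontier = []  # min-heap of (-score, page), so the best score pops first

    def push_links(page):
        # Score each not-yet-seen outgoing link once and queue it.
        for nxt in links.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                score = jaccard(profile, keywords.get(nxt, ()))
                heapq.heappush(frontier, (-score, nxt))

    for page in start_pages:
        push_links(page)

    results = []
    while frontier and len(results) < max_pages:
        neg_score, page = heapq.heappop(frontier)
        results.append((page, -neg_score))
        push_links(page)  # expand the most promising candidate so far
    return results

# Hypothetical toy data standing in for crawled homepages.
LINKS = {"home": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": []}
KEYWORDS = {
    "home": ["spider", "web", "search"],
    "a": ["spider", "web"],
    "b": ["cooking"],
    "c": ["spider", "web", "search"],
    "d": ["recipes"],
}

ranked = best_first_spider(["home"], LINKS, KEYWORDS)
```

A priority queue keeps the sketch short; a genetic algorithm variant would replace the deterministic "expand the best candidate" step with selection, crossover, and mutation over a population of candidate pages.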
|
46 |
Collaborative Information Retrieval Environment: Integration of Information Retrieval with Group Support Systems
Romano, Nicholas C., Roussinov, Dmitri G., Nunamaker, Jay F., Chen, Hsinchun January 1999
Artificial Intelligence Lab, Department of MIS, University of Arizona / Observations of Information Retrieval (IR) system user
experiences reveal a strong desire for collaborative search
while at the same time suggesting that collaborative
capabilities are rarely, and then only in a limited fashion,
supported by current searching and visualization tools.
Equally interesting is the fact that observations of user
experiences with Group Support Systems (GSS) reveal that
although access to external information and the ability to
search for relevant material is often vital to the progress of
GSS sessions, integrated support for collaborative searching
and visualization of results is lacking in GSS systems. After
reviewing both user experiences described in IR and GSS
literature and observing and interviewing users of existing
IR and GSS commercial and prototype systems, the authors
conclude that there is an obvious demand for systems
supporting multi-user IR. It is surprising to the authors that
very little attention has been given to the common ground
shared by these two important research domains. With this
in mind, our paper describes how user experiences with IR
and GSS systems have shed light on a promising new area of
collaborative research and led to the development of a
prototype that merges the two paradigms into a
Collaborative Information Retrieval Environment (CIRE).
Finally the paper presents theory developed from initial user
experiences with our prototype and describes plans to test
the efficacy of this new paradigm empirically through
controlled experimentation.
|
47 |
The Illinois Digital Library Initiative Project: Federating Repositories and Semantic Research
Chen, Hsinchun January 2001
Artificial Intelligence Lab, Department of MIS, University of Arizona / The Illinois DLI Project, one of six projects funded by the NSF/DARPA/NASA DLI, consists of two major components: (1) a production testbed based in a real library (SGML publisher stream deployed at the University of Illinois at Urbana-Champaign, UIUC) and (2) fundamental technology research for semantic interoperability (semantic indexes across subjects and media developed at the University of Arizona). The Illinois DLI production testbed was developed in the Grainger Engineering library at UIUC. It supports full SGML federated structure search on an experimental Web-based interface. The initial rollout was available at the UIUC campus in October 1997 and has been integrated with the library information services. The testbed consists of materials from 5 publishers, 55 engineering journals, and 40,000 full-text articles. The testbed was implemented using SoftQuad (SGML rendering) and OpenText (full-text search), both of which are commercial software.
|
48 |
Genescene: Biomedical Text And Data Mining
Leroy, Gondy, Chen, Hsinchun, Martinez, Jesse D., Eggers, Shauna, Falsey, Ryan R., Kislin, Kerri L., Huang, Zan, Li, Jiexun, Xu, Jie, McDonald, Daniel M., Ng, Gavin January 2005
Artificial Intelligence Lab, Department of MIS, University of Arizona / To access the content of digital texts efficiently, it is
necessary to provide more sophisticated access than
keyword-based searching. Genescene provides biomedical
researchers with research findings and background
relations automatically extracted from text and
experimental data. These provide a more detailed
overview of the information available. The extracted
relations were evaluated by qualified researchers and are
precise. An ongoing qualitative evaluation of the current
online interface indicates that this method of searching the
literature is more useful and efficient than keyword-based
searching.
|
49 |
Internet Categorization and Search: A Self-Organizing Approach
Chen, Hsinchun, Schuffels, Chris, Orwig, Richard E. January 1996
Artificial Intelligence Lab, Department of MIS, University of Arizona / The problems of information overload and vocabulary differences have become more pressing with the emergence of increasingly popular Internet services. The main information retrieval mechanisms provided by the prevailing Internet WWW software are based on either keyword search (e.g., the Lycos server at CMU, the Yahoo server at Stanford) or hypertext browsing (e.g., Mosaic and Netscape). This research aims to provide an alternative concept-based categorization and search capability for WWW servers based on selected machine learning algorithms. Our proposed approach, which is grounded in automatic textual analysis of Internet documents (homepages), attempts to address the Internet search problem by first categorizing the content of Internet documents. We report results of our recent testing of a multilayered neural network clustering algorithm employing the Kohonen self-organizing feature map to categorize (classify) Internet homepages according to their content. The category hierarchies created could serve to partition the vast Internet services into subject-specific categories and databases and improve Internet keyword searching and/or browsing.
|
50 |
Automatic multi-document summarization for digital libraries
Ou, Shiyan, Khoo, Christopher S.G., Goh, Dion H. January 2006
With the rapid growth of the World Wide Web and online information services, more and more information is available and accessible online. Automatic summarization is an indispensable solution to reduce the information overload problem. Multi-document summarization is useful to provide an overview of a topic and allow users to zoom in for more details on aspects of interest. This paper reports three types of multi-document summaries generated for a set of research abstracts, using different summarization approaches: a sentence-based summary generated by a MEAD summarization system that extracts important sentences using various features, another sentence-based summary generated by extracting research objective sentences, and a variable-based summary focusing on research concepts and relationships. A user evaluation was carried out to compare the three types of summaries. The evaluation results indicated that the majority of users (70%) preferred the variable-based summary, while 55% of the users preferred the research objective summary, and only 25% preferred the MEAD summary.
|