Global ETD Search

41	SUPPORTING DOMAIN SPECIFIC WEB-BASED SEARCH USING HEURISTIC KNOWLEDGE EXTRACTION Gunanathan, Sudharsan 16 January 2010 Modern search engines like Google support domain-independent search over the vast information contained in web documents. However, domain-specific information access, such as finding less well-known people, locations, and events, is not performed efficiently without users developing sophisticated query strategies. This thesis describes the design and development of an application to support one such domain-specific information activity: helping insurance (and related) companies identify weather and natural disaster damage to better assess when and where personnel will be needed. The approach presented for supporting such activity combines information extraction with an interactive presentation of results. Previous domain-specific search engines extract information about papers, people, and courses using rule-based or learning-based techniques. However, they use the results of information extraction in a typical query-and-result-list interface and fail to address the need for interaction based on the extracted document features. The domain-specific web-based search application developed in this project combines information extraction with the interactive display of results to facilitate rapid information location. A heuristic evaluation was performed to determine whether the application met the design goals and to improve the design. The final application therefore has an unconventional but interactive tree-based presentation of the results. The application also offers options for user-specific results caching and for modifying the search and caching process. With a heuristic-based search process, it extracts information about the place, date, and damages of a specific disaster, drawing on a bank of search heuristics developed for this purpose.
42	Semantic Issues for Digital Libraries Chen, Hsinchun January 2000 Artificial Intelligence Lab, Department of MIS, University of Arizona / As new and emerging classes of information systems applications become more overwhelming, pressing, and diverse, several well-known information retrieval (IR) problems have become even more urgent in this “network-centric” information age. Information overload, a result of the ease of information creation and rendering via the Internet and the World Wide Web, has become more evident in people’s lives. Significant variations of database formats and structures, the richness of information media, and an abundance of multilingual information content have also created severe information interoperability problems: structural interoperability, media interoperability, and multilingual interoperability. The conventional approaches to addressing information overload and information interoperability problems are manual in nature, requiring human experts as information intermediaries to create knowledge structures and/or ontologies. As information content and collections become even larger and more dynamic, we believe a system-aided, bottom-up artificial intelligence (AI) approach is needed. By applying scalable techniques developed in various AI subareas such as image segmentation and indexing, voice recognition, natural language processing, neural networks, machine learning, clustering and categorization, and intelligent agents, we can provide an alternative system-aided approach to addressing both information overload and information interoperability. Artificial Intelligence Information Seeking Behaviors Information Extraction
43	An intelligent personal spider (agent) for dynamic Internet/Intranet searching Chen, Hsinchun, Chung, Yi-Ming, Ramsey, Marshall C., Yang, Christopher C. 05 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / As Internet services based on the World-Wide Web become more popular, information overload has become a pressing research problem. Difficulties with search on the Internet will worsen as the amount of on-line information increases. A scalable approach to Internet search is critical to the success of Internet services and other current and future National Information Infrastructure (NII) applications. As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent personal spider (agent) approach to Internet searching. The approach, which is grounded in automatic textual analysis and general-purpose search algorithms, is expected to be an improvement over current static and inefficient Internet searches. In this experiment, we implemented Internet personal spiders based on best first search and genetic algorithm techniques. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages on the Web, based on the links and keyword indexing. A plain, static CGI/HTML-based interface was developed first and was recently enhanced with a graphical, dynamic Java-based interface. Preliminary evaluation results and two working prototypes (available for Web access) are presented. Although the examples and evaluations presented are mainly based on Internet applications, the applicability of the proposed techniques to the potentially more rewarding Intranet applications should be obvious. In particular, we believe the proposed agent design can be used to locate organization-wide information, to gather new, time-critical organizational information, and to support team-building and communication in Intranets. Internet World Wide Web Information Extraction
44	Communication roles that support collaboration during the design process Sonnenwald, Diane H. 07 1900 It is widely acknowledged that design (and development) teams increasingly include participants from different domains who must explore and integrate their specialized knowledge in order to create innovative and competitive artefacts and reduce design and development costs. Thus communication, integration of specialized knowledge, and negotiation of differences among domain specialists have emerged as fundamental components of the design process. This paper presents thirteen communication roles that emerged during four multi-disciplinary design situations in the USA and Europe. These roles supported knowledge exploration and integration, collaboration, and task and project completion by filtering and providing information and negotiating differences across organizational, task, discipline, and personal boundaries. Implications for design methods, tools and education are discussed. Information Extraction Information Analysis Information Systems
45	A Smart Itsy Bitsy Spider for the Web Chen, Hsinchun, Chung, Yi-Ming, Ramsey, Marshall C., Yang, Christopher C. January 1998 Artificial Intelligence Lab, Department of MIS, University of Arizona / As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent agent approach to Web searching. In this experiment, we developed two Web personal spiders based on best first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-based spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best first search spider, we found both results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained a significantly higher recall value than the best first search spider. However, their precision values were not statistically different. The mutation process introduced in the genetic algorithm allows users to find other potentially relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component for the design of a truly interactive and dynamic Web agent. Artificial Intelligence World Wide Web Information Extraction
46	Collaborative Information Retrieval Environment: Integration of Information Retrieval with Group Support Systems Romano, Nicholas C., Roussinov, Dmitri G., Nunamaker, Jay F., Chen, Hsinchun January 1999 Artificial Intelligence Lab, Department of MIS, University of Arizona / Observations of Information Retrieval (IR) system user experiences reveal a strong desire for collaborative search while at the same time suggesting that collaborative capabilities are rarely, and then only in a limited fashion, supported by current searching and visualization tools. Equally interesting is the fact that observations of user experiences with Group Support Systems (GSS) reveal that although access to external information and the ability to search for relevant material are often vital to the progress of GSS sessions, integrated support for collaborative searching and visualization of results is lacking in GSS systems. After reviewing user experiences described in the IR and GSS literature and observing and interviewing users of existing IR and GSS commercial and prototype systems, the authors conclude that there is an obvious demand for systems supporting multi-user IR. It is surprising to the authors that very little attention has been given to the common ground shared by these two important research domains. With this in mind, our paper describes how user experiences with IR and GSS systems have shed light on a promising new area of collaborative research and led to the development of a prototype that merges the two paradigms into a Collaborative Information Retrieval Environment (CIRE). Finally, the paper presents theory developed from initial user experiences with our prototype and describes plans to test the efficacy of this new paradigm empirically through controlled experimentation. Knowledge Management Data Mining Information Extraction
47	The Illinois Digital Library Initiative Project: Federating Repositories and Semantic Research Chen, Hsinchun January 2001 Artificial Intelligence Lab, Department of MIS, University of Arizona / The Illinois DLI Project, one of six projects funded by the NSF/DARPA/NASA DLI, consists of two major components: (1) a production testbed based in a real library (SGML publisher stream deployed at the University of Illinois at Urbana-Champaign, UIUC) and (2) fundamental technology research for semantic interoperability (semantic indexes across subjects and media developed at the University of Arizona). The Illinois DLI production testbed was developed in the Grainger Engineering library at UIUC. It supports full SGML federated structure search on an experimental Web-based interface. The initial rollout was available at the UIUC campus in October 1997 and has been integrated with the library information services. The testbed consists of materials from 5 publishers, 55 engineering journals, and 40,000 full-text articles. The testbed was implemented using SoftQuad (SGML rendering) and OpenText (full-text search), both commercial software. Human Computer Interaction Digital Libraries Information Extraction
48	Genescene: Biomedical Text And Data Mining Leroy, Gondy, Chen, Hsinchun, Martinez, Jesse D., Eggers, Shauna, Falsey, Ryan R., Kislin, Kerri L., Huang, Zan, Li, Jiexun, Xu, Jie, McDonald, Daniel M., Ng, Gavin January 2005 Artificial Intelligence Lab, Department of MIS, University of Arizona / To access the content of digital texts efficiently, it is necessary to provide more sophisticated access than keyword-based searching. Genescene provides biomedical researchers with research findings and background relations automatically extracted from text and experimental data. These provide a more detailed overview of the information available. The extracted relations were evaluated by qualified researchers and are precise. An ongoing qualitative evaluation of the current online interface indicates that this method of searching the literature is more useful and efficient than keyword-based searching. Data Mining Medical Libraries Information Extraction
49	Internet Categorization and Search: A Self-Organizing Approach Chen, Hsinchun, Schuffels, Chris, Orwig, Richard E. January 1996 Artificial Intelligence Lab, Department of MIS, University of Arizona / The problems of information overload and vocabulary differences have become more pressing with the emergence of increasingly popular Internet services. The main information retrieval mechanisms provided by the prevailing Internet WWW software are based on either keyword search (e.g., the Lycos server at CMU, the Yahoo server at Stanford) or hypertext browsing (e.g., Mosaic and Netscape). This research aims to provide an alternative concept-based categorization and search capability for WWW servers based on selected machine learning algorithms. Our proposed approach, which is grounded in automatic textual analysis of Internet documents (homepages), attempts to address the Internet search problem by first categorizing the content of Internet documents. We report results of our recent testing of a multilayered neural network clustering algorithm employing the Kohonen self-organizing feature map to categorize (classify) Internet homepages according to their content. The category hierarchies created could serve to partition the vast Internet services into subject-specific categories and databases and improve Internet keyword searching and/or browsing. Internet Information Seeking Behaviors Information Extraction
50	Automatic multi-document summarization for digital libraries Ou, Shiyan, Khoo, Christopher S.G., Goh, Dion H. January 2006 With the rapid growth of the World Wide Web and online information services, more and more information is available and accessible online. Automatic summarization is an indispensable solution for reducing the information overload problem. Multi-document summarization is useful for providing an overview of a topic and allowing users to zoom in for more details on aspects of interest. This paper reports three types of multi-document summaries generated for a set of research abstracts, using different summarization approaches: a sentence-based summary generated by the MEAD summarization system, which extracts important sentences using various features; another sentence-based summary generated by extracting research objective sentences; and a variable-based summary focusing on research concepts and relationships. A user evaluation was carried out to compare the three types of summaries. The evaluation results indicated that the majority of users (70%) preferred the variable-based summary, while 55% of the users preferred the research objective summary, and only 25% preferred the MEAD summary. Information Extraction Digital Libraries Natural Language Processing

Search results