Global ETD Search

1	"Aplicação de técnicas de data mining em logs de servidores web" [Application of data mining techniques to web server logs] Chiara, Ramon 09 May 2003 With the advent of the Internet, companies gained the ability to present themselves to the world. The possibility of putting a business on the World Wide Web (WWW) created a new type of data that companies can use to further improve their knowledge of the market: the sequence of clicks a user makes on a site. This data can be stored in a kind of data warehouse and analyzed with techniques for knowledge discovery in databases. Research is therefore needed to show how to extract knowledge from these clickstreams. This work discusses and analyzes some of the techniques used to achieve this goal. A tool is proposed in which clickstream data are mapped to the attribute-value format used by the Discover System, a system under development in our laboratory for planning and running experiments related to the learning algorithms used during the data mining phase of the knowledge discovery in databases process. In addition, the use of the Inductive Logic Programming system Progol is proposed to extract relational knowledge from the clickstream sessions that characterize users' interaction with the pages visited on the site. Initial experiments with a real clickstream were carried out using Progol and some of the facilities already implemented in the Discover System. Data Mining Web Mining
2	How Much of It is Real? Analysis of Paid Placement in Web Search Engine Results Nicholson, Scott, Sierra, Tito, Eseryel, U. Yeliz, Park, Ji-Hong, Barkow, Philip, Pozo, Erika J., Ward, Jane January 2005 Most Web search tools integrate sponsored results with results from their internal editorial database in providing results to users. The goal of this research is to get a better idea of how much of the screen real estate displays "real" editorial results as compared to sponsored results. The overall average results are that 40% of all results presented on the first screen are "real" results, and when the entire first Web page is considered, 67% of the results are non-sponsored results. For general search tools like Google, 56% of the first screen and 82% of the first Web page contain non-sponsored results. Other results include that query structure makes a significant difference in the percentage of non-sponsored results returned by a search. Similarly, the topic of the query can also have a significant effect on the percentage of sponsored results displayed by most Web search tools. Web Mining Internet
3	A Proposal for Categorization and Nomenclature for Web Search Tools Nicholson, Scott January 2000 Also published in Journal of Internet Cataloging, 2(3/4), 9-28, 2000 / Ambiguities in Web search tool (more commonly known as "search engine") terminology are problematic when conducting precise, replicable research or when teaching others to use search tools. Standardized terminology would enable Web searchers to be aware of subtle differences between Web search tools and the implications of these for searching. A categorization and nomenclature for standardized classifications of different aspects of Web search tools is proposed, and advantages and disadvantages of using tools in each category are discussed. Web Mining Internet
4	HelpfulMed: Intelligent Searching for Medical Information over the Internet Chen, Hsinchun, Lally, Ann M., Zhu, Bin, Chau, Michael 05 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / Medical professionals and researchers need information from reputable sources to accomplish their work. Unfortunately, the Web has a large number of documents that are irrelevant to their work, even those documents that purport to be "medically-related." This paper describes an architecture designed to integrate advanced searching and indexing algorithms, an automatic thesaurus, or "concept space," and Kohonen-based Self-Organizing Map (SOM) technologies to provide searchers with fine-grained results. Initial results indicate that these systems provide complementary retrieval functionalities. HelpfulMed not only allows users to search Web pages and other online databases, but also allows them to build searches through the use of an automatic thesaurus and browse a graphical display of medical-related topics. Evaluation results for each of the different components are included. Our spidering algorithm outperformed both breadth-first search and PageRank spiders on a test collection of 100,000 Web pages. The automatically generated thesaurus performed as well as both MeSH and UMLS, systems which require human mediation for currency. Lastly, a variant of the Kohonen SOM was comparable to MeSH terms in perceived cluster precision and significantly better at perceived cluster recall. Web Mining Medical Libraries
5	Design and evaluation of a multi-agent collaborative Web mining system Chau, Michael, Zeng, Daniel, Chen, Hsinchun, Huang, Michael, Hendriawan, David 04 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / Most existing Web search tools work only with individual users and do not help a user benefit from previous search experiences of others. In this paper, we present the Collaborative Spider, a multi-agent system designed to provide post-retrieval analysis and enable across-user collaboration in Web search and mining. This system allows the user to annotate search sessions and share them with other users. We also report a user study designed to evaluate the effectiveness of this system. Our experimental findings show that subjects' search performance was degraded, compared to individual search scenarios in which users had no access to previous searches, when they had access to a limited number (e.g., 1 or 2) of earlier search sessions done by other users. However, search performance improved significantly when subjects had access to more search sessions. This indicates that gain from collaboration through collaborative Web searching and analysis does not outweigh the overhead of browsing and comprehending other users' past searches until a certain number of shared sessions have been reached. In this paper, we also catalog and analyze several different types of user collaboration behavior observed in the context of Web mining. Web Mining Internet
6	Indexing and Abstracting on the World Wide Web: An Examination of Six Web Databases Nicholson, Scott January 1997 Web databases, commonly known as search engines or web directories, are currently the most useful way to search the Internet. In this article, the author draws from library literature to develop a series of questions that can be used to analyze these web searching tools. Six popular web databases are analyzed using this method. Using this analysis, the author creates three categories for web databases and explores the most appropriate searches to perform with each. The work concludes with a proposal for the ideal web database. Web Mining Internet
7	"Aplicação de técnicas de data mining em logs de servidores web" [Application of data mining techniques to web server logs] Ramon Chiara 09 May 2003 With the advent of the Internet, companies gained the ability to present themselves to the world. The possibility of putting a business on the World Wide Web (WWW) created a new type of data that companies can use to further improve their knowledge of the market: the sequence of clicks a user makes on a site. This data can be stored in a kind of data warehouse and analyzed with techniques for knowledge discovery in databases. Research is therefore needed to show how to extract knowledge from these clickstreams. This work discusses and analyzes some of the techniques used to achieve this goal. A tool is proposed in which clickstream data are mapped to the attribute-value format used by the Discover System, a system under development in our laboratory for planning and running experiments related to the learning algorithms used during the data mining phase of the knowledge discovery in databases process. In addition, the use of the Inductive Logic Programming system Progol is proposed to extract relational knowledge from the clickstream sessions that characterize users' interaction with the pages visited on the site. Initial experiments with a real clickstream were carried out using Progol and some of the facilities already implemented in the Discover System. Data Mining Web Mining
8	Introduction to the JASIST Special Topic Section on Web Retrieval and Mining: A Machine Learning Perspective Chen, Hsinchun 05 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / Research in information retrieval (IR) has advanced significantly in the past few decades. Many tasks, such as indexing and text categorization, can be performed automatically with minimal human effort. Machine learning has played an important role in such automation by learning various patterns such as document topics, text structures, and user interests from examples. In recent years, it has become increasingly difficult to search for useful information on the World Wide Web because of its large size and unstructured nature. Useful information and resources are often hidden in the Web. While machine learning has been successfully applied to traditional IR systems, applying these algorithms to the Web poses new challenges because of its large size, link structure, diversity in content and languages, and dynamic nature. On the other hand, such characteristics of the Web also provide interesting patterns and knowledge that are not present in traditional information retrieval systems. Web Mining World Wide Web
9	Comparison of Three Vertical Search Spiders Chau, Michael, Chen, Hsinchun 05 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / Spiders are the software agents that search engines use to collect content for their databases. We investigated algorithms to improve the performance of vertical search engine spiders. The investigation addressed three approaches: a breadth-first graph-traversal algorithm with no heuristics to refine the search process, a best-first traversal algorithm that used a hyperlink-analysis heuristic, and a spreading-activation algorithm based on modeling the Web as a neural network. Web Mining Internet Data Mining
10	Special issue: "Web retrieval and mining" Chen, Hsinchun 04 1900 Artificial Intelligence Lab, Department of MIS, University of Arizona / Search engines and data mining are two research areas that have experienced significant progress over the past few years. Overwhelming acceptance of the Internet as a primary medium for content delivery and business transactions has created unique opportunities and challenges for researchers. The richness of the web's multimedia content, the reach and timeliness of web-based publication, the proliferation of e-commerce activities and the potential for wireless web delivery have generated many interesting research problems. Technical, system, organizational and social research approaches are all needed to address these research problems. Many interesting web retrieval and mining research topics have emerged recently. These include, but are not limited to, the following: text and data mining on the web, web visualization, web intelligence and agents, web-based decision support and knowledge management, wireless web retrieval and visualization, web-based usability methodology, web-based analysis for eCommerce applications. This special issue consists of nine papers that report research in web retrieval and mining. Web Mining World Wide Web

Search results