11

The Academic Web Link Database Project

Thelwall, Mike, Binns, Ray, Harries, Gareth, Page-Kennedy, Teresa, Li, Xuemei, Musgrove, Peter, Price, Liz, Wilkinson, David January 2002 (has links)
This project was created in response to the need for research into web links, including web link mining and the creation of link metrics. It aims to provide the raw data and software for researchers to analyse link structures without having to rely on commercial search engines and without having to run their own web crawler. The site contains all of the following:
* Complete databases of the link structures of collections of academic web sites.
* Files of summary statistics about the link databases.
* Software tools for researchers to extract the information they are particularly interested in.
* Descriptions of the methodologies used to crawl the web, so that the information provided can be critically evaluated.
* Files of information used in the web crawling process.
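As a hedged illustration of what a researcher might do with such a link database, the sketch below counts inlinks per target page from an exported edge list. The sample edges, their (source, target) layout, and the Python tooling are assumptions for illustration, not the project's own formats or software.

```python
from collections import Counter

def inlink_counts(edges):
    """Count inlinks (in-degree) per target page, one of the simplest link metrics."""
    return Counter(target for _, target in edges)

# Hypothetical rows from an exported link database: (source page, target page).
edges = [
    ("uni-a.ac.uk/dept", "uni-b.ac.uk/"),
    ("uni-c.ac.uk/", "uni-b.ac.uk/"),
    ("uni-b.ac.uk/", "uni-a.ac.uk/dept"),
]

for page, count in inlink_counts(edges).most_common():
    print(page, count)
```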
12

Bibliomining for Automated Collection Development in a Digital Library Setting: Using Data Mining to Discover Web-Based Scholarly Research Works

Nicholson, Scott 12 1900 (has links)
Based on Nicholson's 2000 University of North Texas dissertation, "Creating a Criterion-Based Information Agent through Data Mining for Automated Identification of Scholarly Research on the World Wide Web," located at http://scottnicholson.com/scholastic/finaldiss.doc / This research creates an intelligent agent for automated collection development in a digital library setting. It uses a predictive model based on facets of each Web page to select scholarly works. The criteria came from the academic library selection literature, and a Delphi study was used to refine the list to 41 criteria. A Perl program was designed to analyze a Web page for each criterion and was applied to a large collection of scholarly and non-scholarly Web pages. Bibliomining, or data mining for libraries, was then used to create different classification models. Four techniques were used: logistic regression, non-parametric discriminant analysis, classification trees, and neural networks. Accuracy and return were used to judge the effectiveness of each model on test datasets. In addition, a set of problematic pages, difficult to classify because of their similarity to scholarly research, was gathered and classified using the models. The resulting models could be used in the selection process to automatically create a digital library of Web-based scholarly research works. In addition, the technique can be extended to create a digital library of any type of structured electronic information.
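The classification step can be sketched as follows, assuming each page has already been scored on the 41 criteria and encoded as a feature vector. The synthetic data, scikit-learn, and the choice of logistic regression (one of the four techniques named above) are illustrative assumptions, not the original Perl and statistical tooling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in data: one row per page, one column per criterion score.
X = rng.integers(0, 2, size=(500, 41)).astype(float)
y = rng.integers(0, 2, size=500)  # 1 = scholarly, 0 = non-scholarly

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# "Accuracy" as in the abstract; precision/recall could stand in for "return".
print("accuracy:", model.score(X_test, y_test))
```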
13

Using Coplink to Analyze Criminal-Justice Data

Hauck, Roslin V., Atabakhsh, Homa, Ongvasith, Pichai, Gupta, Harsh, Chen, Hsinchun 03 1900 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / As information technologies and applications become more overwhelming and diverse, persistent information overload problems have become ever more urgent. Fallout from this trend has most affected government, specifically criminal-justice information systems. The explosive growth in the digital information maintained in the data repositories of federal, state, and local criminal-justice entities, and the spiraling need for cross-agency access to that information, have made utilizing it both increasingly urgent and increasingly difficult. The Coplink system applies a concept space (a statistics-based, algorithmic technique that identifies relationships between suspects, victims, and other pertinent data) to accelerate criminal investigations and enhance law enforcement efforts.
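A minimal sketch of the concept-space idea, under the assumption that relationship strength between entities is estimated from how often they co-occur in the same incident records. The record layout and the raw counting are illustrative simplifications, not the Coplink implementation, which weights such co-occurrences statistically.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical incident records, each listing the entities it mentions.
incidents = [
    {"suspect:Smith", "victim:Jones", "vehicle:ABC123"},
    {"suspect:Smith", "suspect:Doe"},
    {"suspect:Doe", "victim:Jones"},
]

cooccurrence = defaultdict(int)
for record in incidents:
    for a, b in combinations(sorted(record), 2):
        cooccurrence[(a, b)] += 1  # raw count; a concept space would weight this statistically

# Rank candidate relationships for an entity of interest.
target = "suspect:Smith"
links = [(pair, n) for pair, n in cooccurrence.items() if target in pair]
for pair, n in sorted(links, key=lambda x: -x[1]):
    print(pair, n)
```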
14

Indexing the Internet

Hubbard, John 11 1900 (has links)
This essay analyzes the question of how best to index the Internet.
15

Digital Library Archeology: A Conceptual Framework for Understanding Library Use through Artifact-Based Evaluation

Nicholson, Scott January 2005 (has links)
Archeologists have used material artifacts found in a physical space to gain an understanding of the people who occupied that space. Likewise, as users wander through a digital library, they leave behind data-based artifacts of their activity in the virtual space. Digital library archeologists can gather these artifacts and employ inductive techniques, such as bibliomining, to create generalizations. These generalizations are the basis for hypotheses, which are tested to gain understanding about library services and users. In this article, the development of traditional archeological methods is presented and used to create a conceptual framework for artifact-based evaluation in digital libraries.
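As a hedged sketch of gathering one kind of data-based artifact, the snippet below aggregates a hypothetical access log into per-item usage counts, the sort of raw material bibliomining generalizes from. The log columns and CSV format are assumptions, not a prescribed digital-library log schema.

```python
import csv
import io
from collections import Counter

# Hypothetical access-log artifacts; a real system would export something similar.
log = io.StringIO(
    "timestamp,session,item\n"
    "2005-01-03T10:01,s1,thesis-42\n"
    "2005-01-03T10:05,s1,thesis-7\n"
    "2005-01-03T11:20,s2,thesis-42\n"
)

# Per-item usage counts: the kind of generalization that seeds hypotheses about users.
usage = Counter(row["item"] for row in csv.DictReader(log))
print(usage.most_common())
```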
16

Special Issue Digital Government: technologies and practices

Chen, Hsinchun 02 1900 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / The Internet is changing the way we live and do business. It also offers a tremendous opportunity for government to better deliver its content and services and to interact with its many constituents: citizens, businesses, and other government partners. In addition to providing information, communication, and transaction services, exciting and innovative transformations could occur with the new technologies and practices.
17

Web Searching, Search Engines and Information Retrieval

Lewandowski, Dirk January 2005 (has links)
This article discusses Web search engines, focusing on the challenges of indexing the World Wide Web, user behaviour, and the ranking factors used by these engines. Ranking factors are divided into query-dependent and query-independent factors, the latter of which have become increasingly important in recent years. The possibilities of these factors are limited, especially those based on the widely used link-popularity measures. The article concludes with an overview of factors that should be considered to determine the quality of Web search engines.
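The query-dependent/query-independent split can be sketched as a linear combination of two scores. The toy text-match score, the normalized popularity value, and the weight alpha are illustrative assumptions, not any engine's actual ranking formula.

```python
def query_dependent_score(query_terms, doc_terms):
    """Toy text-match score: fraction of query terms appearing in the document."""
    hits = sum(1 for t in query_terms if t in doc_terms)
    return hits / max(len(query_terms), 1)

def rank(query_terms, docs, alpha=0.7):
    """Combine query-dependent and query-independent evidence.
    docs: list of (doc_id, doc_terms, popularity) with popularity in [0, 1],
    e.g. a normalized PageRank-style link-popularity value."""
    scored = []
    for doc_id, terms, popularity in docs:
        score = alpha * query_dependent_score(query_terms, terms) + (1 - alpha) * popularity
        scored.append((score, doc_id))
    return sorted(scored, reverse=True)

docs = [
    ("a", {"web", "search", "ranking"}, 0.9),
    ("b", {"web", "indexing"}, 0.4),
]
print(rank({"web", "ranking"}, docs))
```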
18

Extraction de données à partir du Web (Data Extraction from the Web)

Achir, Badr 07 1900 (has links) (PDF)
The Web has become rich in information circulating worldwide via the Internet, which has driven the expansion of large quantities of data. Moreover, these data are often unstructured and difficult to use in Web applications. On the one hand, users' interest in exploiting these data has grown competitively; on the other hand, the data are not easy for humans to consult. This interest has motivated researchers to devise approaches for extracting data from the Web, leading to the emergence of wrappers. A wrapper is based on a set of extraction rules that define the location of the data to be extracted within a document. Several tools exist for constructing these rules. Our work addresses the problem of extracting data from the Web. In this document, we propose a method for extracting data from the Web that uses machine learning to construct the extraction rules. The extraction results of our approach demonstrate strong extraction precision and better performance in the learning process. Using our tool in an application for querying data sources made it possible to meet users' needs in a very simple and automatic way. ______________________________________________________________________________ AUTHOR KEYWORDS: extraction, wrappers, extraction rules, machine learning, Web, Web applications.
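A minimal sketch of a wrapper, assuming an extraction rule is a (prefix, suffix) delimiter pair located around the data and "learned" from one labeled example. This illustrates the wrapper idea only, not the machine-learning rule construction proposed in the thesis.

```python
def learn_rule(page, example_value):
    """Infer a (prefix, suffix) extraction rule from one labeled example:
    the text immediately surrounding the known value in the page."""
    i = page.index(example_value)
    prefix = page[max(0, i - 10):i]
    suffix = page[i + len(example_value):i + len(example_value) + 4]
    return prefix, suffix

def apply_rule(page, rule):
    """Extract every value that appears between the learned delimiters."""
    prefix, suffix = rule
    values, start = [], 0
    while (i := page.find(prefix, start)) != -1:
        begin = i + len(prefix)
        end = page.find(suffix, begin)
        if end == -1:
            break
        values.append(page[begin:end])
        start = end
    return values

page = "<li>Price: <b>10</b></li><li>Price: <b>25</b></li>"
rule = learn_rule(page, "10")
print(rule, apply_rule(page, rule))  # ('Price: <b>', '</b>') ['10', '25']
```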
19

Benefits of the application of web-mining methods and techniques for the field of analytical customer relationship management of the marketing function in a knowledge management perspective

Ertz, Myriam 12 1900 (has links) (PDF)
Web Mining (WM) remains a relatively little-known technology. When used appropriately, however, it proves highly useful for identifying the profiles and behaviours of prospective and existing customers in an Internet context, and its technical advances greatly improve the analytical side of Customer Relationship Management (CRM). This study follows an exploratory approach to determine whether WM alone achieves all the fundamental objectives of CRM or whether it should instead be used jointly with traditional marketing research and the classical methods of analytical CRM (aCRM) to optimize CRM, and hence marketing, in an Internet context. The knowledge obtained through WM can then be administered within the organization in a Knowledge Management (KM) framework in order to optimize relationships with new and/or existing customers, improve their customer experience, and ultimately deliver better value to them. Within an exploratory research design, semi-structured in-depth interviews were conducted to obtain the views of several experts in (web) data mining. The study revealed that WM is well suited to segmenting prospective and existing customers, to understanding the online transactional behaviour of existing and prospective customers, and to determining the loyalty (or defection) status of existing customers. As such, it is a formidably effective predictive tool through classification and estimation, and a descriptive one through segmentation and association. On the other hand, WM is less effective at understanding the underlying, less obvious dimensions of customer behaviour, and it is less appropriate for objectives related to describing how existing or prospective customers develop loyalty, satisfaction, defection, or attachment towards a brand on the Internet. This exercise is all the more difficult because the multichannel communication in which consumers evolve strongly influences the relationships they develop with a brand; online behaviour may thus be only a transposition, or at least an extension, of consumers' offline behaviour. WM is also a relatively incomplete tool for identifying the development of defection to and from competitors, as well as the development of loyalty towards them. WM therefore still needs to be complemented by traditional marketing research to achieve these more difficult but essential aCRM objectives. Finally, the conclusions of this research are directed mainly at firms and managers rather than at online customers, since the former, more than the latter, possess the resources and processes needed to implement the WM research projects described. ______________________________________________________________________________ AUTHOR KEYWORDS: Web mining, Knowledge management, Customer relationship management, Internet data, Consumer behaviour, Data mining, Consumer knowledge.
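As a hedged illustration of one aCRM task the experts judged WM well suited to, the sketch below segments customers by clustering clickstream-derived features. The features, the k-means method, and scikit-learn are illustrative assumptions, not the study's own procedure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical per-customer features mined from web data:
# sessions per month, pages per session, purchases per month.
X = rng.gamma(shape=2.0, scale=2.0, size=(200, 3))

segments = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(
    StandardScaler().fit_transform(X)
)

# Profile each segment by its average raw behaviour.
for k in range(4):
    print(k, X[segments == k].mean(axis=0).round(2))
```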
20

Web Intelligence for Scaling Discourse of Organizations

January 2016 (has links)
Internet and social-media devices have created a new public space for debate on political and social topics (Papacharissi 2002; Himelboim 2010). Hotly debated issues span all spheres of human activity: liberal vs. conservative politics, radical vs. counter-radical religious debate, the climate-change debate in the scientific community, the globalization debate in economics, and the nuclear-disarmament debate in security. Many prominent 'camps' have emerged within Internet debate rhetoric and practice (Dahlberg, n.d.). In this research I used feature extraction and model-fitting techniques to process the rhetoric found on the web sites of 23 Indonesian Islamic religious organizations, and later of 26 similar organizations from the United Kingdom, to profile their ideology and activity patterns along a hypothesized radical/counter-radical scale, and I present an end-to-end system that helps researchers visualize the data interactively on a timeline. The subject data of this study are the articles downloaded from the web sites of these organizations, dating from 2001 to 2011 and from 2013. I developed algorithms to rank these organizations by assigning them to probable positions on the scale, and I showed that the fitted Rasch model fits the data using Andersen's likelihood-ratio (LR) test. I created a gold-standard ranking of these organizations through an expertise-elicitation tool, computed expert-to-expert agreement, and presented experimental results comparing three baseline methods, showing that the Rasch model not only outperforms the baselines but is also the only system that performs at expert-level accuracy. The end-to-end system receives a list of organizations from experts, mines their web corpus, prepares discourse topic lists with expert support, ranks the organizations on the scale with partial expert interaction, and finally presents them in an easy-to-use web-based analytic system. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2016
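A minimal sketch of the dichotomous Rasch model behind the ranking step, assuming each organization is scored 1 or 0 on whether its discourse engages each topic. The synthetic data and the joint maximum-likelihood fit by gradient ascent are illustrative assumptions, not the dissertation's pipeline or its Andersen LR-test validation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 1/0 matrix: rows = organizations, columns = discourse topics.
n_orgs, n_topics = 12, 30
true_theta = rng.normal(size=n_orgs)   # latent positions on the scale
true_b = rng.normal(size=n_topics)     # topic "difficulties"
P = 1 / (1 + np.exp(-(true_theta[:, None] - true_b[None, :])))
X = (rng.random((n_orgs, n_topics)) < P).astype(float)

# Joint maximum-likelihood estimation of the Rasch model by gradient ascent:
# P(x_ij = 1) = sigmoid(theta_i - b_j).
theta = np.zeros(n_orgs)
b = np.zeros(n_topics)
for _ in range(2000):
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    resid = X - p
    theta += 0.01 * resid.sum(axis=1)   # gradient of the log-likelihood w.r.t. theta
    b -= 0.01 * resid.sum(axis=0)       # gradient w.r.t. b has the opposite sign
    b -= b.mean()                       # fix the scale's origin to identify the model

# Organizations ordered by estimated position on the radical/counter-radical scale.
print(np.argsort(theta))
```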
