  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Facilitating Knowledge Discovery by Mining the Content and Link Structure of the Web

Qin, Jialun January 2006 (has links)
Given the vast amount of online information covering almost all aspects of human endeavor, the Internet, especially the Web, is clearly a fertile ground for data mining research from which to extract valuable knowledge. Web mining is the application of data mining techniques to extract knowledge from Web data, including Web documents, Web hyperlink structure, and Web usage logs. Traditional Web mining research has been mainly focused on addressing the information overload problem. Many information retrieval (IR) and artificial intelligence (AI) techniques have been adopted or developed to identify relevant information from the Web to meet users' specific information needs. However, most existing studies do not fully explore the social and behavioral aspects of the Web. Thus, the primary goal of this dissertation is to develop an integrated research framework that extends traditional Web mining methodologies to fully explore the technical, social, and behavioral aspects of Web knowledge discovery. My dissertation framework is composed of technical and social/behavioral components. In the technical component of my dissertation, a set of domain specific Web collection building, Web content and link structure mining, and Web knowledge presentation techniques were developed. These techniques were tested in a series of case studies to demonstrate their effectiveness and efficiency in facilitating knowledge discovery in various domains. The social/behavioral component of my dissertation is to explore the application of Web mining technology as a new means to study the social interactions and behavior of Web content providers and users. Several case studies were conducted to extract knowledge on covert organizations' resource allocation plans, information management policies, and technical sophistication using Web mining techniques. Such knowledge would be very difficult to obtain through other means. The major contributions of this dissertation are twofold. First, it proposed a set of new Web mining techniques that can help facilitate knowledge discovery in various domains. Second, it demonstrated the effectiveness and efficiency of applying Web mining techniques in extracting social and behavioral knowledge in different contexts.
52

A New Reactive Method For Processing Web Usage Data

Bayir, Murat Ali 01 July 2007 (has links) (PDF)
In this thesis, a new reactive session reconstruction method, 'Smart-SRA', is introduced. Web usage mining is a type of web mining that exploits data mining techniques to discover valuable information from the navigation of Web users. As in classical data mining, data processing and pattern discovery are the main issues in web usage mining. The first phase of web usage mining is the data processing phase, which includes session reconstruction. Session reconstruction is the most important task of web usage mining since it directly affects the quality of the frequent patterns extracted at the final step. Session reconstruction methods can be classified into two categories, namely 'reactive' and 'proactive', with respect to the data source and the data processing time. If the user requests are processed after the server handles them, the technique is called 'reactive', while in 'proactive' strategies this processing occurs during the interactive browsing of the web site. Smart-SRA is a reactive session reconstruction technique, which uses web log data and the site topology. In order to compare Smart-SRA with previous reactive methods, a web agent simulator has been developed. Our agent simulator models the behavior of web users and generates web user navigations as well as the log data kept by the web server. In this way, the actual user sessions are known and the success of different techniques can be compared. In this thesis, it is shown that the sessions generated by Smart-SRA are more accurate than the sessions constructed by previous heuristics.
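To make the idea of reactive session reconstruction concrete, the following is a minimal sketch of a generic time-out heuristic applied to server log entries after the fact. It is not Smart-SRA (which also uses the site topology) and is not taken from the thesis; the function name `sessionize`, the tuple layout, and the 30-minute threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict


def sessionize(requests, timeout=1800):
    """Split each user's requests into sessions using an inactivity time-out.

    requests: iterable of (user_id, timestamp_seconds, url) tuples (hypothetical layout).
    timeout:  largest allowed gap, in seconds, between consecutive requests
              of one session (1800 s is an illustrative choice, not a thesis value).
    """
    # Group the log entries by user.
    by_user = defaultdict(list)
    for user, ts, url in requests:
        by_user[user].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by timestamp
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions


# Toy log: user u1 returns after a long pause, so two sessions are produced.
log = [
    ("u1", 0, "/home"),
    ("u1", 60, "/products"),
    ("u1", 4000, "/home"),
    ("u2", 10, "/about"),
]
result = sessionize(log)
```

A purely time-based split like this ignores whether consecutive pages are actually linked, which is the kind of limitation that topology-aware methods aim to address.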
53

Minería y Personalización de un Sitio Web para Celulares / Mining and Personalization of a Website for Mobile Phones

Villar Escobar, Osvaldo Pablo January 2007 (has links)
No description available.
54

Automatic Web widgets prediction for Web 2.0 access technologies

Chen, Alex Qiang January 2013 (has links)
The World Wide Web (Web) has evolved from a collection of static pages that need reloading every time the content changes, into dynamic pages where parts of the page update independently, without reloading it. As such, users are required to work with dynamic pages with components that react to events either from human interaction or machine automation. Often elderly and visually impaired users are the most disadvantaged when dealing with this form of interaction. Operating widgets requires the user to have the conceptual design knowledge of the widget to complete the task. Users must have prior experience with the widget or have to learn to operate it independently, because often no user documentation is available. An automated Widget Prediction Framework (WPF) is proposed to address the issues discussed. It is a pre-emptive approach that predicts different types of widget and their locations in the page. Widgets with similar characteristics and functionalities are categorised based on a definition provided by widget design pattern libraries. Some design patterns are more loosely defined than others, and this causes confusion and ambiguity when identifying them. A formal method to model widgets based on a Widget Ontology was developed. The paradigm of the ontology provides a framework for developers to communicate their ideas, while reducing ambiguity between different types of widget. A Widget Prediction System (WPS) was developed using the concepts of the WPF. To select the types of widget for WPS evaluation, a widget popularity investigation was conducted. Seven of the most popular widgets from the investigation, done across fifty Websites, were selected. To demonstrate how WPF can be applied to predict widgets, fifty websites were used to evaluate the feasibility of the approach using WPS. On average, WPS achieved 61.98% prediction accuracy, with two of the widgets exceeding 84% accuracy. These results demonstrated the feasibility of the framework as the backend for tools that support elderly or visually impaired users.
55

Web Usage Mining / Web Usage Mining

Benkovská, Petra January 2007 (has links)
General characteristics of web mining, including the methodology and procedures covered by this term. Relation to other areas (data mining, artificial intelligence, statistics, databases, internet technologies, management, etc.). Web usage mining: data sources, data pre-processing, characterization of analytical methods and tools, interpretation of outputs (results), and possible areas of use, including examples. A proposed solution method, its implementation, and an interpretation of the outputs of a concrete example using the above-mentioned methods of web usage mining.
56

Analýza WWW stránek na základě klíčových slov / Analysis of WWW pages on the basis of keywords

Doležal, Petr January 2013 (has links)
This thesis is about keyword research for the optimization of web pages. It focuses mainly on on-page factors connected with keywords and key phrases. The goal of the thesis is to design a methodology for keyword research and to evaluate it on the basis of practical usage. The design of the methodology is based on theoretical and practical knowledge and experience, and it is the main contribution of the thesis. Keyword research has very wide usage: it forms the basis for creating a website and its content. Anyone who wants to promote a website in a competitive area, get to know customers better, or generate higher profit should learn the methodology, because it is beneficial for all these and other reasons. The first part gives the theoretical basis. It shows the position of keyword research in SEO and in web mining. The important terms connected with keywords are then defined. Keyword properties and methods for obtaining keywords for website optimization are also described. Various targets and the metrics for measuring them are then introduced. Next, the most important tools for keyword research and the characteristics of the two most used search engines in the Czech Republic (Google and Seznam) are described. The second part contains the design of the methodology for keyword research, which consists of data collection, website optimization, and monitoring of changes. The individual processes of the methodology are described in practical experiments on particular web pages. The effectiveness of the methodology is evaluated at the end.
57

A Study on Web Search based on Coordinate Relationships / 同位関係に基づくウェブ検索に関する研究

Meng, Zhao 23 September 2016 (has links)
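Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Kō No. 20030 / Jōhaku No. 625 / 新制||情||109 (University Library) / 33126 / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Katsumi Tanaka, Professor Masatoshi Yoshikawa, Professor Sadao Kurohashi / Pursuant to Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM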
58

Ontology-Based Semantic Web Mining Challenges: A Literature Review

March, Christopher January 2023 (has links)
The Semantic Web is an extension of the current web that provides a standard structure for data representation and reasoning, allowing content to be readable for both humans and machines in a form known as ontological knowledge bases. The goal of the Semantic Web is to be used in large-scale technologies or systems such as search engines, healthcare systems, and social media platforms. Some challenges may deter further progress in the development of the Semantic Web and the associated web mining processes. In this review paper, an overview of Semantic Web mining will examine and analyze challenges with data integration, dynamic knowledge-based methods, efficiencies, and data mining algorithms regarding ontological approaches. Then, a review of recent solutions to these challenges, such as clustering, classification, association rule mining, and ontology-building aids that overcome the challenges, will be discussed and analyzed.
59

Effective web log mining and online navigational pattern prediction

Guerbas, A., Addam, O., Zaarour, O., Nagi, Mohamad, Elhajj, Ahmad, Ridley, Mick J., Alhajj, R. 09 1900 (has links)
Accurate web log mining results and efficient online navigational pattern prediction are undeniably crucial for tuning up websites and consequently helping in visitors' retention. Like any other data mining task, web log mining starts with data cleaning and preparation and ends up discovering some hidden knowledge which cannot be extracted using conventional methods. In order for this process to yield good results, it has to rely on good-quality input data. Therefore, more focus in this process should be on data cleaning and pre-processing. On the other hand, one of the challenges facing online prediction is scalability. As a result, any improvement in the efficiency of online prediction solutions is more than necessary. As a response to the aforementioned concerns, we are proposing an enhancement to the web log mining process and to the online navigational pattern prediction. Our contribution contains three different components. First, we are proposing a refined time-out based heuristic for session identification. Second, we are suggesting the usage of a specific density-based algorithm for navigational pattern discovery. Finally, a new approach for efficient online prediction is also suggested. The conducted experiments demonstrate the applicability and effectiveness of the proposed approach. (C) 2013 Elsevier B.V. All rights reserved.
60

Resource-Bounded Information Acquisition and Learning

Kanani, Pallika H 01 May 2012 (has links)
In many scenarios it is desirable to augment existing data with information acquired from an external source. For example, information from the Web can be used to fill missing values in a database or to correct errors. In many machine learning and data mining scenarios, acquiring additional feature values can lead to improved data quality and accuracy. However, there is often a cost associated with such information acquisition, and we typically need to operate under limited resources. In this thesis, I explore different aspects of Resource-bounded Information Acquisition and Learning. The process of acquiring information from an external source involves multiple steps, such as deciding what subset of information to obtain, locating the documents that contain the required information, acquiring relevant documents, extracting the specific piece of information, and combining it with existing information to make useful decisions. The problem of Resource-bounded Information Acquisition (RBIA) involves saving resources at each stage of the information acquisition process. I explore four special cases of the RBIA problem, propose general principles for efficiently acquiring external information in real-world domains, and demonstrate their effectiveness using extensive experiments. For example, in some of these domains I show how interdependency between fields or records in the data can also be exploited to achieve cost reduction. Finally, I propose a general framework for RBIA that takes into account the state of the database at each point in time, dynamically adapts to the results of all the steps in the acquisition process so far, as well as the properties of each step, and carries them out striving to acquire the most information with the least amount of resources.
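As a rough illustration of acquiring information under a limited budget, the sketch below greedily picks acquisition actions by estimated benefit per unit cost. This is a simple stand-in chosen for illustration, not the framework proposed in the thesis; the function `select_queries`, the action names, and the benefit and cost figures are all hypothetical.

```python
def select_queries(candidates, budget):
    """Greedily choose acquisition actions by benefit-per-cost within a budget.

    candidates: list of (name, estimated_benefit, cost) tuples (hypothetical values).
    budget:     total cost that may be spent.
    Returns the chosen action names and the total cost spent.
    """
    # Rank actions from highest to lowest estimated benefit per unit cost.
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, spent = [], 0.0
    for name, benefit, cost in ranked:
        # Take the action only if it still fits in the remaining budget.
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent


# Made-up acquisition actions with assumed benefits and costs.
candidates = [
    ("lookup_missing_email", 0.9, 3.0),
    ("verify_affiliation", 0.4, 1.0),
    ("fetch_homepage", 0.5, 2.0),
]
chosen, spent = select_queries(candidates, budget=4.0)
```

A static ranking like this does not adapt to what earlier acquisitions return; a dynamic approach would re-estimate the benefits after each step.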
