Global ETD Search

181	Meta-učení v oblasti dolování dat / Meta-Learning in the Area of Data Mining Kučera, Petr January 2013 (has links) This paper describes the use of meta-learning in the area of data mining. It describes the problems and tasks of data mining where meta-learning can be applied, with a focus on classification. It provides an overview of meta-learning techniques and their possible application in data mining, especially model selection. It describes the design and implementation of a meta-learning system to support classification tasks in data mining. The system uses statistics and information theory to characterize data sets stored in the meta-knowledge base. The meta-classifier is created from this base and predicts the most suitable model for a new data set. The conclusion discusses the results of experiments with more than 20 data sets representing classification tasks from different areas and suggests possible extensions of the project.
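The abstract above does not list the concrete measures or the meta-classifier used in the thesis. As a minimal sketch, assuming a handful of common statistical and information-theoretic meta-features and a 1-nearest-neighbour meta-classifier, the idea of describing a data set and looking up a suitable model might look like this. All function names, the meta-feature set, and the entries in the knowledge base are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def class_entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def meta_features(rows, labels):
    """Describe a numeric data set by a small vector of meta-features."""
    columns = list(zip(*rows))
    return (
        float(len(rows)),                       # number of examples
        float(len(columns)),                    # number of attributes
        float(len(set(labels))),                # number of classes
        class_entropy(labels),                  # information-theoretic measure
        mean(pstdev(col) for col in columns),   # average attribute spread
    )


def recommend_model(new_features, knowledge_base):
    """1-nearest-neighbour lookup over (meta_features, best_model) pairs."""
    nearest = min(knowledge_base, key=lambda entry: math.dist(entry[0], new_features))
    return nearest[1]


# Hypothetical meta-knowledge base: values and model names are made up.
knowledge_base = [
    ((150.0, 4.0, 3.0, 1.58, 0.9), "decision tree"),
    ((1000.0, 20.0, 2.0, 0.47, 3.2), "naive Bayes"),
]

rows = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (6.0, 9.0)]
labels = ["a", "a", "b", "b"]
features = meta_features(rows, labels)
print(features, recommend_model(features, knowledge_base))
```

The distance here is unscaled, so the number of examples dominates the comparison; a practical system would most likely normalize the meta-features first.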
182	Une approche pour l'évaluation des systèmes d'aide à la décision mobiles basés sur le processus d'extraction des connaissances à partir des données : application dans le domaine médical / An approach for the evaluation of mobile decision support systems based on a knowledge discovery from data process : application in the medical field Borcheni, Emna 27 March 2017 (has links) In this work, we are interested in Mobile Decision Support Systems (MDSS) that are based on the Knowledge Discovery from Data process (MDSS/KDD). Our work addresses not only the evaluation of these systems but also evaluation within the KDD process itself. The proposed approach attaches an evaluation support module to each software module composing the KDD process, based on quality models. These evaluation support modules make it possible to evaluate not only the quality in use of each module composing the KDD process but also other criteria that reflect the objectives of each KDD module. Our main goal is to help evaluators detect defects as early as possible in order to enhance the quality of all the modules that constitute an MDSS/KDD. We also present a context-based method that takes into account the change of context of use due to mobility. In addition, we propose an evaluation support system that monitors and measures all the proposed criteria. Furthermore, we present the implementation of the proposed approach, which mainly concerns the proposed evaluation tool, CEVASM (Context-based EVAluation support System for MDSS). Finally, the proposed approach is applied to the evaluation of the modules of an MDSS/KDD for the fight against nosocomial infections at Habib Bourguiba hospital in Sfax, Tunisia. For every module in the KDD process, we focus on the evaluation phase. We follow the evaluation process based on the ISO/IEC 25040 standard. The objective is to be able to validate, a priori, the evaluation tool developed (CEVASM) and, consequently, the proposed approach. Decision support system Knowledge discovery from data Evaluation Context of use Mobility
183	Získávání znalostí z marketingových dat / Knowledge discovery in marketing data Kazárová, Marie January 2020 (has links) Data mining techniques are used by companies to gain competitive advantages. In today's marketplace, they are also used by marketers, mainly for personalizing advertising and for maintaining long-term relationships with customers. Progress in knowledge discovery in databases and the availability of computational power bring not only positive impacts but also challenges. The practical part of the thesis explores and describes data mining techniques applied to an e-commerce dataset. The dataset consists of transaction and web analytics data. The experimental application aims to select the users most likely to react to a marketing communication and to identify the factors that influence them. The target segment of users is obtained through clustering. The classification model uses a decision tree algorithm to predict whether users will submit a transaction, with an accuracy of 75%. The results are useful for optimizing marketing and business strategy.
184	Metody extrakce informace z textových dokumentů / Methods for Information Extraction in Text Documents Sychra, Tomáš January 2008 (has links) Knowledge discovery in text documents is part of data mining. However, text documents have different properties than regular databases. This project contains an overview of methods for knowledge discovery in text documents. The most common task in this area is document classification. Various approaches to text classification are described. Finally, I present the Winnow algorithm, which should perform better than any other algorithm for classification. The work describes a Winnow implementation and gives an overview of experimental results.
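The abstract above does not say which Winnow variant the thesis implements. As a minimal sketch, assuming the commonly described Winnow2 update on binary features (promotion factor alpha = 2, threshold n/2, weights initialized to 1), the mistake-driven multiplicative update can be written as follows. The function names and the toy target concept are hypothetical and stand in for real document features.

```python
from itertools import product


def winnow_predict(weights, x, threshold):
    """Predict 1 if the summed weight of the active features exceeds the threshold."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) > threshold else 0


def winnow_train(samples, n_features, alpha=2.0, max_epochs=100):
    """Winnow2: multiply (promote) or divide (demote) the weights of active features on each mistake."""
    weights = [1.0] * n_features
    threshold = n_features / 2
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            if winnow_predict(weights, x, threshold) == y:
                continue
            mistakes += 1
            # False negative: promote active features; false positive: demote them.
            factor = alpha if y == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
        if mistakes == 0:
            break
    return weights, threshold


# Toy concept (an assumption): the label is 1 when feature 0 or feature 2 is present.
samples = [(x, int(x[0] or x[2])) for x in product([0, 1], repeat=4)]
weights, threshold = winnow_train(samples, n_features=4)
print(weights)
```

In a text classification setting, each feature would typically indicate whether a given term occurs in a document, so only the weights of terms present in a misclassified document are updated.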
185	Vytvoření nových klasifikačních modulů v systému pro dolování z dat na platformě NetBeans / Creation of New Clasification Units in Data Mining System on NetBeans Platform Kmoščák, Ondřej January 2009 (has links) This diploma thesis deals with data mining and the creation of a data mining unit for the data mining system being developed at FIT. The system is a client application consisting of a kernel, its graphical user interface, and independent mining modules. The application uses the support of Oracle Data Mining. The data mining system is implemented in Java, and its graphical user interface is built on the NetBeans platform. The work introduces knowledge discovery and then presents the chosen Bayesian classification method, for which a stand-alone data mining module is subsequently implemented. The implementation of this module is also described.
186	Dolovací moduly systému pro dolování z dat na platformě NetBeans / Mining Modules of Data Mining System on NetBeans Platform Henkl, Tomáš January 2009 (has links) The master's thesis deals with knowledge discovery in databases and with extending the data mining system in the Oracle environment developed at VUT FIT. The system kernel design incorporates an interface that enables data mining modules to be added. The objective of the thesis is to study this interface and to implement and embed a data mining module for decision-tree classification into the application. In addition, the thesis compares the application with a similar commercial product, SAS Enterprise Miner.
187	SUSTAINABILITY IMPLEMENTATION IN FASHION THROUGH KNOWLEDGE DISCOVERY: AN EXPLORATORY QUALITATIVE STUDY Robles, Julia 23 June 2023 (has links) No description available. Climate Change Design Environmental Management Information Systems Instructional Design Management Sustainability Systems Design
188	Data Mining with Newton's Method. Cloyd, James Dale 01 December 2002 (has links) (PDF) Capable and well-organized data mining algorithms are essential to successful knowledge discovery in databases. We discuss several data mining algorithms, including genetic algorithms (GAs). In addition, we propose a modified multivariate Newton's method (NM) approach to data mining of technical data. Several strategies are employed to stabilize Newton's method against pathological function behavior. NM is compared to GAs and to the simplex evolutionary operation algorithm (EVOP). We find that GAs, NM, and EVOP all perform efficiently for well-behaved global optimization functions, with NM providing an exponential improvement in convergence rate. For local optimization problems, we find that GAs and EVOP do not provide the desired convergence rate, accuracy, or precision compared to NM for technical data. We find that GAs are favored for their simplicity, while NM would be favored for its performance. evolutionary operations knowledge discovery in databases neural networks simplex evop maximum likelihood estimation non-linear regression robust regression Genetic algorithms Computer Sciences Physical Sciences and Mathematics
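The abstract above does not state which modifications the thesis applies to stabilize Newton's method. Purely as an illustration, one common stabilization is damping: the Newton step is shortened by backtracking until the objective stops increasing, and a plain gradient step is used when the curvature is not positive. The sketch below is one-dimensional for brevity, whereas the thesis treats the multivariate case. The objective exp(x) - 2x and all names are assumptions.

```python
import math


def damped_newton(f, grad, hess, x0, tol=1e-10, max_iter=50):
    """Minimize a 1-D function with Newton steps shortened by backtracking."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        h = hess(x)
        # Fall back to a gradient step when the curvature is not positive.
        step = -g / h if h > 0 else -g
        t = 1.0
        # Halve the step while it would increase the objective.
        while f(x + t * step) > f(x) and t > 1e-8:
            t *= 0.5
        x += t * step
    return x


def objective(x):
    return math.exp(x) - 2.0 * x


def objective_grad(x):
    return math.exp(x) - 2.0


def objective_hess(x):
    return math.exp(x)


# The minimizer of exp(x) - 2x is x = ln 2.
x_min = damped_newton(objective, objective_grad, objective_hess, x0=0.0)
print(x_min, math.log(2))
```

Near the minimizer the full step is always accepted, so the damping only matters far from the solution, where an undamped Newton step can overshoot.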
189	RECOMMENDATION SYSTEMS IN SOCIAL NETWORKS Behafarid Mohammad Jafari (15348268) 18 May 2023 (has links) The dramatic improvement in information and communication technology (ICT) has driven an evolution in learning management systems (LMS). The rapid growth in LMSs has caused users to demand more advanced, automated, and intelligent services. CourseNetworking is a next-generation LMS that adopts machine learning to add personalization, gamification, and more dynamics to the system. This work proposes two recommender systems that can help improve CourseNetworking services. The first is a social recommender system that helps CourseNetworking track user interests and give more relevant recommendations. Recently, graph neural network (GNN) techniques have been employed in social recommender systems due to their high success in graph representation learning, including on social network graphs. Despite the rapid advances in recommender system performance, dealing with the dynamic property of social network data remains one of the key challenges to be addressed. In this research, a novel method is presented that provides social recommendations by incorporating the dynamic property of social network data in a heterogeneous graph, supplementing the graph with time span nodes that are used to define users' long-term and short-term preferences over time. The second service proposed for addition to the Rumi services is a hashtag recommendation system that helps users label their posts quickly, improving the searchability of content. In recent years, several hashtag recommendation methods have been proposed and developed to speed up the processing of texts and quickly identify the critical phrases. The methods use different approaches and techniques to obtain critical information from a large amount of data. This work investigates the efficiency of unsupervised keyword extraction methods for hashtag recommendation and recommends the one with the best performance for use in a hashtag recommender system. Knowledge representation and reasoning Natural language processing Data mining and knowledge discovery Graph, social and multimedia data Recommender systems Recommender Systems Graph neural networks (GNNs) Natural Language Processing (NLP) Machine Learning
190	Models and Representation Learning Mechanisms for Graph Data Susheel Suresh (14228138) 15 December 2022 (has links) Graph representation learning (GRL) has been increasingly used to model and understand data from a wide variety of complex systems spanning social, technological, bio-chemical and physical domains. GRL consists of two main components: (1) a parametrized encoder that provides representations of graph data and (2) a learning process to train the encoder parameters. Designing flexible encoders that capture the underlying invariances and characteristics of graph data is crucial to the success of GRL. On the other hand, the learning process drives the quality of the encoder representations, and developing principled learning mechanisms is vital for a number of growing applications in self-supervised, transfer and federated learning settings. To this end, we propose a suite of models and learning algorithms for GRL, which form the two main thrusts of this dissertation.
In Thrust I, we propose two novel encoders that build upon a widely popular GRL encoder class called graph neural networks (GNNs). First, we empirically study the prediction performance of current GNN-based encoders when applied to graphs with heterogeneous node mixing patterns, using our proposed notion of local assortativity. We find that GNN performance in node prediction tasks strongly correlates with our local assortativity metric, thereby introducing a limit. We propose to transform the input graph into a computation graph with proximity and structural information as distinct types of edges. We then propose a novel GNN-based encoder that operates on this computation graph and adaptively chooses between structure and proximity information. Empirically, adopting our transformation and encoder framework leads to improved node classification performance compared to baselines in real-world graphs that exhibit diverse mixing.
Secondly, we study the trade-off between expressivity and efficiency of GNNs when applied to temporal graphs for the task of link ranking. We develop an encoder that incorporates a labeling approach designed to allow efficient inference over the candidate set jointly, while provably boosting expressivity. We also propose to optimize a list-wise loss for improved ranking. With extensive evaluation on real-world temporal graphs, we demonstrate its improved performance and efficiency compared to baselines.
In Thrust II, we propose two principled encoder learning mechanisms for challenging and realistic graph data settings. First, we consider a scenario where only limited or even no labelled data is available for GRL. Recent research has converged on graph contrastive learning (GCL), where GNNs are trained to maximize the correspondence between representations of the same graph in its different augmented forms. However, we find that GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. We then propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing the adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing it with state-of-the-art GCL methods and achieve performance gains in semi-supervised, unsupervised and transfer learning settings using benchmark chemical and biological molecule datasets.
Secondly, we consider a scenario where graph data is siloed across clients for GRL. We focus on two unique challenges encountered when applying distributed training to GRL: (i) client task heterogeneity and (ii) label scarcity. We propose a novel learning framework called federated self-supervised graph learning (FedSGL), which first uses a self-supervised objective to train GNNs in a federated fashion across clients; each client then fine-tunes the obtained GNNs based on its local task and available labels. Our framework enables the federated GNN model to extract patterns from the common feature (attribute and graph topology) space without the need for labels and without being biased by heterogeneous local tasks. An extensive empirical study of FedSGL on both node and graph classification tasks yields insights into how the level of feature / task heterogeneity, the adopted federated algorithm and the level of label scarcity affect the clients' performance in their tasks. Data mining and knowledge discovery Graph, social and multimedia data Deep learning Neural networks Semi- and unsupervised learning Graph Neural Networks (GNNs) Deep Learning Self Supervised Learning Federated Learning frameworks
