  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Domain-concept mining: an efficient on-demand data mining approach

Mahamaneerat, Wannapa Kay, Shyu, Chi-Ren. January 2008
Title from PDF of title page (University of Missouri--Columbia, viewed on February 24, 2010). The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. Dissertation advisor: Dr. Chi-Ren Shyu. Vita. Includes bibliographical references.
32

Recommending new items to customers : A comparison between Collaborative Filtering and Association Rule Mining / Rekommendera nya produkter till kunder : En jämförelsestudie mellan Collaborative Filtering och Association Rule Mining

Sohlberg, Henrik January 2015
E-commerce is an ever-growing industry as the internet infrastructure continues to evolve. A recommendation system offers an online retail store several benefits. It can help customers find what they need and increase sales by enabling accurately targeted promotions. Among the many techniques that can form recommendation systems, this thesis compares Collaborative Filtering against Association Rule Mining, both implemented in combination with clustering. The suggested implementations are designed with the cold start problem in mind and are evaluated with a data set from an online retail store that sells clothing. The results indicate that Collaborative Filtering is the preferable technique, while association rules may still offer business value to stakeholders. However, the strength of the results is undermined by the fact that only a single data set was used. / E-commerce is a growing market as the Internet develops and the number of users constantly increases. E-retailers can benefit from recommendation systems in several ways. While such a system can help customers find what they are looking for, it can also form the basis for targeted campaigns, which can increase sales. Recommendation systems can be built on many different techniques. This thesis focuses on two of them, Collaborative Filtering and Association Rule Mining, and compares them with each other. Both methods were combined with clustering and designed to address the cold start problem. The two proposed implementations were then tested on a real data set from an online store that sells clothing. The results suggest that Collaborative Filtering is the superior technique, while association rules still offer value. However, drawing general conclusions is made harder by the fact that only a single data set was used.
33

Understanding usage of Volvo trucks

Dahl, Oskar, Johansson, Fredrik January 2019
Trucks are designed, configured and marketed for various working environments. There is a concern whether trucks are used as intended by the manufacturer, as usage may impact the longevity, efficiency and productivity of the trucks. In this thesis we propose a framework, divided into two separate parts, that aims to extract customers’ driving behaviours from Logged Vehicle Data (LVD) in order to (a) evaluate whether they align with so-called Global Transport Application (GTA) parameters and (b) evaluate the usage in terms of performance. A Gaussian mixture model (GMM) is employed to cluster and classify various driving behaviours. Association rule mining was applied to the categorized clusters to validate that the usage follows the GTA configuration. Furthermore, the Correlation Coefficient (CC) was used to find linear relationships between usage and performance in terms of Fuel Consumption (FC). It is found that the vast majority of the trucks seemingly follow GTA parameters and are thus used as marketed. Likewise, the fuel economy was found to be linearly dependent on drivers’ various performances. The LVD lacks detail, such as Global Positioning System (GPS) information, needed to capture the usage in such a way that more definitive conclusions can be drawn. / This thesis was later written up as a scientific paper and submitted to the ICIMP 2020 conference. The publication was accepted on 23 September 2019 and will be presented in January 2020.
34

A Bicluster-based Rule Mining Framework for the Identification of Disease-causal Gene Variants

Bhatnagar, Surbhi January 2021
No description available.
35

Ontology-Based Semantic Web Mining Challenges : A Literature Review

March, Christopher January 2023
The Semantic Web is an extension of the current web that provides a standard structure for data representation and reasoning, allowing content to be readable for both humans and machines in a form known as ontological knowledge bases. The goal of the Semantic Web is to be used in large-scale technologies or systems such as search engines, healthcare systems, and social media platforms. Some challenges may deter further progress in the development of the Semantic Web and the associated web mining processes. In this review paper, an overview of Semantic Web mining will examine and analyze challenges with data integration, dynamic knowledge-based methods, efficiencies, and data mining algorithms regarding ontological approaches. Then, a review of recent solutions to these challenges, such as clustering, classification, association rule mining, and ontological building aids that overcome the challenges, will be discussed and analyzed.
36

Novel Algorithms for Cross-Ontology Multi-Level Data Mining

Manda, Prashanti 15 December 2012
The widespread use of ontologies in many scientific areas creates a wealth of ontology-annotated data and necessitates the development of ontology-based data mining algorithms. We have developed generalization and mining algorithms for discovering cross-ontology relationships via ontology-based data mining. We present new interestingness measures to evaluate the discovered cross-ontology relationships. The methods presented in this dissertation employ generalization as an ontology traversal technique for the discovery of interesting and informative relationships at multiple levels of abstraction between concepts from different ontologies. The generalization algorithms combine ontological annotations with the structure and semantics of the ontologies themselves to discover interesting cross-ontology relationships. The first algorithm uses the depth of ontological concepts as a guide for generalization. The ontology annotations are translated to higher levels of abstraction one level at a time, accompanied by incremental association rule mining. The second algorithm conducts a generalization of ontology terms to all their ancestors via transitive ontology relations and then mines cross-ontology multi-level association rules from the generalized transactions. Our interestingness measures use implicit knowledge conveyed by the relation semantics of the ontologies to capture the usefulness of cross-ontology relationships. We describe the use of information-theoretic metrics to capture the interestingness of cross-ontology relationships and the specificity of ontology terms with respect to an annotation dataset. Our generalization and data mining algorithms are applied to the Gene Ontology and the postnatal Mouse Anatomy Ontology. The results presented in this work demonstrate that our generalization algorithms and interestingness measures discover more interesting and better-quality relationships than approaches that do not use generalization.
Our algorithms can be used by researchers and ontology developers to discover inter-ontology connections. Additionally, the cross-ontology relationships discovered using our algorithms can be used by researchers to understand different aspects of entities that interest them.
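As a rough illustration of the second algorithm described in this abstract (generalizing each annotated term to all of its ancestors before mining), the following minimal sketch may help. It is not the dissertation's implementation: the toy ontology, the term names and the functions `ancestors` and `generalize` are hypothetical, and the subsequent rule-mining step is left out.

```python
# Hypothetical toy ontology: each term maps to its direct parent terms.
# Terms prefixed "g_" and "m_" stand in for two different ontologies.
PARENTS = {
    "g_leaf": {"g_mid"},
    "g_mid": {"g_root"},
    "g_root": set(),
    "m_leaf": {"m_root"},
    "m_root": set(),
}


def ancestors(term, parents):
    """Return every ancestor of `term` reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def generalize(transaction, parents):
    """Extend a transaction with all ancestors of the terms it contains."""
    generalized = set(transaction)
    for term in transaction:
        generalized |= ancestors(term, parents)
    return generalized


# One annotated entity carrying a term from each ontology.
generalized = generalize({"g_leaf", "m_leaf"}, PARENTS)
```

The generalized transactions could then be passed to any association rule miner, so that rules can relate terms at different levels of abstraction across the two ontologies.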
37

SQL Implementation of Value Reduction with Multiset Decision Tables

Chen, Chen 16 May 2014
No description available.
38

Automating debugging through data mining / Automatisering av felsökning genom data mining

Thun, Julia, Kadouri, Rebin January 2017
Contemporary technological systems generate massive quantities of log messages. These messages can be stored, searched and visualized efficiently using log management and analysis tools. The analysis of log messages offers insights into system behavior such as performance, server status and execution faults in web applications. iStone AB wants to explore the possibility of automating its debugging process. Since iStone does most of its debugging manually, it takes time to find errors within the system. The aim was therefore to find solutions that reduce the time it takes to debug. An analysis of log messages within access and console logs was performed so that the most appropriate data mining techniques for iStone’s system could be chosen. Data mining algorithms and log management and analysis tools were compared. The comparisons showed that the ELK Stack, as well as a mixture of Eclat and a hybrid algorithm (Eclat and Apriori), were the most appropriate choices. To demonstrate their feasibility, the ELK Stack and Eclat were implemented. The results show that data mining and the use of a platform for log analysis can facilitate debugging and reduce the time it takes. / Today's systems generate large volumes of log messages. These messages can be stored, searched and visualized efficiently using log management tools. Analysis of log messages gives insight into system behavior such as performance, server status and execution faults that can occur in web applications. iStone AB wants to investigate the possibility of automating debugging. Because iStone performs most of its debugging manually, finding faults in the system takes time. The aim was therefore to find solutions that reduce the time debugging takes. An analysis of log messages in access and console logs was carried out in order to select the most suitable data mining techniques for iStone’s system.
Data mining algorithms and log management tools were compared. The comparisons showed that the ELK Stack and a mixture of Eclat and a hybrid algorithm (Eclat and Apriori) were the most suitable choices. To demonstrate this, the ELK Stack and Eclat were implemented. The results show that data mining and the use of a log analysis platform can facilitate debugging and reduce the time it takes.
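For readers unfamiliar with Eclat, the following is a minimal sketch of the general technique: items are stored in a vertical layout (item to set of transaction ids) and itemsets are grown depth-first by intersecting those sets. It is not the thesis's implementation; the function name `eclat` and the toy log tokens are assumptions made for illustration.

```python
def eclat(transactions, min_support):
    """Return frequent itemsets (as sorted tuples) with their support counts."""
    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # `candidates` holds (item, tidset) pairs already known to be frequent
        # together with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # support of itemset + other
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Hypothetical log lines, each reduced to a set of tokens.
logs = [
    {"timeout", "http_500", "db"},
    {"timeout", "http_500"},
    {"http_500", "db"},
    {"timeout", "db", "http_500"},
]
result = eclat(logs, min_support=3)
```

With a minimum support of 3, the sketch reports the three single tokens plus the pairs that combine `http_500` with `db` and with `timeout`; the pair `db` and `timeout` appears in only two log lines and is pruned.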
39

Improving RDF data with data mining

Abedjan, Ziawasch January 2014
Linked Open Data (LOD) comprises a great many, often large, public data sets and knowledge bases. Those datasets are mostly presented in the RDF triple structure of subject, predicate, and object, where each triple represents a statement or fact. Unfortunately, the heterogeneity of available open data requires significant integration steps before it can be used in applications. Meta information, such as ontological definitions and exact range definitions of predicates, is desirable and ideally provided by an ontology. However, in the context of LOD, ontologies are often incomplete or simply not available. Thus, it is useful to automatically generate meta information, such as ontological dependencies, range definitions, and topical classifications. Association rule mining, which was originally applied for sales analysis on transactional databases, is a promising and novel technique to explore such data. We designed an adaptation of this technique for mining RDF data and introduce the concept of “mining configurations”, which allows us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that in combination result in interesting use cases. To this end, we present rule-based approaches for auto-completion, data enrichment, ontology improvement, and query relaxation. Auto-completion remedies the problem of inconsistent ontology usage, providing an editing user with a sorted list of commonly used predicates. A combination of different configurations extends this approach to create completely new facts for a knowledge base. We present two approaches for fact generation: a user-based approach, where a user selects the entity to be amended with new facts, and a data-driven approach, where an algorithm discovers entities that have to be amended with missing facts. As knowledge bases constantly grow and evolve, another approach to improve the usage of RDF data is to improve existing ontologies.
Here, we present an association-rule-based approach to reconcile ontology and data. Interlacing different mining configurations, we derive an algorithm to discover synonymously used predicates. Those predicates can be used to expand query results and to support users during query formulation. We provide a wide range of experiments on real-world datasets for each use case. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the appropriateness of our mining configuration methodology. / Linked Open Data (LOD) comprises many, often very large, public data sets and knowledge bases, which mainly appear in the RDF triple structure consisting of subject, predicate, and object. Each triple represents a fact. Unfortunately, the heterogeneity of the available public data requires significant integration steps before the data can be used in applications. Metadata such as ontological structures and range definitions of predicates are desirable and ideally available through a knowledge base. However, knowledge bases in the context of LOD are often incomplete or simply not available. It is therefore useful to be able to generate meta information automatically, such as ontological dependencies, range and domain definitions, and topical associations of resources. A new and promising technique for exploring such data is based on the discovery of association rules, which was originally applied to sales analysis in transactional databases. We designed an adaptation of this technique to RDF data and introduce the concept of mining configurations, which enables us to detect patterns in RDF data in different ways. Different configurations allow us to identify schema and value relationships that can be used for interesting applications.
To this end, we present association-based methods for predicate suggestion, data completion, ontology improvement, and query relaxation. Predicate suggestion addresses the problem of inconsistent ontology usage by proposing a sorted list of suitable predicates to a user who wants to add a new fact to an RDF data set. A combination of different configurations extends this method so that completely new facts are generated automatically for a knowledge base. Here we present two approaches: a user-driven approach, in which a user selects the entity to be extended, and a data-driven approach, in which an algorithm itself selects the entities to be extended with missing facts. Since knowledge bases constantly grow and change, another way to ease the use of RDF data is to improve ontologies. Here we present an association-rule-based method that reconciles data with the underlying ontologies. By interlacing different configurations, we derive a new algorithm that discovers synonymous predicates. These predicates can be used to expand the results of a query or to support a user during query formulation. For each of the applications presented, we provide a wide range of experiments on real-world data sets. The experiments and evaluations show the added value of association rule generation for the integration and usability of RDF data and confirm the appropriateness of our configuration-based methodology for deriving such rules.
40

Association Rule Based Classification

Palanisamy, Senthil Kumar 03 May 2006
In this thesis, we focused on the construction of classification models based on association rules. Although association rules have been predominantly used for data exploration and description, the interest in using them for prediction has rapidly increased in the data mining community. In order to mine only rules that can be used for classification, we modified the well-known association rule mining algorithm Apriori to handle user-defined input constraints. We considered constraints that require the presence/absence of particular items, or that limit the number of items, in the antecedents and/or the consequents of the rules. We developed a characterization of those itemsets that will potentially form rules that satisfy the given constraints. This characterization allows us to prune, during itemset construction, those itemsets for which neither they nor any of their supersets can form valid rules. This improves the time performance of itemset construction. Using this characterization, we implemented a classification system based on association rules and compared the performance of several model construction methods, including CBA, and several model deployment modes to make predictions. Although the data mining community has dealt only with the classification of single-valued attributes, there are several domains in which the classification target is set-valued. Hence, we enhanced our classification system with a novel approach to handle the prediction of set-valued class attributes. Since the traditional classification accuracy measure is inappropriate in this context, we developed an evaluation method for set-valued classification based on the E-Measure. Furthermore, we enhanced our algorithm by not relying on the typical support/confidence framework, and instead mining for the best possible rules above a user-defined minimum confidence and within a desired range for the number of rules.
This avoids long mining times that might produce large collections of rules with low predictive power. For this purpose, we developed a heuristic function to determine an initial minimum support and then adjusted it using a binary search strategy until a number of rules within the given range was obtained. We implemented all of the techniques described above in WEKA, an open-source suite of machine learning algorithms. We used several datasets from the UCI Machine Learning Repository to test and evaluate our techniques.
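The binary search over minimum support mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's WEKA code: `tune_min_support` and `count_rules` are hypothetical names, the initial support is passed in directly rather than computed by the thesis's heuristic function, and the miner is replaced by a toy function that counts rules at or above a support threshold (the minimum confidence filter is assumed to be folded into it).

```python
def tune_min_support(count_rules, min_rules, max_rules, initial,
                     low=0.0, high=1.0, max_iter=30):
    """Binary-search a minimum support that yields min_rules..max_rules rules.

    `count_rules(support)` is assumed to be non-increasing in `support`.
    """
    support = initial
    for _ in range(max_iter):
        n = count_rules(support)
        if min_rules <= n <= max_rules:
            return support, n
        if n > max_rules:
            low = support    # too many rules: raise the threshold
        else:
            high = support   # too few rules: lower the threshold
        support = (low + high) / 2
    return support, count_rules(support)


# Toy stand-in for a rule miner: the supports of eight hypothetical rules.
RULE_SUPPORTS = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6]


def count_rules(support):
    """Count the hypothetical rules whose support meets the threshold."""
    return sum(1 for s in RULE_SUPPORTS if s >= support)


support, n_rules = tune_min_support(count_rules, min_rules=3, max_rules=4,
                                    initial=0.1)
```

Each probe of `count_rules` stands for one mining run, so halving the search interval keeps the number of runs small compared with stepping the threshold linearly.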
