1 |
A High Growth-Rate Emerging Pattern for Data Classification in Microarray Datasets
Yang, Tsung-Bin 13 July 2007
Data classification is one of the important techniques in data mining. It has
been applied widely in many applications, e.g., disease diagnosis. Recently, data
classification has been used for microarray datasets, where a microarray
is a tool for studying gene expression levels in bioinformatics. For the
data classification problem on microarray datasets, we consider two biological
datasets which reflect two extremely different classes for the same given sets of tests.
Basically, the classification process contains two phases: (1) the training phase, and
(2) the testing phase. The purpose of the training phase is to find the representative
Emerging Patterns (EPs) in each of these two datasets, where an EP is an itemset
which satisfies some conditions on the growth rate from one dataset to another dataset.
Note that the growth rate represents the differences between these two datasets. After
the training phase, we take the collections of EPs in each dataset as a classifier. A
test sample in the testing phase is assigned to one of the two datasets based on
the result of a similarity function, which takes the growth rate and the support into
consideration. The evaluation criterion of a classifier is its accuracy: the
higher the accuracy of a classifier, the better its performance. Several
EP-based classifiers, e.g., the EJEP and the NEP strategies, have been proposed to
achieve this goal. The EJEP strategy considers only those itemsets whose growth
rates are infinite, since it claims that high growth rates may result in high
accuracy. However, the EJEP strategy does not keep those useful EPs whose growth
rates are very high but not infinite. Moreover, real-world data always
contains noise. The NEP strategy takes noise into account and provides higher accuracy
than the EJEP strategy. However, it may still miss some itemsets with high growth
rates, which may result in low accuracy. Therefore, in this thesis, we propose
a High Growth-rate EP (HGEP) strategy to overcome the disadvantages of the NEP
and the EJEP strategies. In addition to considering itemsets whose growth rates
are infinite, as in the EJEP strategy, and noise patterns, as in the NEP strategy, our HGEP
strategy considers those itemsets whose growth rate is higher than that of all their proper
subsets when the growth rates are finite. In this way, the itemsets with high growth
rates can result in high similarity, and the high similarity assigns the sets of tests
to the correct class. Therefore, our HGEP strategy can provide high accuracy. In our
performance study, we use several real datasets to evaluate the average accuracy
of the strategies. Moreover, we also carry out a simulation study with increasing noise. From the
experimental results, we show that the average accuracy of our HGEP strategy is
higher than that of the NEP strategy.
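
The abstract does not spell out how the growth rate is computed, so the following is only a minimal sketch under stated assumptions. It uses the definition commonly given in the emerging-pattern literature (ratio of supports, with 0 and infinity as boundary cases) and reads the finite-growth-rate condition above as "strictly higher than every non-empty proper subset". The function names and the toy transactions are hypothetical and are not taken from the thesis.

```python
from itertools import combinations
from math import inf


def support(itemset, dataset):
    """Fraction of transactions in `dataset` that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    return sum(1 for t in dataset if itemset <= t) / len(dataset)


def growth_rate(itemset, background, target):
    """Growth rate of `itemset` from `background` to `target` (assumed definition)."""
    s_bg = support(itemset, background)
    s_tg = support(itemset, target)
    if s_tg == 0:
        return 0.0   # absent from the target: not emerging at all
    if s_bg == 0:
        return inf   # present only in the target: infinite growth rate
    return s_tg / s_bg


def beats_all_proper_subsets(itemset, background, target):
    """True if the growth rate is strictly higher than that of every
    non-empty proper subset (one reading of the condition in the abstract)."""
    gr = growth_rate(itemset, background, target)
    items = sorted(itemset)
    for k in range(1, len(items)):
        for sub in combinations(items, k):
            if growth_rate(frozenset(sub), background, target) >= gr:
                return False
    return True


# Hypothetical toy data: each transaction is a set of single-letter items.
background = [frozenset("ab"), frozenset("a"), frozenset("b"), frozenset("c")]
target = [frozenset("ab"), frozenset("ab"), frozenset("ac"), frozenset("b")]
```

On this toy data, the itemset {a, b} has growth rate 2.0 while {a} and {b} each have 1.5, so {a, b} passes the subset test; {a, c} never occurs in the background data and therefore has an infinite growth rate.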
|
2 |
Efficient mining of interesting emerging patterns and their effective use in classification
Fan, Hongjian Unknown Date
Knowledge Discovery in Databases (KDD), or Data Mining, is used to discover interesting or useful patterns and relationships in data, with an emphasis on large volumes of observational data. Among the many types of information (knowledge) that can be discovered in data, patterns expressed in terms of features are popular because they can be understood and used directly by people. The recently proposed Emerging Pattern (EP) is one such type of knowledge pattern. Emerging Patterns are sets of items (conjunctions of attribute values) whose frequency changes significantly from one dataset to another. They are useful as a means of discovering distinctions inherently present amongst a collection of datasets and have been shown to be a powerful method for constructing accurate classifiers. (For complete abstract open document)
|
3 |
OCLEP+: One-Class Intrusion Detection Using Length of Patterns
Pentukar, Sai Kiran 06 June 2017
No description available.
|
4 |
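Μελέτη, σχεδίαση και ανάπτυξη συστήματος για την παροχή υπηρεσιών φροντίδας σε χρόνιες παθήσεις, με την ενσωμάτωση αναγνώρισης της φυσικής δραστηριότητας και τη χρήση τεχνολογιών τηλεματικής / Study, design and development of a system for providing care services for chronic diseases, incorporating physical activity recognition and the use of telematics technologies
Κουρής, Ιωάννης 09 July 2013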
The present PhD thesis examines the potential of smartphones for offering health care services to patients with chronic diseases by means of telematics technologies. A wearable wireless sensor network was designed and developed to record body movement and biosignal data. Physical activity recognition techniques are applied to the recorded data in order to extract, in real time, the activities actually performed. In contrast to the research carried out to date, an extensive comparison between different pattern recognition techniques, including techniques not used before, is performed using all the recorded data as well as a reduced subset of them.
Furthermore, the recognized physical activities are combined with environmental data in order to study the daily activity patterns of healthy persons and persons with chronic diseases. By searching for Emerging Patterns in the stored data, the thesis examines the degree of patient compliance with medical advice, as well as the possibility of predicting short- and long-term complications of chronic diseases.
|
5 |
Extraction et sélection de motifs émergents minimaux : application à la chémoinformatique / Extraction and selection of minimal emerging patterns: application to chemoinformatics
Kane, Mouhamadou Bamba 06 September 2017
Pattern discovery is an important field of Knowledge Discovery in Databases. This thesis deals with the extraction of minimal emerging patterns. We propose a new, efficient method that extracts minimal emerging patterns with or without a support constraint, unlike existing methods, which typically extract the most supported minimal emerging patterns at the risk of missing interesting patterns that have little support in the data.
Moreover, our method takes into account the absence of an attribute, which brings new and interesting knowledge. Considering the rules associated with highly supported emerging patterns as prototype rules, we have shown experimentally that this set of rules has good confidence on the covered objects but unfortunately does not cover a significant part of the objects, which is a disadvantage for their use in classification. We propose a prototype-based selection method that improves the coverage of the set of prototype rules without a significant loss of confidence. In view of the encouraging results obtained, we apply our prototype-based selection method to a chemical dataset relating to the aquatic environment: Aquatox. In a classification context, it allows chemists to better explain the classification of molecules which, without this selection method, would be predicted by a default rule.
|
6 |
Leveraging formal concept analysis and pattern mining for moving object trajectory analysis / Exploitation de l'analyse formelle de concepts et de l'extraction de motifs pour l'analyse de trajectoires d'objets mobiles
Almuhisen, Feda 10 December 2018
This dissertation presents a trajectory analysis framework that includes both a preprocessing phase and a process for mining the trajectories of moving objects. Furthermore, the framework offers visual functions that reflect the evolution behavior of trajectory patterns.
The originality of the mining process is to combine frequent pattern mining, emerging pattern mining and formal concept analysis to analyze the trajectories of moving objects. These methods detect and characterize pattern evolution behaviors over time in trajectory data. Three contributions are proposed: (1) a method for analyzing trajectories based on frequent formal concepts, used to detect the different evolution behaviors of trajectory patterns over time. These behaviors are "latent", "emerging", "decreasing", "lost" and "jumping". They characterize the dynamics of mobility in relation to urban space and time. The detected behaviors are visualized on automatically generated maps at different spatio-temporal levels to refine the analysis of mobility in a given area of the city; (2) a second trajectory analysis method, based on sequential concept lattice extraction, which exploits the direction of movement in the evolution detection process; and (3) a prediction method based on Markov chains, which predicts the evolution behavior of a region in the next period. These three methods are evaluated on two real-world datasets. The experimental results obtained on these data show the relevance of the proposal and the utility of the generated maps.
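
The abstract gives no details of the Markov-chain predictor, so the following is only an illustrative sketch under stated assumptions: a first-order chain whose transition probabilities are estimated by simple counting over the past behaviors of one region. The five state names come from the abstract; the function names and the example history are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(history):
    """Estimate first-order transition probabilities by counting
    consecutive pairs of behaviors in `history`."""
    counts = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(cnt.values()) for nxt, c in cnt.items()}
        for state, cnt in counts.items()
    }


def predict_next(history, transitions):
    """Most probable behavior for the next period, or None if the
    last observed behavior was never seen with a successor."""
    probs = transitions.get(history[-1])
    if not probs:
        return None
    return max(probs, key=probs.get)


# Hypothetical sequence of evolution behaviors observed for one region.
history = ["latent", "emerging", "emerging", "decreasing", "emerging",
           "emerging", "lost", "latent", "emerging"]
transitions = fit_transitions(history)
prediction = predict_next(history, transitions)
```

In this made-up history, "emerging" is followed by "emerging" in two of four cases, so the sketch predicts "emerging" for the next period. A real system would presumably estimate the chain from many regions and periods rather than from a single short sequence.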
|