  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Evaluating Spatiotemporal Patterns in US Tornado Occurrence with Space Time Pattern Mining: 1950-2019 and 1980-2019

Wiser, Darrell, Luffman, I. E. 06 April 2022
This research assesses shifts in tornado occurrence patterns in space and time using continental United States tornado records with an Enhanced Fujita (EF) rating equal to or greater than 1. In similar research, most researchers discard tornado records prior to 1980 due to factors including magnitude anomalies related to the development of the Fujita Scale, variability in tornado reporting (population growth, storm spotters, and technological improvements), and better data records from the Census Bureau. We therefore constructed two datasets using tornadoes recorded in the National Weather Service Storm Prediction Center’s Severe Weather GIS (SVRGIS) database: 1950-2019 (dataset 1) and 1980-2019 (dataset 2). The goals of this study were to 1) determine whether spatiotemporal patterns of recorded tornado activity have shifted over time, and 2) determine whether inclusion of pre-1980 tornado data changes the findings from 1). This study employed Space-Time Pattern Mining (STPM) to construct four space-time cubes (STC) in ArcGIS Pro. Emerging Hot Spot Analysis (EHS) was employed to identify changes in tornado occurrence (number of incidents in an STC cell) and magnitude (sum of tornado EF ratings for all incidents in an STC cell). EHS showed increased tornado activity in the Southeast and decreased activity in areas of the Great Plains for both occurrence and magnitude in both datasets. This is interpreted as significant intensifying hot spots in the Southeast region and diminishing hot spots in the Great Plains, indicating an east-south-east shift for both datasets. Similar findings for both datasets indicate that inclusion of the less reliable pre-1980 tornado data does not change the results, and we recommend that the practice of discarding pre-1980 tornado data in tornado occurrence research be reconsidered.
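The two cell-level quantities named in this abstract (occurrence as the number of incidents in a cell, magnitude as the sum of EF ratings in a cell) can be illustrated with a minimal sketch. This is plain Python, not the ArcGIS Pro tooling used in the study; the function name, the 1-degree cell size, the 10-year time step, and the three sample records are all assumptions made up for illustration.

```python
from collections import defaultdict

def build_space_time_cube(records, cell_deg=1.0, step_years=10, start_year=1950):
    """Aggregate (lat, lon, year, ef) records into space-time bins.

    Returns {(row, col, t): [occurrence, magnitude]} where occurrence is the
    number of tornadoes in the bin and magnitude is the sum of their EF ratings.
    Cell size and time step are hypothetical defaults, not the study's settings.
    """
    cube = defaultdict(lambda: [0, 0])
    for lat, lon, year, ef in records:
        key = (
            int(lat // cell_deg),               # spatial row index
            int(lon // cell_deg),               # spatial column index
            (year - start_year) // step_years,  # time-slice index
        )
        cube[key][0] += 1    # occurrence: one more incident in this bin
        cube[key][1] += ef   # magnitude: accumulate the EF rating
    return dict(cube)

# Made-up records (lat, lon, year, EF rating), for illustration only.
records = [
    (35.2, -97.4, 1953, 3),
    (35.7, -97.1, 1957, 1),
    (33.5, -86.8, 2011, 4),
]
cube = build_space_time_cube(records)
print(cube)
```

A hot spot analysis would then look at how the per-bin values change across the time index for each spatial location; that statistical step is not shown here.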
32

Algorithmic Approaches to Pattern Mining from Structured Data / 構造データからのパターン発見におけるアルゴリズム論的アプローチ

Otaki, Keisuke 23 March 2016
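The contents of Chapter 6 are based on work published in IPSJ Transactions on Mathematical Modeling and Its Applications, vol. 9(1), pp. 32-42, 2016. / Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Degree No. Kō 19846 / Informatics Doctorate No. 597 / 新制||情||104 (University Library) / 32882 / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Akihiro Yamamoto, Prof. Hisashi Kashima, Prof. Tatsuya Akutsu / Conferred under Article 4, Paragraph 1 of the Degree Regulations / DGAM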
33

Evaluating Spatial-Temporal Patterns in US Tornado Occurrence with Space Time Cube Analysis and Linear Kernel Density Estimation: 1950-2019

Wiser, Darrell L 01 August 2022
This research estimated the spatial-temporal patterns of tornadoes in the continental United States from 1950-2019 using the National Weather Service Storm Prediction Center’s Severe Weather GIS (SVRGIS) database. This study employed Space-Time Cube Analysis and Linear Kernel Density (Kernel Density Linear Process, KDLP), rather than the standard Kernel Density Estimation (KDE) approach, to evaluate whether tornado hotspot locations and intensities shift over time. The first phase of the study used KDLP in ArcGIS Pro and CrimeStat to map changes in tornado hotspots and to qualitatively assess shifts in hotspot locations and intensities between decades, by occurrence and magnitude. Next, an Emerging Hot Spot Analysis (EHSA) was employed to identify changes in tornado occurrence and magnitude. EHSA results identified, by both occurrence and magnitude, significant intensifying hot spots in the Southeast region and diminishing hot spots in the Great Plains, indicating an east-south-east shift.
34

Sequential Pattern Mining on Electronic Medical Records for Finding Optimal Clinical Pathways

Edman, Henrik January 2018
Electronic Medical Records (EMRs) are digital versions of paper charts, used to record the treatment of patients in hospitals. Clinical pathways are used as guidelines for how to treat different diseases, determined by observing outcomes from previous treatments. Sequential pattern mining is a form of data mining in which the mined data is organized in sequences. It is a common research topic in data mining, with new variations on existing algorithms introduced frequently. In a previous report, the sequential pattern mining algorithm PrefixSpan was used to mine patterns in EMRs in order to verify or suggest new clinical pathways. It was found to verify pathways only partially. One of the reasons stated for this was that PrefixSpan was too inefficient to mine at a minimum support low enough to consider some items. In this report CSpan is used instead, since it is supposed to outperform PrefixSpan by up to two orders of magnitude, in order to improve runtime and thereby address the problems mentioned in the previous work. The results show that CSpan did indeed improve the runtime and that the algorithm was able to mine at a lower minimum support. However, the output was only barely improved.
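The role of minimum support mentioned in this abstract can be illustrated with a small sketch. This is a naive support check over candidate patterns, not PrefixSpan or CSpan; the function names and the toy treatment sequences are hypothetical and exist only for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` appear in `sequence` in the same order
    (not necessarily adjacent)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)

def support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database) / len(database)

def frequent(candidates, database, min_support):
    """Keep only the candidate patterns whose support reaches the threshold."""
    return [p for p in candidates if support(p, database) >= min_support]

# Toy, made-up treatment sequences (one list per patient).
database = [
    ["admission", "blood_test", "surgery", "discharge"],
    ["admission", "surgery", "discharge"],
    ["admission", "blood_test", "discharge"],
    ["admission", "x_ray", "discharge"],
]
candidates = [
    ["admission", "surgery"],
    ["blood_test", "discharge"],
    ["x_ray", "discharge"],
]

print(frequent(candidates, database, min_support=0.5))
print(frequent(candidates, database, min_support=0.25))
```

In this toy data the pattern involving `x_ray` only survives at the lower threshold, which mirrors why mining at a lower minimum support lets rarer items be considered, at the price of more candidates to evaluate.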
35

Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets

Du, Xiaoxi 08 April 2009
No description available.
36

New techniques for efficiently discovering frequent patterns

Jin, Ruoming 01 August 2005
No description available.
37

Scalable mining on emerging architectures

Buehrer, Gregory T. 07 January 2008
No description available.
38

Sequential Pattern Mining: A Proposed Approach for Intrusion Detection Systems

Lefoane, Moemedi, Ghafir, Ibrahim, Kabir, Sohag, Awan, Irfan U. 19 December 2023
Technological advancements have played a pivotal role in the rapid proliferation of the fourth industrial revolution (4IR) through the deployment of Internet of Things (IoT) devices in large numbers. COVID-19 caused serious disruptions across many industries, with lockdowns and travel restrictions imposed across the globe. As a result, conducting business as usual became increasingly untenable, necessitating the adoption of new approaches in the workplace. For instance, virtual doctor consultations, remote learning, and virtual private network (VPN) connections for employees working from home became more prevalent. This paradigm shift has brought benefits; however, it has also increased the attack vectors and surfaces, creating lucrative opportunities for cyberattacks. Consequently, more sophisticated attacks have emerged, including Distributed Denial of Service (DDoS) and ransomware attacks, which pose a serious threat to businesses and organisations worldwide. This paper proposes a system for detecting malicious activities in network traffic using sequential pattern mining (SPM) techniques. The proposed approach uses SPM as an unsupervised learning technique to extract intrinsic communication patterns from network traffic, enabling the discovery of rules for detecting malicious activities and generating security alerts accordingly. By leveraging this approach, businesses and organisations can enhance the security of their networks, detect malicious activities including emerging ones, and thus respond proactively to potential threats.
39

Scalable Performance Assessment of Industrial Assets: A Data Mining Approach

Dagnely, Pierre 21 June 2019
Nowadays, more and more industrial assets are continuously monitored and generate vast amounts of event logs and sensor data. Data mining is the field concerned with the exploration and exploitation of these data. Although data mining has been researched for decades, event log data are still underexploited in most data mining workflows, even though they could provide valuable insights into asset behavior, as they represent the internal processes of an asset. However, exploiting event log data is challenging, mainly because: 1) event labels are not consistent across manufacturers; 2) assets report vast amounts of data, of which only a small part may be relevant; 3) textual event logs and numerical sensor data are usually processed by methods dedicated to either textual data or sensor data, and methods combining both types of data are still missing; 4) industrial data are rarely labelled, i.e. there is no indication of the actual performance of the asset, which has to be derived from other sources; 5) the meaning of an event may vary depending on the events sent before or after it. Concretely, this thesis is concerned with the design and validation of an integrated data processing framework for scalable performance assessment of industrial asset portfolios.
This framework is composed of several advanced methodologies facilitating the exploitation of both event logs and time series sensor data: 1) an ontology model describing the photovoltaic (the validation domain) event system, allowing the integration of heterogeneous events generated by various manufacturers; 2) a novel and computationally scalable methodology enabling automatic calculation of an event relevancy score without any prior knowledge; 3) a semantically enriched multi-level pattern mining methodology enabling data exploration and hypothesis building across heterogeneous assets; 4) an advanced workflow extracting performance profiles by combining textual event logs and numerical sensor values; 5) a scalable methodology allowing rapid annotation of new asset runs with a known performance label based only on the event log data. The framework has been exhaustively validated on real-world data from PV plants, provided by our industrial partner 3E. However, the framework has been designed to be domain agnostic and can be adapted to other industrial assets reporting event logs and sensor data. / Doctorate in Engineering Sciences and Technology / info:eu-repo/semantics/nonPublished
40

Extraction de chroniques discriminantes / Discriminant chronicle mining

Dauxais, Yann 13 April 2018
Data are recorded for a wide range of applications, and their analysis is a major challenge addressed by many studies. Among these applications, this thesis is motivated by the analysis of care pathway data to conduct pharmaco-epidemiological studies. Pharmaco-epidemiology is the study of the uses and effects of healthcare products in well-defined populations. The goal is therefore to automate this type of study through data analysis. Among data analysis methods, pattern mining approaches extract behavior descriptions, called patterns, that characterize the data. Patterns are often easily interpretable and give insights into hidden behaviors described by the data. In this thesis, we are interested in mining discriminant temporal patterns from temporal sequences, i.e. lists of timestamped events. Temporal patterns represent behaviors through their temporal dimension. Discriminant patterns are well suited to representing behaviors that occur specifically in a well-defined subset of the population. Surprisingly, although temporal patterns are essential to describe timestamped data and discriminant patterns are crucial to identify behaviors that differ from the mainstream, discriminant temporal patterns have received little attention so far. In this thesis, the discriminant chronicle model is proposed to address the lack of discriminant temporal pattern mining approaches. A chronicle is a temporal pattern representable as a graph whose nodes are events and whose edges are numerical temporal constraints. The chronicle model was chosen for its high expressiveness when dealing with temporal sequences and because, among discriminant pattern models, it alone describes the temporal dimension numerically.
The contribution of this thesis, centered on the discriminant chronicle model, is threefold: (i) a discriminant chronicle mining algorithm (DCM), (ii) a study of the interpretability of the discriminant chronicle model through its generalization, and (iii) the application of DCM to a pharmaco-epidemiology case study. DCM is an efficient algorithm dedicated to extracting discriminant chronicles and is based on the Ripperk numerical rule learning algorithm. Using Ripperk makes it possible to take advantage of its efficiency and of its incomplete heuristic, which avoids generating redundant patterns. The generalization of DCM allows Ripperk to be swapped for other machine learning algorithms. The extracted patterns are then no longer necessarily chronicles but a generalized form of chronicles. More expressive machine learning algorithms extract more expressive generalized chronicles but negatively impact their interpretability. The trade-off between this gain in expressiveness, evaluated by classification accuracy, and this loss of interpretability is compared for several types of generalized chronicles. The value of the discriminant chronicle model and the efficiency of DCM are validated on synthetic and real datasets in a pattern-based classification context. Finally, chronicles were extracted from a pharmaco-epidemiology dataset and presented to clinicians, who confirmed their value for describing discriminant epidemiological behaviors.
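The chronicle structure described in this abstract (events as nodes, numerical temporal constraints between them) can be illustrated with a minimal sketch. This is not the DCM algorithm and does not use Ripperk; it only checks by brute force whether one chronicle occurs in a timestamped sequence. The function names, the `{(i, j): (lo, hi)}` constraint encoding, and the event labels `A` and `B` are assumptions introduced for illustration.

```python
from itertools import product

def occurs(events, constraints, sequence):
    """Check whether a chronicle occurs in a timestamped sequence.

    events:      list of event labels (the nodes of the chronicle)
    constraints: {(i, j): (lo, hi)}, meaning lo <= time[j] - time[i] <= hi
    sequence:    list of (label, time) pairs
    Simplification: two chronicle events with the same label may be matched
    to the same occurrence in the sequence.
    """
    # Candidate timestamps for each chronicle event.
    candidates = [[t for label, t in sequence if label == ev] for ev in events]
    for times in product(*candidates):
        if all(lo <= times[j] - times[i] <= hi
               for (i, j), (lo, hi) in constraints.items()):
            return True
    return False

def occurrence_rate(events, constraints, sequences):
    """Fraction of sequences in which the chronicle occurs."""
    return sum(occurs(events, constraints, s) for s in sequences) / len(sequences)

# Hypothetical chronicle: event B happens 1 to 7 time units after event A.
events = ["A", "B"]
constraints = {(0, 1): (1, 7)}

# Made-up sequences for two populations.
group_1 = [[("A", 0), ("B", 3)], [("A", 2), ("C", 4), ("B", 8)]]
group_2 = [[("A", 0), ("B", 20)], [("B", 1), ("A", 5)]]

print(occurrence_rate(events, constraints, group_1))
print(occurrence_rate(events, constraints, group_2))
```

A chronicle whose occurrence rate is high in one population and low in the other is discriminant in the informal sense used above; how the thesis scores and searches for such chronicles is not reproduced here.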
