  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Climatology of warm season heat waves in Saudi Arabia: a time-sensitive approach

Alghamdi, Ali Saeed Arifi January 1900
Doctor of Philosophy / Department of Geography / John A. Harrington Jr / The climate of the Middle East is warming and extreme hot temperature events are becoming more common, as observed by the significant upward trends in mean and extreme temperatures during the last few decades. Climate modeling studies suggest that the frequency, intensity, and duration of extreme temperature events are expected to increase as the global and local climate continues to warm. Existing literature about heat waves (HWs) in Saudi Arabia provides information about HW duration using a single index, without considering the observed effects of climate change and the subtropical arid climate. With that in mind, this dissertation provides a series of three stand-alone papers evaluating temporal, geographic, and atmospheric aspects of the character of warm season (May-September) HWs in Saudi Arabia for 1985 to 2014. Chapter 2 examines the temporal behavior(s) of the frequency, duration, and intensity of HWs under the observed recent climate change. Several issues are addressed, including the identification of some improved methodological practices for HW indices. A time-sensitive approach to define and detect HWs is proposed and assessed. HW events and their duration are considered as count data; thus, different Poisson models were used for trend detection. Chapter 3 addresses the spatio-temporal patterns of the frequency and intensity of hot days and nights, and HWs. The chapter reemphasizes the importance of considering the ongoing effects of climate warming and applies a novel time-series clustering approach to recognize hot temperature event behavior through time and space. Chapter 4 explores the atmospheric circulation conditions that are associated with warm season HW event occurrence and how different HW aspects are related to different circulation types. Further, possible teleconnections between HWs and sea surface temperature (SST) anomalies of nearby large bodies of water are examined.
Results from Chapters 2 and 3 detected systematic upward trends in maximum and minimum temperatures at most of the 25 stations, suggesting an ongoing change in the climatology of the upper tail of the frequency distribution. The analysis demonstrated the value of using a time-sensitive approach in studying extreme thermal events. Different patterns were observed over time and space not only across stations but also among extreme temperature events (i.e., hot days and nights, and HWs). The overall results suggest that not only local and regional factors, such as elevation, latitude, land cover, atmospheric humidity, and distance from a large body of water, but also large-scale factors such as atmospheric circulation patterns are responsible for the observed temporal and spatial patterns. Chapter 4 confirmed that the Indian Summer Monsoon Trough and the Arabian heat low were key atmospheric features related to HW days. SST anomalies seemed to be a more important factor for HW intensity. Extreme thermal events in Saudi Arabia tended to occur during regional warming due to atmospheric circulation conditions and SST teleconnections. This study documents the value of a time-sensitive approach and should initiate further research, as some of the temporal and spatial variabilities were not fully explained.
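The abstract above treats HW counts as count data and uses Poisson models for trend detection. As a rough illustration only, the following sketch fits a log-linear Poisson trend, log(mu_t) = a + b * (year_t - mean year), by Newton-Raphson. The function names and the single-covariate model are assumptions made for this example; the dissertation's actual model specifications are not given in the abstract.

```python
import math


def _score_and_information(a, b, xs, counts):
    """Gradient and Fisher information of the Poisson log-likelihood."""
    mu = [math.exp(a + b * x) for x in xs]
    g_a = sum(c - m for c, m in zip(counts, mu))
    g_b = sum((c - m) * x for c, m, x in zip(counts, mu, xs))
    h_aa = sum(mu)
    h_ab = sum(m * x for m, x in zip(mu, xs))
    h_bb = sum(m * x * x for m, x in zip(mu, xs))
    return g_a, g_b, h_aa, h_ab, h_bb


def fit_poisson_trend(years, counts, max_iter=100, tol=1e-10):
    """Fit log(mu_t) = a + b * (year_t - mean_year); return (a, b, se_b).

    Hypothetical helper: b is the yearly log-rate trend and se_b its
    asymptotic standard error, so b / se_b gives a Wald z-statistic.
    """
    x_mean = sum(years) / len(years)
    xs = [y - x_mean for y in years]  # centering keeps Newton steps stable
    a = math.log(max(sum(counts) / len(counts), 1e-8))
    b = 0.0
    for _ in range(max_iter):
        g_a, g_b, h_aa, h_ab, h_bb = _score_and_information(a, b, xs, counts)
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: inverse 2x2 information matrix times the gradient
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    _, _, h_aa, h_ab, h_bb = _score_and_information(a, b, xs, counts)
    se_b = math.sqrt(h_aa / (h_aa * h_bb - h_ab * h_ab))
    return a, b, se_b
```

A positive b with a large b / se_b would point to an upward trend in yearly counts; overdispersed data would call for a different count model than this plain Poisson sketch.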
2

Forecasting With Feature-Based Time Series Clustering

Tingström, Conrad, Åkerblom Svensson, Johan January 2023
Time series prediction plays a pivotal role in various areas, including, for example, finance, weather forecasting, and traffic analysis. In this study, time series of historical sales data from a packaging manufacturer are used to investigate the effects that clustering such data has on forecasting performance. An experiment is carried out in which the time series data is first clustered using two separate approaches: k-means and Self-Organizing Map (SOM). The clustering is feature-based, meaning that characteristics extracted from the time series are used to compute similarity, rather than the raw time series. Then, a set of Long Short-Term Memory models (LSTMs) is trained: one trained on the entire dataset (global model), separate models trained on each of the clusters (cluster-based models), and finally a number of models trained on individual time series that are proportionally sampled from the clusters (single models). By evaluating the LSTMs based on Mean Absolute Error (MAE) and Mean Squared Error (MSE), we assess their consistency and predictive potential. The results reveal a trade-off between the consistency and predictive performance of the models. The global LSTM model consistently exhibits more stable performance across all predictions, showcasing its ability to capture the overall patterns in the data. However, the cluster-based LSTM models demonstrate potential for improved predictive performance within specific clusters, albeit with higher variability. This suggests that certain clusters possess distinct characteristics that allow for better predictions within those subsets of the data. Finally, the single LSTM models trained on individual time series showcase even wider spreads of scores. The analysis suggests that the availability of training data plays a crucial role in the robustness (i.e., the ability to consistently produce similar results) of the forecasting models, with the global model benefiting from a larger training set.
The higher variability in performance seen for the models trained with smaller training sets indicates that certain time series may be easier or harder to predict. It seems that the noise that comes with a larger training set can be either beneficial or detrimental to the predictive performance of the forecasting model on any individual time series, depending on the characteristics of that particular sample. Further analysis is required to investigate the factors contributing to the varying performance within each cluster. Exploring feature scores associated with poorly performing clusters and identifying the key features that contribute to better performance in certain clusters could provide valuable insights. Understanding these factors might aid in developing tailored strategies for cluster-specific prediction tasks.
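To make the idea of feature-based clustering concrete, here is a minimal sketch: each series is reduced to a few summary features, the features are standardized, and k-means is run on the feature vectors rather than on the raw series. The feature set (mean, standard deviation, linear slope, lag-1 autocorrelation), the farthest-first initialization, and all function names are assumptions chosen for illustration; the abstract does not list the features the thesis used.

```python
import math


def extract_features(series):
    """Summarize one series as [mean, std, linear slope, lag-1 autocorrelation]."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    t_mean = (n - 1) / 2
    slope = (sum((t - t_mean) * (v - mean) for t, v in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    acf1 = 0.0
    if var > 0:
        acf1 = sum((series[t] - mean) * (series[t - 1] - mean)
                   for t in range(1, n)) / (n * var)
    return [mean, math.sqrt(var), slope, acf1]


def standardize(rows):
    """Z-score each feature column so no single feature dominates distances."""
    columns = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        s = math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        columns.append([(v - m) / s for v in col])
    return [list(r) for r in zip(*columns)]


def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Plain k-means with deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids
```

Calling `kmeans(standardize([extract_features(s) for s in all_series]), k)` yields one cluster label per series; a separate forecasting model could then be trained per label, as the abstract describes for the cluster-based models.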
3

Some new anomaly detection methods with applications to financial data

Zhao, Zhicong 06 August 2021
Novel clustering methods are presented and applied to financial data. First, a scan-statistics method for detecting price point clusters in financial transaction data is considered. The method is applied to Electronic Benefits Transfer (EBT) transaction data of the Supplemental Nutrition Assistance Program (SNAP). For a given vendor, transaction amounts are fit via maximum likelihood estimation and then converted to the unit interval via a natural copula transformation. Next, a new Markov type relation for order statistics on the unit interval is developed. The relation is used to characterize the distribution of the minimum exceedance of all copula transformed transaction amounts above an observed order statistic. Conditional on observed order statistics, independent and asymptotically identical indicator functions are constructed and the success probability as a function of the gaps in consecutive order statistics is specified. The success probabilities are shown to be a function of the hazard rate of the transformed transaction distribution. If gaps are smaller than expected, then the corresponding indicator functions are more likely to be one. A scan statistic is then applied to the sequence of indicator functions to detect locations where too many gaps are smaller than expected. These sets of gaps are then flagged as being anomalous price point clusters. It is noted that prominent price point clusters appearing in the data may be a historical vestige of previous versions of the SNAP program involving outdated paper "food stamps". The second part of the project develops a novel clustering method whereby the time series of daily total EBT transaction amounts are clustered by periodicity. The scheme works by normalizing the time series of daily total transaction amounts for two distinct vendors and taking daily differences in those two series. The difference series is then examined for periodicity via a novel F statistic.
We find one may cluster the monthly periodicities of vendors by type of store using the F statistic, a proxy for a distance metric. This may indicate that spending preferences for SNAP benefit recipients vary by day of the month; however, this opens further questions about potential forcing mechanisms and the apparent changing appetites for spending.
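The scan step described above can be illustrated with a generic sliding-window scan statistic over a 0/1 indicator sequence. This sketch is not the author's procedure: it assumes a constant success probability `p` under the null and calibrates the threshold by Monte Carlo, whereas the thesis derives success probabilities from the hazard rate of the transformed distribution. All function names and parameters are hypothetical.

```python
import random


def window_counts(indicators, w):
    """Number of ones in every length-w window (requires len(indicators) >= w)."""
    running = sum(indicators[:w])
    counts = [running]
    for i in range(w, len(indicators)):
        running += indicators[i] - indicators[i - w]
        counts.append(running)
    return counts


def scan_statistic(indicators, w):
    """Largest window count: the scan statistic for a fixed window width."""
    return max(window_counts(indicators, w))


def null_threshold(n, w, p, alpha=0.05, sims=2000, seed=0):
    """Approximate upper-alpha quantile of the scan statistic under
    independent Bernoulli(p) indicators, estimated by simulation."""
    rng = random.Random(seed)
    stats = sorted(
        scan_statistic([1 if rng.random() < p else 0 for _ in range(n)], w)
        for _ in range(sims)
    )
    return stats[min(sims - 1, int((1 - alpha) * sims))]


def flag_windows(indicators, w, threshold):
    """Start indices of windows whose count exceeds the threshold."""
    return [i for i, c in enumerate(window_counts(indicators, w)) if c > threshold]
```

Windows returned by `flag_windows` mark stretches where ones are denser than the null model would make plausible, which corresponds loosely to "too many gaps smaller than expected" in the abstract.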
4

Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets

Du, Xiaoxi 08 April 2009
No description available.
5

A Framework for Generalizing Uncertainty in Mobile Network Traffic Prediction

Downey, Alexander Roman 30 May 2024
As Next Generation (NextG) networks become more complex, it has become increasingly necessary to utilize more advanced algorithms to enhance the robustness, autonomy, and reliability of existing wireless infrastructure. One such algorithm is network traffic prediction, which plays a crucial role in the efficient operation of real-time and near-real-time network management. The contributions of this thesis are twofold. The first introduces a novel cluster-train-predict framework that leverages domain knowledge to identify unique time-series sub-behaviors within aggregates of network data. This method produces distributions that are more robust towards changes in the spatio-temporal environment. The ensemble of time-series prediction models trained on these distributions possesses a greater affinity towards accurate network prediction, selectively employing learned behaviors to handle sources of time-series data without any prior knowledge of them. This approach tends to improve the ability to accurately forecast network traffic volumes. Secondly, this thesis explains the development and implementation of a modular data pipeline to support the cluster-train-predict framework under a variety of conditions. This setup promotes repeatable and comparable results, facilitating rapid iteration and experimentation on current and future research. The results of this thesis surpass traditional approaches in [1] by up to 60%. Furthermore, the effectiveness of this framework is also validated using two additional time-series datasets [2] and [3], demonstrating the ability of this approach to generalize towards other time-series data and machine learning applications in uncertain environments.
/ Master of Science / As Next Generation (NextG) networks become more complex, it has become increasingly necessary to utilize more advanced algorithms to enhance the robustness, autonomy, and reliability of in-use wireless infrastructure where network traffic prediction plays a crucial role in the efficient operation of real-time and near real-time network management. The contributions of this thesis are twofold. The first explores a novel cluster-train-predict framework that uses an unsupervised learning approach, specifically time-series K-means clustering, to group similar time-series data. In doing so, this approach identifies unique time-series behaviors within network provider data. Since this approach aims to reduce the variance within each aggregate, models can specialize towards particular network behaviors, becoming better suited for a wider variety of network trends during prediction. Because this framework assigns data to each cluster based on these groupings, the framework can adapt towards changes in network infrastructure or underlying shifts in its environment to forecast with a greater degree of certainty and explainability. This framework can even generalize towards out-of-distribution cases where it has no prior knowledge of a source of time-series data outperforming [1] by up to 60%. This approach tends to improve the ability to accurately forecast network traffic volumes. Secondly, this thesis explains the development and implementation of a modular data pipeline to support the cluster-train-predict framework under a variety of conditions with repeatable and comparable results, facilitating rapid iteration and experimentation on current and future research. 
The results of the framework are also corroborated on two additional time-series datasets [2] and [3], demonstrating the ability of this approach to generalize towards other time-series data and its applicability to other machine learning applications in uncertain environments.
6

Pattern Mining and Recognition in 5G Network Traffic Using Time Series Clustering / Mönsterextraktion och igenkänning i 5G-nätverkstrafik med tidsseriekluster

Turner, Connor January 2024
The adoption of 5G mobile networks is changing the way we connect our world. Now, it is not just phones that are connected to the network, it is everything - smart homes, self-driving cars, factory equipment, and anything in between. Because of this, there has been a large increase in the volume and complexity of mobile network traffic in recent years. As 5G becomes more widely adopted, this trend will continue moving forward. This presents a problem for mobile network operators. To account for this increase in traffic volume and complexity, the network must be optimized to handle it. However, the only way to do this is to better understand the traffic sent over the network. As such, the companies building and operating these networks rely on models that can define a set of traffic profiles from real-world network data. This thesis presents a novel method of identifying traffic profiles from 5G network data by analyzing the network traffic as unstructured time series data. Using two datasets containing TCP and UDP traffic data with 10 million time series apiece, clusters were defined for each using time series clustering techniques. Specifically, the ROCKET family of algorithms was adapted for clustering purposes, applying k-means clustering on top of the ROCKET feature transformations. The resulting clusters were analyzed and compared to another clustering model - one based on summary statistics from each time series. Overall, the ROCKET models appeared to produce more coherent traffic profiles compared to the baseline clustering model, and the proposed framework shows great promise - not just in network traffic clustering, but any analysis of unstructured time series data.
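The abstract describes applying k-means on top of ROCKET feature transformations. The sketch below shows a simplified, ROCKET-style transform only: random dilated convolution kernels are applied to a series, and each kernel contributes two features, the maximum response and the proportion of positive values (PPV). It loosely follows the published ROCKET design but omits padding, assumes series of at least 11 points, and is not the thesis's adaptation; every name here is hypothetical.

```python
import math
import random


def make_kernels(num_kernels, series_length, seed=0):
    """Draw random kernels as (weights, bias, dilation) tuples.

    Assumes series_length >= 11 so that every kernel fits at dilation 1.
    """
    rng = random.Random(seed)
    kernels = []
    for _ in range(num_kernels):
        length = rng.choice((7, 9, 11))
        weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
        mean_w = sum(weights) / length
        weights = [w - mean_w for w in weights]  # mean-center the weights
        bias = rng.uniform(-1.0, 1.0)
        # Dilation is sampled on an exponential scale, capped so the
        # dilated kernel still fits inside the series.
        max_exp = math.log2((series_length - 1) / (length - 1))
        dilation = max(1, int(2 ** rng.uniform(0.0, max_exp)))
        kernels.append((weights, bias, dilation))
    return kernels


def transform(series, kernels):
    """Return [max_1, ppv_1, max_2, ppv_2, ...] for one series."""
    n = len(series)
    features = []
    for weights, bias, dilation in kernels:
        span = (len(weights) - 1) * dilation
        outputs = [
            bias + sum(w * series[i + j * dilation] for j, w in enumerate(weights))
            for i in range(n - span)  # valid positions only, no padding
        ]
        features.append(max(outputs))
        features.append(sum(1 for o in outputs if o > 0) / len(outputs))
    return features
```

Each series becomes a fixed-length feature vector regardless of its shape, so any standard clustering algorithm such as k-means can then be run on the transformed vectors.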
7

Predicting EV Charging Sessions Based on Time Series Clustering : A Case Study from a Parking Garage in Uppsala

Palmlöf, Otto January 2024
Electric vehicles play a crucial role in the global transition towards sustainability, particularly highlighted in initiatives like the European Green Deal. With projections indicating a significant increase in electric vehicle adoption worldwide, including a notable surge in the EU and Sweden, the strain on existing electric infrastructure becomes a concern. Managed charging, the process of regulating the charging of electric vehicles in a coordinated manner, emerges as a promising strategy to mitigate this strain, optimizing charging schedules to alleviate peak loads and reduce the need for extensive grid upgrades. However, naive peak shaving approaches may fall short in addressing systemic issues, prompting the need for smarter solutions based on predictive modelling. This thesis focuses on Dansmästaren, a parking garage designed for mass electric vehicle charging, located in Uppsala, Sweden. Through load shifting techniques, one approach being explored at Dansmästaren aims to avoid grid capacity constraints by strategically scheduling EV charging to off-peak hours. This is being done using smart charging, which utilizes predictive models to predict charging durations for the scheduling of EV charging. This thesis aims to aid such predictive models by constructing a new feature for these models to train on, namely clusters. These clusters are created using time series clustering, a technique that groups time series into clusters by running a range of algorithms comparing the similarity of different time series to each other using a variety of distance measures. In this case, the study uses data collected during three months in the form of time series, split by charging sessions, to construct the clusters. The performance of these clusters is then tested using deep learning as a predictive model to evaluate whether, and to what degree, the construction of clusters helped the predictive model achieve better results.
Different approaches and algorithms are tested and evaluated for the time series clustering with the intention of getting the best possible performance, here meaning the specific construction of clusters resulting in the best performance increase for overall predictions. Different approaches were also tested and evaluated for the deep learning model, although not to the same extent, since the time series clustering is the focus of this thesis. In the end, a predictive performance increase of up to 17% was achieved by the predictive model using the constructed clusters as an additional feature. This suggests that time series clustering can help deep learning models better predict charging durations.
8

Structural time series clustering, modeling, and forecasting in the state-space framework

Tang, Fan 15 December 2015
This manuscript consists of two papers that formulate novel methodologies pertaining to time series analysis in the state-space framework. In Chapter 1, we introduce an innovative time series forecasting procedure that relies on model-based clustering and model averaging. The clustering algorithm employs a state-space model comprised of three latent structures: a long-term trend component; a seasonal component, to capture recurring global patterns; and an anomaly component, to reflect local perturbations. A two-step clustering algorithm is applied to identify series that are both globally and locally correlated, based on the corresponding smoothed latent structures. For each series in a particular cluster, a set of forecasting models is fit, using covariate series from the same cluster. To fully utilize the cluster information and to improve forecasting for a series of interest, multi-model averaging is employed. We illustrate the proposed technique in an application that involves a collection of monthly disease incidence series. In Chapter 2, to effectively characterize a count time series that arises from a zero-inflated binomial (ZIB) distribution, we propose two classes of statistical models: a class of observation-driven ZIB (ODZIB) models, and a class of parameter-driven ZIB (PDZIB) models. The ODZIB model is formulated in the partial likelihood framework. Common iterative algorithms (Newton-Raphson, Fisher Scoring, and Expectation Maximization) can be used to obtain the maximum partial likelihood estimators (MPLEs). The PDZIB model is formulated in the state-space framework. For parameter estimation, we devise a Monte Carlo Expectation Maximization (MCEM) algorithm, using particle methods to approximate the intractable conditional expectations in the E-step of the algorithm. We investigate the efficacy of the proposed methodology in a simulation study, and illustrate its utility in a practical application pertaining to disease coding.
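For readers unfamiliar with the zero-inflated binomial (ZIB) distribution mentioned above, the following sketch evaluates its standard probability mass function: with probability `pi` the outcome is a structural zero, and otherwise it is drawn from Binomial(n, p). The function name and argument names are assumptions for illustration; the ODZIB and PDZIB models in the thesis additionally let the parameters evolve over time, which this static sketch does not attempt.

```python
from math import comb


def zib_pmf(k, n, p, pi):
    """P(Y = k) for a zero-inflated binomial.

    pi : probability of a structural (excess) zero
    n  : number of binomial trials
    p  : success probability of the binomial component
    """
    binomial = comb(n, k) * p ** k * (1 - p) ** (n - k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1 - pi) * binomial
```

Setting `pi = 0` recovers the ordinary binomial, while larger `pi` shifts mass onto zero, which is the behavior the ZIB family is meant to capture in count series with many zeros.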
9

Optimized material flow using unsupervised time series clustering : An experimental study on the just in time supermarket for Volvo powertrain production Skövde.

Darwish, Amena January 2019
Machine learning has achieved remarkable performance in many domains, and it now promises to solve manufacturing problems as part of an ongoing trend of using machine learning in industrial applications. Treating material order demand in manufacturing as time-series sequences makes it possible to apply unsupervised time-series clustering. This study aims to evaluate different time-series clustering approaches, algorithms, and distance measures on material flow data. Three approaches are evaluated: statistical clustering approaches; raw-based and shape-based approaches; and, lastly, a feature-based approach. The objective is to categorize the materials in the supermarket (an intermediate storage area where materials are kept before the products are assembled) into three different flows according to their time-series properties. The experiments show that the feature-based approach performs best for the data. A feature filter is applied to keep the relevant features, those that capture the unique characteristics of the data and the predicted output. In conclusion, the data type, the data structure, the goal of the clustering task, and the application domain are factors that have to be considered when choosing a suitable clustering approach.
10

Development of Partially Supervised Kernel-based Proximity Clustering Frameworks and Their Applications

Graves, Daniel 06 1900
The focus of this study is the development and evaluation of a new partially supervised learning framework. This framework belongs to an emerging field in machine learning that augments unsupervised learning processes with some elements of supervision. It is based on proximity fuzzy clustering, where an active learning process is designed to query for the domain knowledge required in the supervision. Furthermore, the framework is extended to the parametric optimization of the kernel function in the proximity fuzzy clustering algorithm, where the goal is to achieve interesting non-spherical cluster structures through a non-linear mapping. It is demonstrated that the performance of kernel-based clustering is sensitive to the selection of these kernel parameters. Proximity hints procured from domain knowledge are exploited in the partially supervised framework. The theoretical developments with proximity fuzzy clustering are evaluated in several interesting and practical applications. One such problem is the clustering of a set of graphs based on their structural and semantic similarity. The segmentation of music is a second problem for proximity fuzzy clustering, where the aim is to determine the points in time, i.e. boundaries, of significant structural changes in the music. Finally, a time series prediction problem using a fuzzy rule-based system is established and evaluated. The antecedents of the rules are constructed by clustering the time series using proximity information in order to localize the behavior of the rule consequents in the architecture. Evaluation of these efforts on both synthetic and real-world data demonstrates that proximity fuzzy clustering is well suited for a variety of problems. / Digital Signals and Image Processing
