  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Climatology of warm season heat waves in Saudi Arabia: a time-sensitive approach

Alghamdi, Ali Saeed Arifi January 1900
Doctor of Philosophy / Department of Geography / John A. Harrington Jr / The climate of the Middle East is warming and extreme hot temperature events are becoming more common, as observed by the significant upward trends in mean and extreme temperatures during the last few decades. Climate modeling studies suggest that the frequency, intensity, and duration of extreme temperature events are expected to increase as the global and local climate continues to warm. Existing literature about heat waves (HWs) in Saudi Arabia provides information about HW duration using a single index, without considering the observed effects of climate change and the subtropical arid climate. With that in mind, this dissertation provides a series of three stand-alone papers evaluating temporal, geographic, and atmospheric aspects of the character of warm season (May-September) HWs in Saudi Arabia for 1985 to 2014. Chapter 2 examines the temporal behavior(s) of the frequency, duration, and intensity of HWs under the observed recent climate change. Several issues are addressed, including the identification of some improved methodological practices for HW indices. A time-sensitive approach to define and detect HWs is proposed and assessed. HW events and their duration are considered as count data; thus, different Poisson models were used for trend detection. Chapter 3 addresses the spatio-temporal patterns of the frequency and intensity of hot days and nights, and HWs. The chapter reemphasizes the importance of considering the ongoing effects of climate warming and applies a novel time-series clustering approach to recognize hot temperature event behavior through time and space. Chapter 4 explores the atmospheric circulation conditions that are associated with warm season HW event occurrence and how different HW aspects are related to different circulation types. Further, possible teleconnections between HWs and sea surface temperature (SST) anomalies of nearby large bodies of water are examined.
Results from Chapters 2 and 3 detected systematic upward trends in maximum and minimum temperatures at most of the 25 stations, suggesting an ongoing change in the climatology of the upper tail of the frequency distribution. The analysis demonstrated the value of using a time-sensitive approach in studying extreme thermal events. Different patterns were observed over time and space, not only across stations but also among extreme temperature events (i.e., hot days and nights, and HWs). The overall results suggest that not only local and regional factors, such as elevation, latitude, land cover, atmospheric humidity, and distance from a large body of water, but also large-scale factors such as atmospheric circulation patterns are responsible for the observed temporal and spatial patterns. Chapter 4 confirmed that the Indian Summer Monsoon Trough and the Arabian heat low were key atmospheric features related to HW days. SST anomalies seemed to be a more important factor for HW intensity. Extreme thermal events in Saudi Arabia tended to occur during regional warming due to atmospheric circulation conditions and SST teleconnections. This study documents the value of a time-sensitive approach and should initiate further research, as some of the temporal and spatial variability was not fully explained.
2

Forecasting With Feature-Based Time Series Clustering

Tingström, Conrad, Åkerblom Svensson, Johan January 2023
Time series prediction plays a pivotal role in various areas, including, for example, finance, weather forecasting, and traffic analysis. In this study, time series of historical sales data from a packaging manufacturer are used to investigate the effects that clustering such data has on forecasting performance. An experiment is carried out in which the time series data is first clustered using two separate approaches: k-means and Self-Organizing Map (SOM). The clustering is feature-based, meaning that characteristics extracted from the time series are used to compute similarity, rather than the raw time series. Then, a set of Long Short-Term Memory models (LSTMs) is trained: one trained on the entire dataset (global model), separate models trained on each of the clusters (cluster-based models), and finally a number of models trained on individual time series that are proportionally sampled from the clusters (single models). By evaluating the LSTMs based on Mean Absolute Error (MAE) and Mean Squared Error (MSE), we assess their consistency and predictive potential. The results reveal a trade-off between the consistency and predictive performance of the models. The global LSTM model consistently exhibits more stable performance across all predictions, showcasing its ability to capture the overall patterns in the data. However, the cluster-based LSTM models demonstrate potential for improved predictive performance within specific clusters, albeit with higher variability. This suggests that certain clusters possess distinct characteristics that allow for better predictions within those subsets of the data. Finally, the single LSTM models trained on individual time series showcase even wider spreads of scores. The analysis suggests that the availability of training data plays a crucial role in the robustness (i.e., the ability to consistently produce similar results) of the forecasting models, with the global model benefiting from a larger training set.
The higher variability in performance seen for the models trained with smaller training sets indicates that certain time series may be easier or harder to predict. It seems that the noise that comes with a larger training set can be either beneficial or detrimental to the predictive performance of the forecasting model on any individual time series, depending on the characteristics of that particular sample. Further analysis is required to investigate the factors contributing to the varying performance within each cluster. Exploring feature scores associated with poorly performing clusters and identifying the key features that contribute to better performance in certain clusters could provide valuable insights. Understanding these factors might aid in developing tailored strategies for cluster-specific prediction tasks.
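The feature-based clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the three features (mean, standard deviation, linear slope), the function names `extract_features` and `kmeans`, and the toy series are all assumptions made for the example, and the SOM and LSTM stages are not shown.

```python
import math
import random


def extract_features(series):
    """Summarise one series as (mean, standard deviation, linear slope).

    The choice of features is hypothetical; the abstract does not list them.
    """
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (x - mean) for t, x in enumerate(series)) / denom
    return (mean, std, slope)


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on feature vectors; returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Toy data: two roughly flat series and two steadily rising ones.
series = [
    [1, 1, 2, 1, 1, 2],
    [2, 1, 1, 2, 1, 1],
    [1, 3, 5, 7, 9, 11],
    [2, 4, 6, 8, 10, 12],
]
features = [extract_features(s) for s in series]
labels = kmeans(features, k=2)
```

The similarity is computed on the feature tuples rather than on the raw values, which is what makes the clustering feature-based. In practice the features would usually be standardised before clustering so that no single feature dominates the distance; that step is omitted here for brevity.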
3

Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets

Du, Xiaoxi 08 April 2009
No description available.
4

Some new anomaly detection methods with applications to financial data

Zhao, Zhicong 06 August 2021
Novel clustering methods are presented and applied to financial data. First, a scan-statistics method for detecting price point clusters in financial transaction data is considered. The method is applied to Electronic Benefits Transfer (EBT) transaction data of the Supplemental Nutrition Assistance Program (SNAP). For a given vendor, transaction amounts are fit via maximum likelihood estimation and then converted to the unit interval via a natural copula transformation. Next, a new Markov-type relation for order statistics on the unit interval is developed. The relation is used to characterize the distribution of the minimum exceedance of all copula-transformed transaction amounts above an observed order statistic. Conditional on observed order statistics, independent and asymptotically identical indicator functions are constructed, and the success probability as a function of the gaps in consecutive order statistics is specified. The success probabilities are shown to be a function of the hazard rate of the transformed transaction distribution. If gaps are smaller than expected, then the corresponding indicator functions are more likely to be one. A scan statistic is then applied to the sequence of indicator functions to detect locations where too many gaps are smaller than expected. These sets of gaps are then flagged as being anomalous price point clusters. It is noted that prominent price point clusters appearing in the data may be a historical vestige of previous versions of the SNAP program involving outdated paper "food stamps". The second part of the project develops a novel clustering method whereby the time series of daily total EBT transaction amounts are clustered by periodicity. The scheme works by normalizing the time series of daily total transaction amounts for two distinct vendors and taking daily differences in those two series. The difference series is then examined for periodicity via a novel F statistic.
We find one may cluster the monthly periodicities of vendors by type of store using the F statistic, a proxy for a distance metric. This may indicate that spending preferences for SNAP benefit recipients vary by day of the month; however, this opens further questions about potential forcing mechanisms and the apparent changing appetites for spending.
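The scanning step mentioned in this abstract, sliding a window over a sequence of 0/1 indicators and looking for the window with the most ones, can be illustrated with the generic sketch below. It assumes a fixed window length and a ready-made indicator sequence; the function name `scan_statistic` and the sample data are hypothetical, and the thesis's construction of the indicators and its significance assessment are not reproduced.

```python
def scan_statistic(indicators, window):
    """Return (max_count, start_index) for the window containing the most ones.

    Ties are resolved in favour of the earliest window.
    """
    if not 0 < window <= len(indicators):
        raise ValueError("window must be between 1 and len(indicators)")
    count = sum(indicators[:window])
    best_count, best_start = count, 0
    for start in range(1, len(indicators) - window + 1):
        # Slide by one: add the entering element, drop the leaving one.
        count += indicators[start + window - 1] - indicators[start - 1]
        if count > best_count:
            best_count, best_start = count, start
    return best_count, best_start


# Hypothetical indicator sequence: 1 means "gap smaller than expected".
flags = [0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0]
print(scan_statistic(flags, window=4))  # (3, 3)
```

A window whose count is unusually high relative to what independent indicators would produce is the kind of location the abstract describes as flagged; deciding what counts as "unusually high" requires the null distribution of the statistic, which this sketch leaves out.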
5

Structural time series clustering, modeling, and forecasting in the state-space framework

Tang, Fan 15 December 2015
This manuscript consists of two papers that formulate novel methodologies pertaining to time series analysis in the state-space framework. In Chapter 1, we introduce an innovative time series forecasting procedure that relies on model-based clustering and model averaging. The clustering algorithm employs a state-space model comprised of three latent structures: a long-term trend component; a seasonal component, to capture recurring global patterns; and an anomaly component, to reflect local perturbations. A two-step clustering algorithm is applied to identify series that are both globally and locally correlated, based on the corresponding smoothed latent structures. For each series in a particular cluster, a set of forecasting models is fit, using covariate series from the same cluster. To fully utilize the cluster information and to improve forecasting for a series of interest, multi-model averaging is employed. We illustrate the proposed technique in an application that involves a collection of monthly disease incidence series. In Chapter 2, to effectively characterize a count time series that arises from a zero-inflated binomial (ZIB) distribution, we propose two classes of statistical models: a class of observation-driven ZIB (ODZIB) models, and a class of parameter-driven ZIB (PDZIB) models. The ODZIB model is formulated in the partial likelihood framework. Common iterative algorithms (Newton-Raphson, Fisher Scoring, and Expectation Maximization) can be used to obtain the maximum partial likelihood estimators (MPLEs). The PDZIB model is formulated in the state-space framework. For parameter estimation, we devise a Monte Carlo Expectation Maximization (MCEM) algorithm, using particle methods to approximate the intractable conditional expectations in the E-step of the algorithm. We investigate the efficacy of the proposed methodology in a simulation study, and illustrate its utility in a practical application pertaining to disease coding.
6

Optimized material flow using unsupervised time series clustering : An experimental study on the just in time supermarket for Volvo powertrain production Skövde.

Darwish, Amena January 2019
Machine learning has achieved remarkable performance in many domains, and it now promises to solve manufacturing problems as part of an ongoing trend of using machine learning in industrial applications. Treating the material order demand in manufacturing as time-series sequences makes it possible to apply unsupervised time-series clustering. This study aims to evaluate different time-series clustering approaches, algorithms, and distance measures on material flow data. Three different approaches are evaluated: statistical clustering approaches, raw-based and shape-based approaches, and finally a feature-based approach. The objective is to categorize the materials in the supermarket (an intermediate storage area where materials are kept before the products are assembled) into three different flows according to their time-series properties. The experiment shows that the feature-based approach performs best for the data. A feature filter is applied to keep the relevant features, those that capture the characteristics of the data that matter for the predicted output. In conclusion, the data type and structure, the goal of the clustering task, and the application domain are factors that have to be considered when choosing a suitable clustering approach.
7

Development of Partially Supervised Kernel-based Proximity Clustering Frameworks and Their Applications

Graves, Daniel 06 1900
The focus of this study is the development and evaluation of a new partially supervised learning framework. This framework belongs to an emerging field in machine learning that augments unsupervised learning processes with some elements of supervision. It is based on proximity fuzzy clustering, where an active learning process is designed to query for the domain knowledge required in the supervision. Furthermore, the framework is extended to the parametric optimization of the kernel function in the proximity fuzzy clustering algorithm, where the goal is to achieve interesting non-spherical cluster structures through a non-linear mapping. It is demonstrated that the performance of kernel-based clustering is sensitive to the selection of these kernel parameters. Proximity hints procured from domain knowledge are exploited in the partially supervised framework. The theoretical developments with proximity fuzzy clustering are evaluated in several interesting and practical applications. One such problem is the clustering of a set of graphs based on their structural and semantic similarity. The segmentation of music is a second problem for proximity fuzzy clustering, where the aim is to determine the points in time, i.e. boundaries, of significant structural changes in the music. Finally, a time series prediction problem using a fuzzy rule-based system is established and evaluated. The antecedents of the rules are constructed by clustering the time series using proximity information in order to localize the behavior of the rule consequents in the architecture. Evaluation of these efforts on both synthetic and real-world data demonstrates that proximity fuzzy clustering is well suited for a variety of problems. / Digital Signals and Image Processing
8

Development of Partially Supervised Kernel-based Proximity Clustering Frameworks and Their Applications

Graves, Daniel Unknown Date
No description available.
9

Dynamická faktorová analýza časových řad / Time series dynamic factor analysis

Slávik, Ľuboš January 2021
This diploma thesis deals with a new approach to clustering time series based on a dynamic factor model. The dynamic factor model is a dimension-reduction technique that extends classical factor analysis with the requirement of an autocorrelation structure for the latent factors. The model parameters are estimated using the EM algorithm together with the Kalman filter and smoother, and the conditions necessary to make the model identifiable are imposed. After the theoretical concept of the approach is introduced, the dynamic factor model is applied to real observed time series, and the thesis examines its behaviour and properties on one month of meteorological Fire Weather Index data from 108 fire stations located in British Columbia. The model fitting procedure estimates the loadings matrix together with the corresponding small number of latent factors and the covariance matrix of the modelled time series. The thesis applies k-means clustering to the resulting loadings matrix and offers a division of the meteorological stations into clusters based on the reduced dimensionality of the original data. Using the estimated cluster means and the estimated latent factors, it is also possible to obtain the mean trend of each cluster. The results are then compared with results obtained on data from the same stations but for a different month, in order to assess the stability of the clustering. The thesis also examines the effect of a varimax rotation of the loadings matrix. In addition, the thesis proposes a method for detecting outlying time series based on the estimated covariance matrix of the model and discusses the consequences of outliers for the estimated model.
10

Time-series long-term forcasting for A/B tests

Jaunzems, Davis January 2016
The technological development of computing devices and communication tools has made it possible to store and process more information than ever before. For researchers it is a means of making more accurate scientific discoveries; for companies it is a way of better understanding their clients and products and of gaining an edge over competitors. In industry, A/B testing is becoming an important and common way of obtaining insights that help to make data-driven decisions. An A/B test is a comparison of two or more versions to determine which performs better according to predetermined measurements. In combination with data mining and statistical analysis, these tests make it possible to answer important questions and help the transition from "we think" to "we know". Nevertheless, running bad test cases can have a negative impact on businesses and can result in a bad user experience. That is why it is important to be able to forecast the long-term effects of an A/B test from short-term data. In this report, A/B tests and their forecasting are examined using univariate time-series analysis. However, the short duration and high diversity of the tests make it a great challenge to provide accurate long-term forecasts. This is a quantitative and empirical study that uses a real-world data set from a social game development company, King Digital Entertainment PLC (King.com). First, the data are analysed and pre-processed through a series of steps. Time-series forecasting has been around for generations.
That is why an analysis and accuracy comparison of existing forecasting models, such as mean forecast, ARIMA, and Artificial Neural Networks, is carried out. The results on the real data set are similar to those that other researchers have found for long-term forecasts with short-term data. To improve the forecasting accuracy, a time-series clustering method is proposed. The method exploits the similarity between time series through Dynamic Time Warping and trains separate cluster forecasting models. The clusters are chosen with high accuracy using a Random Forest classifier, and certainty about the long-term range of the time series is obtained by using historical tests and a Markov chain. The proposed method shows superior results against existing models and can be used to obtain long-term forecasts for A/B tests.
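Dynamic Time Warping, the similarity measure named in this abstract, can be sketched with the textbook dynamic-programming recurrence. This is a generic illustration under stated assumptions (absolute difference as the local cost, no warping-window constraint); the function name `dtw_distance` is chosen for the example and is not taken from the thesis.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its element
                cost[i][j - 1],      # b advances, a repeats its element
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same shape at a different speed aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Because the alignment may stretch or compress either series, two series with the same shape but different timing come out as close, which is why DTW is a common choice for grouping series before fitting per-cluster forecasting models.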
