Global ETD Search

51	Data Mining in Small Business Sabovčik, František January 2018 (has links) This thesis aims to evaluate knowledge discovery techniques for use in a small-business setting. After exploring the data and consulting with domain experts, two tasks were selected: market basket analysis and sales forecasting. For market basket analysis, the Relim algorithm was used to mine frequent itemsets, together with metrics that measure the interestingness of association rules. For sales forecasting, a decomposition model, SARIMA, MARS, and neural networks with a time window were implemented. The models were evaluated. Acceptable results were achieved through hyperparameter optimization. Contrary to expectations, adding weather data and using nonlinear models did not improve on SARIMA. The forecasting was implemented as a server-side service for testing in a production environment.
52	Time to Strike: Intelligent Detection of Receptive Clients : Predicting a Contractual Expiration using Time Series Forecasting Alklid, Jonathan January 2020 (has links) In recent years, with the advances in Machine Learning and Artificial Intelligence, the demand for ever smarter automation solutions could seem insatiable. One such demand was identified by Fortnox AB, and is undoubtedly shared by many other industries dealing with contractual services: the need for an intelligent solution capable of predicting the expiration date of a contractual period. As there was no clear evidence suggesting that Machine Learning models were capable of learning the patterns necessary to predict a contract's expiration, it was deemed desirable to determine the feasibility of the approach while also investigating whether it would perform better than a commonplace rule-based solution, something that Fortnox had already investigated in the past. To do this, two different solutions capable of predicting a contractual expiration were implemented. The first was a rule-based solution used as a benchmark, and the second was a Machine Learning-based solution that featured a Decision Tree classifier as well as Neural Network models. The results suggest that Machine Learning models are indeed capable of learning and recognizing patterns relevant to the problem, with average accuracy generally on the high end. Unfortunately, due to a lack of available data for training and testing, the results were too inconclusive to make a reliable assessment of overall accuracy beyond the learning capability. The conclusion of the study is that Machine Learning-based solutions show promising results, but with the caveat that the results should likely be seen as indicative of overall viability rather than representative of actual performance.
Machine Learning Artificial Intelligence Time Series Time Series Forecasting Controlled Experiment sklearn Contractual Computer Sciences Datavetenskap (datalogi) Other Computer and Information Science Annan data- och informationsvetenskap Computer Engineering Datorteknik
53	Forecasting anomalies in time series data from online production environments Sseguya, Raymond January 2020 (has links) Anomaly detection on time series forecasts can be used by many industries, especially in early-warning systems that predict anomalies before they happen. Infor (Sweden) AB is a software company that provides Enterprise Resource Planning cloud solutions. Infor is interested in predicting anomalies in its data, and that is the motivation for this thesis work. The general idea is first to forecast the time series and then to detect and classify anomalies in the forecasted values. In this thesis work, the time series forecasting used to predict anomalous behaviour is done using two strategies, namely the recursive strategy and the direct strategy. The recursive strategy includes two methods: AutoRegressive Integrated Moving Average and Neural Network AutoRegression. The direct strategy is done with ForecastML-eXtreme Gradient Boosting. The three methods are then compared with respect to forecasting performance. The anomaly detection and classification is done by setting a decision rule based on a threshold. Since the true anomaly thresholds were not previously known, an arbitrary initial anomaly threshold is set using a combination of statistical methods for outlier detection followed by human judgement from the company commissioners. These statistical methods include Seasonal and Trend decomposition using Loess + InterQuartile Range, Twitter + InterQuartile Range and Twitter + GESD (Generalized Extreme Studentized Deviate). Once an anomaly threshold has been defined for the usage context of Infor (Sweden) AB, a decision rule is set and used to classify anomalies in the time series forecasts.
The results from comparing the classifications of the forecasts from the three time series forecasting methods are unsatisfactory, and no recommendation is made concerning which model or algorithm Infor (Sweden) AB should use. However, the thesis work concludes by recommending other methods that can be tried in future research. Infor (Sweden) AB time series forecasting anomaly detection ARIMA neural network autoregression eXtreme Gradient Boosting package Computer and Information Sciences Data- och informationsvetenskap Probability Theory and Statistics Sannolikhetsteori och statistik
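The threshold-based decision rule described in entry 53 can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the threshold is taken as plain interquartile-range fences over historical values, it skips the Seasonal and Trend decomposition step, and the function names and the multiplier k = 1.5 are hypothetical choices made for illustration.

```python
from statistics import quantiles


def iqr_fences(history, k=1.5):
    """Return (lower, upper) fences from the interquartile range of history.

    k = 1.5 is the conventional multiplier; it is an assumption here,
    not a value taken from the thesis.
    """
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def classify_forecast(forecast, history, k=1.5):
    """Flag each forecasted value that falls outside the fences."""
    lower, upper = iqr_fences(history, k)
    return [not (lower <= value <= upper) for value in forecast]


# Hypothetical data: nine historical observations and three forecasted values.
history = [10, 11, 12, 13, 14, 15, 16, 17, 18]
forecast = [12, 25, 5]
flags = classify_forecast(forecast, history)
```

In this sketch the forecast comes from any model (recursive or direct); only the classification step is shown, and in practice the fences would be reviewed by a domain expert as the abstract describes.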
54	Short-Term electricity consumption prediction: Elområde 4, Sweden Kothapalli, Anil Kumar January 2021 (has links) This thesis work is part of the course work for the Masters Program in Data Science at LTU. The focus of this work is mainly to review the published literature in order to identify state-of-the-art methodologies applied to predict short-term electricity consumption. This includes the exploration of features and models as well as a discussion of the results attained. The work also identifies opportunities to improve the forecast results for southern Elområde (bidding area) 4, Sweden. The application of different modern methods to forecast electricity consumption has been studied and experimented with. This work adapted CRISP-DM, a Data Science methodology. Time series prediction Time series forecasting CRISP-DM Data Science Machine Learning Sequence Model SMHI SCB ENTSO-E Computer Sciences Datavetenskap (datalogi)
55	Large-Scale Time Series Analytics Hahmann, Martin, Hartmann, Claudio, Kegel, Lars, Lehner, Wolfgang 16 June 2023 (has links) More and more data is gathered every day and time series are a major part of it. Due to the usefulness of this type of data, it is analyzed in many application domains. While there already exists a broad variety of methods for this task, there is still a lack of approaches that address new requirements brought up by large-scale time series data like cross-domain usage or compensation of missing data. In this paper, we address these issues, by presenting novel approaches for generating and forecasting large-scale time series data. info:eu-repo/classification/ddc/004 ddc:004
56	[pt] MODELO VARIABLE STEP-SIZE EVOLVING PARTICIPATORY LEARNING WITH KERNEL RECURSIVE LEAST SQUARES APLICADO À PREVISÃO DE PREÇOS DO ÓLEO DIESEL NO BRASIL / [en] VARIABLE STEP-SIZE EVOLVING PARTICIPATORY LEARNING WITH KERNEL RECURSIVE LEAST SQUARES MODEL APPLIED TO GAS PRICES FORECASTING IN BRAZIL EDUARDO RAVAGLIA CAMPOS QUEIROZ 30 April 2021 (has links) A prediction model is an indispensable tool in business, helping to make decisions, whether in the short, medium, or long term. In this context, the implementation of machine learning techniques in time series forecasting models is of notable relevance, as information processing and knowledge extraction are increasingly required to be efficient and dynamic. This work develops a model called Variable Step-Size evolving Participatory Learning with Kernel Recursive Least Squares, VS-ePL-KRLS, applied to the forecast of weekly prices for S500 and S10 diesel oil, at the Brazilian level, for biweekly and monthly horizons. The presented model demonstrates better accuracy than analogous models in the literature, without loss of computational performance, for all time series analyzed.
[en] TIME SERIES FORECASTING [en] VARIABLE STEP SIZE [en] EVOLVING FUZZY MODELS
57	[en] E-AUTOMFIS: INTERPRETABLE MODEL FOR TIME SERIES FORECASTING USING ENSEMBLE LEARNING OF FUZZY INFERENCE SYSTEM / [pt] E-AUTOMFIS: MODELO INTERPRETÁVEL PARA PREVISÃO DE SÉRIES MULTIVARIADAS USANDO COMITÊS DE SISTEMAS DE INFERÊNCIA FUZZY THIAGO MEDEIROS CARVALHO 17 June 2021 (has links) By definition, a time series represents the behavior of a variable as a function of time. For the series forecasting process, the model must be able to learn the temporal dynamics of the variables in order to obtain consistent future values. However, accurate time series prediction is a task that goes beyond choosing the most complex (or promising) model applicable to the type of problem, and therefore the analysis step is a fundamental procedure to guide the adaptation of a model. Specifically, in multivariate problems, AutoMFIS is a model based on fuzzy logic, developed not only to give accurate forecasts but also to make results explainable through semantically understandable rules. Even with such promising characteristics, this system has shown practical limitations in problems that involve datasets of high dimensionality. With the increasing demand for methods to deal with large datasets, approaches for the automatic synthesis of fuzzy systems need to be adapted to cover this new class of forecasting problems. This dissertation proposes an extension of the AutoMFIS model for time series forecasting with high-dimensional data, named e-AutoMFIS. Based on ensemble learning theory, this new methodology applies distributed learning to generate fuzzy rules. The main characteristics of the proposed model are described, highlighting the changes made to improve both the accuracy and the interpretability of the system. The proposed model is also evaluated in different case studies, in which the results are compared in terms of accuracy against the results produced by other methods in the literature. In addition, in each selected problem, the aspect of interpretability is also assessed, which is essential for explainability evaluation.
[en] BIG DATA [en] ENSEMBLE METHOD [en] MULTIVARIATE SERIES FORECASTING [en] INTERPRETABILITY [en] FUZZY INFERENCE SYSTEM
58	Forecasting checking account balance : Using supervised machine learning Dannelind, Martin January 2022 (has links) The introduction of open banking has made it possible for companies to build the next generation of applications based on transactional data, enabling economic forecasts which private individuals can use to make responsible financial decisions. This project investigated forecasting account balances using supervised learning. Seven different regression models were run on transactional data from 377 anonymised checking accounts split into subgroups. The results showed that multivariate XGBoost optimised with feature selection was the best-performing forecasting model and that the subgroup with recurring income transactions was the easiest to forecast. Based on the results of this project, it can be concluded that a viable option for forecasting account balances is to split the transactional data into subgroups and forecast them separately, which minimises the errors caused by certain random, infrequent and large types of transactions. Time series forecasting account balance forecasting economic prediction Python GRU LSTM RNN XGBoost prophet checking account
59	Clustering and forecasting for rain attenuation time series data Li, Jing January 2017 (has links) Clustering is an unsupervised learning technique that groups similar objects into the same cluster, so that objects in the same cluster are more similar to each other than to those in other clusters. Forecasting means making predictions from past data with efficient artificial intelligence models in order to anticipate how the data will develop, which can help in making appropriate decisions ahead of time. The datasets used in this thesis are signal attenuation time series data from microwave networks. Microwave networks are communication systems that transmit information between two fixed locations on the earth. They can support the increasing capacity demands of mobile networks and play an important role in next-generation wireless communication technology. However, their inherent vulnerability to random fluctuations such as rainfall causes significant degradation of network performance. In this thesis, K-means, Fuzzy c-means and a 2-state Hidden Markov Model are used to develop one-step and two-step rain attenuation data clustering models. The forecasting models are designed based on the k-nearest neighbor method and implemented with linear regression to predict real-time rain attenuation, in order to help microwave transport networks mitigate rain impact, make proper decisions ahead of time and improve general performance. Computer Sciences Datavetenskap (datalogi)
60	Federated Learning for Time Series Forecasting Using LSTM Networks: Exploiting Similarities Through Clustering / Federerad inlärning för tidserieprognos genom LSTM-nätverk: utnyttjande av likheter genom klustring Díaz González, Fernando January 2019 (has links) Federated learning poses a statistical challenge when training on highly heterogeneous sequence data. For example, time-series telecom data collected over long intervals regularly shows mixed fluctuations and patterns. These distinct distributions are an inconvenience when a node not only plans to contribute to the creation of the global model but also plans to apply it on its local dataset. In this scenario, adopting a one-size-fits-all approach might be inadequate, even when using state-of-the-art machine learning techniques for time series forecasting, such as Long Short-Term Memory (LSTM) networks, which have proven able to capture many idiosyncrasies and generalise to new patterns. In this work, we show that clustering the clients using these patterns and selectively aggregating their updates into different global models can improve local performance with minimal overhead, as we demonstrate through experiments using real-world time series datasets and a basic LSTM model.
Federated Learning Time Series Forecasting Clustering Time Series Feature Extraction Recurrent Neural Networks Long Short-Term Memory Computer and Information Sciences Data- och informationsvetenskap
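The idea in entry 60 of aggregating client updates selectively, one global model per cluster, might look roughly like the sketch below. The abstract does not state the aggregation rule, so a sample-weighted mean in the style of federated averaging is assumed; the function name, the tuple layout and the flat weight lists are hypothetical and stand in for real LSTM parameters.

```python
def aggregate_by_cluster(updates):
    """Average client weight vectors separately for each cluster.

    updates: iterable of (cluster_id, n_samples, weights) tuples, where
    weights is a flat list of floats. The sample-weighted mean is an
    assumption, not a rule confirmed by the abstract.
    """
    sums = {}    # cluster_id -> running weighted sum per parameter
    counts = {}  # cluster_id -> total number of samples seen
    for cluster_id, n_samples, weights in updates:
        if cluster_id not in sums:
            sums[cluster_id] = [0.0] * len(weights)
            counts[cluster_id] = 0
        for i, w in enumerate(weights):
            sums[cluster_id][i] += n_samples * w
        counts[cluster_id] += n_samples
    return {
        cluster_id: [s / counts[cluster_id] for s in total]
        for cluster_id, total in sums.items()
    }


# Hypothetical round with three clients assigned to two clusters.
updates = [
    ("a", 1, [1.0, 2.0]),
    ("a", 3, [3.0, 6.0]),
    ("b", 2, [10.0, 0.0]),
]
global_models = aggregate_by_cluster(updates)
```

Each client would then receive the global model of its own cluster rather than a single model shared by all clients; how clients are assigned to clusters is outside this sketch.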
