  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Predictability of Nonstationary Time Series using Wavelet and Empirical Mode Decomposition Based ARMA Models

Lanka, Karthikeyan January 2013
Time series forecasting rests on the idea that the past carries information about the future. How that information is encoded in past observations, and how it can be interpreted and used to extrapolate future events, is the crux of time series analysis and forecasting. Several families of methods have been developed, including qualitative techniques (e.g., the Delphi method), causal techniques (e.g., least squares regression), and quantitative techniques (e.g., smoothing methods and time series models). All of them establish a model, theoretically or mathematically, from past observations and estimate the future from it. Among these, time series methods such as the autoregressive moving average (ARMA) process have gained popularity because they are simple to implement and produce accurate forecasts. However, these models were formulated on the basis of certain properties that a time series is assumed to possess. Classical decomposition techniques were developed to help meet those requirements. They describe a time series in terms of simple patterns, namely trend, cyclical and seasonal components, along with noise. Decomposing a time series into component patterns, modeling each component with a forecasting process, and combining the component forecasts into a prediction of the original series yielded better performance than standard forecasting techniques. All of these methods rest on moving average computation. Classical decomposition has drawbacks, however: it produces a fixed number of components for any time series, and the decomposition is independent of the data. In addition, the edges of the time series may not be modeled properly during moving average computation, which affects long-range forecasting. These issues call for more efficient and advanced decomposition techniques such as wavelets and Empirical Mode Decomposition (EMD).
Wavelets and EMD are among the most innovative concepts in time series analysis and are aimed at processing nonlinear and nonstationary time series. This research was therefore undertaken to ascertain the predictability of nonstationary time series using wavelet-based and EMD-based ARMA models. Wavelets were developed from the concepts of Fourier analysis and the windowed Fourier transform. Accordingly, the thesis first presents the need that motivated the introduction of wavelets, followed by a discussion of the advantages they provide. Wavelets were originally defined for continuous time series. To match real-world requirements, wavelet analysis was later defined for the discrete case, known as the Discrete Wavelet Transform (DWT). The thesis uses the DWT to perform time series decomposition. It presents a detailed discussion of the theory behind time series decomposition, followed by a mathematical description of decomposition using the DWT, including the decomposition algorithm. EMD belongs to the same class as wavelets as far as time series decomposition is concerned. EMD arises from the fact that most natural time series contain multiple frequencies, so that different scales exist simultaneously. Compared with standard Fourier analysis and wavelet algorithms, the method adapts more readily to a variety of nonstationary time series. It decomposes any complicated time series into a small, finite number of empirical modes called Intrinsic Mode Functions (IMFs), each of which contains information about the original series. The thesis presents the EMD decomposition algorithm after explaining the underlying concepts.
The thesis then presents the proposed forecasting algorithm, which couples EMD with an ARMA model and takes into account the number of time steps ahead for which forecasts are required. To test the wavelet-based and EMD-based algorithms on nonstationary time series, the study uses streamflow data from the USA and rainfall data from India: monthly total volumes at four nonstationary streamflow sites (USGS data resources) and monthly total rainfall at two nonstationary gridded rainfall sites (IMD). Predictability is checked in two scenarios, a six-months-ahead forecast and a twelve-months-ahead forecast. Normalized Root Mean Square Error (NRMSE) and the Nash-Sutcliffe Efficiency Index (Ef) are used to evaluate the proposed techniques. According to these measures, the wavelet-based analyses reproduce the variations well in the six-months-ahead case and agree with the observed values at most sites. Both methods capture the minima of the time series effectively in the six- and twelve-months-ahead predictions, but the wavelet-based method gives better forecasts than the EMD-based method in the twelve-months-ahead case. It is therefore inferred that the wavelet-based method has better prediction capabilities than the EMD-based method, despite some limitations of time series methods and of the manner in which decomposition takes place. Finally, the study concludes that the wavelet-based time series algorithm could be used to model events such as droughts with reasonable accuracy. It also suggests some modifications to the model that could extend its applicability to other areas of hydrology.
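As a rough illustration of the two performance measures named in this abstract, the sketch below computes NRMSE and the Nash-Sutcliffe efficiency with NumPy. The abstract does not say how the RMSE is normalized; dividing by the range of the observations is an assumption made here, and the function names `nrmse` and `nash_sutcliffe` are hypothetical.

```python
import numpy as np


def nrmse(observed, predicted):
    """Root mean square error divided by the range of the observations.

    Range normalization is an assumption; other conventions divide by
    the mean or the standard deviation of the observations.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return rmse / (observed.max() - observed.min())


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squares about the observed mean)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum((observed - predicted) ** 2)
    total = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / total
```

A perfect forecast gives an NRMSE of 0 and an efficiency of 1, while a forecast that always predicts the observed mean gives an efficiency of 0.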
72

A comparative study between algorithms for time series forecasting on customer prediction : An investigation into the performance of ARIMA, RNN, LSTM, TCN and HMM

Almqvist, Olof January 2019
Time series prediction is one of the main areas of statistics and machine learning. In 2018, two new algorithms, the higher-order hidden Markov model and the temporal convolutional network, were proposed and emerged as challengers to the more traditional recurrent neural network and long short-term memory network, as well as to the autoregressive integrated moving average (ARIMA) model. In this study, most major algorithms, together with recent innovations for time series forecasting, are trained and evaluated on two datasets from the theme park industry with the aim of predicting the future number of visitors. The models were developed with the Python libraries Keras and Statsmodels. The results show that the neural network models are slightly better than ARIMA and the hidden Markov model, and that the temporal convolutional network does not perform significantly better than the recurrent or long short-term memory networks, although it has the lowest prediction error on one of the datasets. Interestingly, the Markov model performed worse than all neural network models even when no independent variables were used.
73

Metodologia evolutiva para previsão inteligente de séries temporais sazonais baseada em espaço de estados não-observáveis / Evolutionary methodology for intelligent forecasting of seasonal time series based on an unobservable state space

Rodrigues Júnior, Selmo Eduardo 26 January 2017
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / This work proposes a new modelling methodology based on an evolving Takagi-Sugeno Neuro-Fuzzy Network (NFN-TS) for seasonal time series forecasting. The NFN-TS uses the unobservable components extracted from the time series to evolve, i.e., to adapt and adjust its structure, so that the number of fuzzy rules in the network can increase or decrease according to the behavior of the components. The components are extracted by a recursive version, developed in this work, of the Singular Spectrum Analysis (SSA) technique. The methodology follows the divide-and-conquer principle: it splits the problem into easier subproblems and forecasts each component separately, because the components have dynamic behaviors that are simpler to forecast. The consequent propositions of the fuzzy rules are linear state-space models whose states are the unobservable component data. When observations of the time series are available, the NFN-TS performs its training stage: it evolves its structure and adapts its parameters to map the component data to the available sample of the original time series. When no observation is available, the network enters its forecasting stage, keeping its structure fixed and using the states of the consequents of the fuzzy rules to feed the component data back to the NFN-TS. The NFN-TS was evaluated and compared with other recent and traditional techniques for forecasting seasonal time series, and obtained competitive and advantageous results relative to other work. This work also presents a case study applying the proposed methodology to real-time anomaly detection based on a patient's electrocardiogram data.
74

Real-time forecasting of dietary habits and user health using Federated Learning with privacy guarantees

Horchidan, Sonia-Florina January 2020
Modern health self-monitoring devices and applications, such as Fitbit and MyFitnessPal, empower users to take concrete actions and set fitness and lifestyle goals based on their recorded trends and statistics. Predicting such trends helps users reach long-term targets, as individuals can adjust their diets and habits at any point to guarantee success. The design and implementation of such a system, which also respects user privacy, is the main objective of our work. The application is modelled as a time series forecasting problem: given the historical data of users, we aim to predict their eating and lifestyle habits in real time. We apply the federated learning paradigm to our use case because of the highly distributed nature of our data and the privacy concerns raised by such sensitive recorded information. However, federated learning from heterogeneous sequences of data can be challenging, as even state-of-the-art machine learning techniques for time series forecasting can encounter difficulties when learning from very irregular data sequences. Specifically, in the proposed healthcare scenario, the machine learning algorithms might fail to cater to users with unique dietary patterns. In this work, we implement a two-step streaming clustering mechanism and group clients that exhibit similar eating and fitness behaviours. The experiments show that federated learning in this context can achieve very high prediction accuracy: our predictions deviate from the ground truth by no more than 0.025% of the range of each feature. Training separate models for each group of users is shown to be beneficial, especially in terms of training time, but it depends strongly on the parameters used for the models and the training process. Our experiments show that the configuration used for the general federated model cannot be applied to the clusters of data. However, a decrease in prediction error of more than 45% can be achieved, provided the parameters are optimized for each case. Lastly, this work tackles the problem of data privacy by applying state-of-the-art differential privacy techniques. Our empirical study shows that adding noise to the gradients sent to the server is unsuitable for small datasets and cancels out the benefits obtained by clustering users beforehand. On the other hand, adding noise to the training data achieves remarkable results, obtaining a differential privacy level corresponding to an epsilon value of 0.1 with an increase in the observed mean absolute error by a factor of only 0.21.
75

Statistical And Spatial Approaches To Marina Master Plan For Turkey

Karanci, Ayse 01 February 2011
Turkey, with its climate, protected bays, and cultural and environmental resources, is an ideal place for yacht tourism. Consequently, yacht tourism is increasing steadily. In pursuit of tourist satisfaction, yacht tourism can lead to uncontrolled development and environmental problems. As the demand for yacht tourism intensifies, sustainable development strategies are needed to maximize natural, cultural and economic benefits. Integrating forecasts into strategic planning is necessary for the sustainable use of coastal resources. In this study, two quantitative forecasting techniques, exponential smoothing and Auto-Regressive Integrated Moving Average (ARIMA), were used to estimate the demand for yacht berthing capacity in Turkey until 2030. Based on environmental, socio-economic and geographic data and on the opinions of stakeholders such as marina operators, local communities and government officials, an allocation model was developed to distribute the predicted demand in a way that supports social and economic growth while preserving the coastal environment. The Analytic Hierarchy Process (AHP) was used to identify and evaluate the development, social, environmental and geographic priorities. With the aim of a dynamic plan that responds to both national and international developments in yacht tourism, potential investment areas were determined for the investments required to accommodate future demand. This study provides a multidimensional view of the planning problem and highlights the need for sustainable and dynamic planning in delicate, high-demand areas such as coasts.
76

Υπολογιστική νοημοσύνη στην οικονομία και τη θεωρία παιγνίων / Computational intelligence in economics and game theory

Παυλίδης, Νίκος 09 October 2008
The thesis investigates Computational Intelligence methods in Economics and Finance. The first part is devoted to computational intelligence and unsupervised clustering methods for modeling and forecasting daily exchange rate time series. The proposed methodology relies on local approximation, using artificial neural networks, for subregions of the input space that are identified by unsupervised clustering algorithms. Furthermore, genetic programming is employed to construct novel technical trading rules directly from the data. The performance of the novel rules is compared with that of generalised moving average rules. The second part of the thesis employs evolutionary algorithms to compute, and to estimate the number of, equilibria in finite strategic games and new economic geography models. In particular, it investigates the capability of evolutionary and swarm intelligence algorithms to compute Nash equilibria and proposes an approach for computing more than one equilibrium. Finally, criteria from the theory of fixed-point computation and from topological degree theory are employed to investigate the existence of short-run equilibria in new economic geography models and the computational complexity of computing them.
77

Réseaux de neurones, SVM et approches locales pour la prévision de séries temporelles / Neural networks, SVM and local approaches for time series forecasting

Cherif, Aymen 16 July 2013
Time series forecasting has been widely studied for many years, with applications in areas such as finance, medicine and transportation. This thesis focuses on machine learning methods, namely neural networks and SVMs. It also examines meta-methods for improving predictor performance, in particular the local approach. Following a divide-and-conquer strategy, local approaches cluster the data before assigning a predictor to each resulting subset. We present a modification of the learning algorithm for recurrent neural networks that adapts them to this approach so they can be used as local predictors. We also propose two novel clustering techniques suitable for local models: the first is based on Kohonen maps, and the second on binary trees.
78

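Previsão de demanda no médio prazo utilizando redes neurais artificiais em sistemas de distribuição de energia elétrica / Medium-term demand forecasting using artificial neural networks in electric power distribution systems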

Medeiros, Romero Álamo Oliveira de 29 July 2016
Demand forecasting studies are of great importance to electricity companies, which need to allocate their resources well in advance and therefore require medium- and long-term planning. These resources include the purchase of new equipment, the acquisition or construction of transmission lines, scheduled maintenance, and the purchase and sale of energy. In this work, a support tool was developed for experts in strategic planning of power distribution systems, using artificial neural networks for demand forecasting. The proposed method implements a medium-term demand forecasting procedure for the region fed by three real substations belonging to the power distribution system managed by the utility Energisa-PB, using a computational model based on Multilayer Perceptron (MLP) artificial neural networks (ANN) built in the Matlab® environment. The database consisted of active power measurements from 2008 to 2014, provided by Energisa-PB, and the forecast horizon was one year ahead (52 weeks). The ANN was trained on the data from 2008 to 2013, and its output was compared with the real data from 2014. In addition, it was possible to evaluate the performance of the ANN from different perspectives (training volume, parameters, configurations, hidden layers, etc.) and to estimate the demand growth in the region supplied by each substation, which can assist distribution system expansion planning.
79

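Otimização da função de fitness para a evolução de redes neurais com o uso de análise envoltória de dados aplicada à previsão de séries temporais / Fitness function optimization for the evolution of neural networks using data envelopment analysis applied to time series forecasting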

SILVA, David Augusto 01 July 2011
Techniques for time series analysis and forecasting have held a prominent place in the literature over the years. Computational resources combined with statistical techniques are improving predictive results, which have become increasingly accurate. Computational methods based on Artificial Neural Networks (ANN) and Evolutionary Computing (EC) offer a new approach to the time series analysis and forecasting problem. These methods belong to the field of Artificial Intelligence (AI) and are biologically inspired: ANN models are based on the neural structure of intelligent organisms, and EC uses Charles Darwin's concept of natural selection. Both acquire experience from prior knowledge and examples of the given problem. For the time series forecasting problem in particular, the objective is to find the predictive model with the highest forecast performance, where performance is measured by statistical errors. However, there is no universal criterion for identifying the best performance measure. Since the ANNs are the predictive models, the EC constantly evaluates their forecast performance, using a fitness function to guide the predictive model toward an optimal solution. To determine which criteria are most efficient when choosing the best predictive model, Data Envelopment Analysis (DEA) was employed to find the best combination of variables based on the relative efficiency of the best models. This work therefore studies the optimization of fitness functions with Data Envelopment Analysis, applied to a hybrid intelligent system for time series forecasting. The data analyzed consist of financial, agribusiness and natural-phenomena series obtained from agencies specific to each area. The hybrid intelligent system was implemented in the C programming language, and the DEA models were analyzed in the R environment, version 2.12. In general, using DEA to evaluate the fitness functions proved satisfactory and serves as an additional resource in time series forecasting. It is up to the researcher to assess the results from different perspectives, whether in terms of the computational cost of implementing a particular function that was more efficient, or in terms of identifying which combinations are unwanted, thereby saving time and resources.
80

Machine learning strategies for multi-step-ahead time series forecasting

Ben Taieb, Souhaib 08 October 2014 (has links)
How much electricity will be consumed over the next 24 hours? What will the temperature be over the next three days? How many units of a certain product will be sold over the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series.

Historically, time series forecasting has been studied mainly in econometrics and statistics. In the last two decades, machine learning, a field concerned with developing algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning.

The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms.

Multi-step-ahead forecasts can be produced recursively, by iterating a one-step-ahead model, or directly, using a specific model for each horizon. As a first contribution, we conduct an in-depth study comparing recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into their bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths.
The results and observations made in this study then guide the development of new forecasting strategies.

In particular, we find that choosing between recursive and direct forecasts is not an easy task, since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, which we call the rectify strategy, that estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need only small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms built on so-called weak learners.

Generating multi-step forecasts using a different model at each horizon provides great modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can increase the forecast variance. The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon.
In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest.

We compare all the proposed strategies with both the recursive and direct strategies. We first conduct a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results show that it performs better than, or at least close to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts.

Lastly, as a fourth contribution, we participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. Our team, TinTin, ranked fifth out of 105 participating teams, and we were awarded an IEEE Power & Energy Society award. / Doctorate in Sciences, specialization in Computer Science / info:eu-repo/semantics/nonPublished
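The recursive, direct and rectify strategies described in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the dissertation's implementation: the linear model is a single-lag least-squares fit, the rectification step averages the in-sample residuals of the `k` nearest past observations, and all function names, the value of `k` and the example series are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def iterate(model, value, steps):
    """Apply a one-step-ahead linear model `steps` times in a row."""
    a, b = model
    for _ in range(steps):
        value = a + b * value
    return value


def recursive_forecast(series, horizon):
    """One model for y[t+1] given y[t], iterated over the horizon."""
    model = fit_line(series[:-1], series[1:])
    return [iterate(model, series[-1], h) for h in range(1, horizon + 1)]


def direct_forecast(series, horizon):
    """A separate model per step h, mapping y[t] straight to y[t+h]."""
    forecasts = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])
        forecasts.append(a + b * series[-1])
    return forecasts


def rectify_forecast(series, horizon, k=2):
    """Recursive linear forecast plus a nearest-neighbor residual correction."""
    model = fit_line(series[:-1], series[1:])
    forecasts = []
    for h in range(1, horizon + 1):
        origins = series[:-h]
        # In-sample residuals of the h-step recursive forecast.
        residuals = [series[t + h] - iterate(model, y, h)
                     for t, y in enumerate(origins)]
        # Average the residuals of the k origins closest to the latest value.
        nearest = sorted(range(len(origins)),
                         key=lambda t: abs(origins[t] - series[-1]))[:k]
        correction = sum(residuals[t] for t in nearest) / len(nearest)
        forecasts.append(iterate(model, series[-1], h) + correction)
    return forecasts


# Hypothetical noise-free series with y[t+1] = 2 * y[t].
series = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
recursive = recursive_forecast(series, 3)
direct = direct_forecast(series, 3)
rectified = rectify_forecast(series, 3)
print(recursive, direct, rectified)
```

On this noise-free linear series all three strategies should agree up to floating-point error, because the one-step model is exact and the residuals vanish. The trade-offs the abstract discusses only appear with noisy data or a misspecified model, where iterating a one-step model accumulates error and per-horizon models are estimated from fewer points.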
