Global ETD Search

11	Previsão de series de vazões com redes neurais artificiais e modelos lineares ajustados por algoritmos bio-inspirados / Forecast of seasonal streamflow series with artificial neural networks and linear models adjusted for bio-inspired algorithms Siqueira, Hugo Valadares, 1983- 14 August 2018 (has links) Orientadores: Christiano Lyra Filho, Romis Ribeiro de Faissol Attux / Dissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação / Made available in DSpace on 2018-08-14T08:57:13Z (GMT). No. of bitstreams: 1 Siqueira_HugoValadares_M.pdf: 4462928 bytes, checksum: 6c158aa0553a6c0912bf75c565974370 (MD5) Previous issue date: 2009 / Resumo: O Sistema Elétrico é um dos pilares do desenvolvimento tecnológico e industrial de uma nação. Dessa forma, é necessário gerir de uma maneira eficiente todos os recursos necessários para obtenção de energia elétrica. Os recursos hídricos se tornam essenciais já que o parque gerador brasileiro é predominantemente hidráulico. Neste contexto, o estudo da previsão de séries de vazões das usinas hidrelétricas tornou-se um campo de pesquisa altamente relevante para o planejamento da geração de energia no Brasil. Os modelos empregados pelo setor elétrico são os chamados modelos de Box & Jenkins, que exige um pré-tratamento dos dados de entrada por conta da sazonalidade encontrada nas vazões ao longo do ano. Este trabalho se utiliza de uma gama de modelos de previsão para comparação de desempenho no problema de previsão de séries de vazões médias mensais, em períodos distintos, da usina hidrelétrica de Furnas. Dentre os modelos lineares, é proposta a utilização de um dos modelos estatísticos, o Auto-regressivo e Médias Móveis (ARMA), tendo seus coeficientes calculados através de algoritmos bioinspirados: algoritmo genético e duas propostas de algoritmos imunológicos, uma baseada em pequenas alterações do CLONALG e a opt-aiNet. 
Em seguida, um filtro linear realimentado de resposta ao impulso infinita (IIR) tem seus coeficientes calculados pelos algoritmos de otimização acima citados. Na parte dos métodos não lineares, fez-se a abordagem da aplicação de redes neurais artificiais do tipo perceptron de múltiplas camadas (MLP), com a utilização do algoritmo do gradiente conjugado escalonado modificado para o treinamento. Por fim, uma rede de estados de eco (ESN) é utilizada no problema, com dois algoritmos de treinamento: a proposta de Ozturk et al. e a de Consolaro. Os resultados experimentais mostram a aplicabilidade das ferramentas bioinspiradas e, em muitos casos, a relevância do laço de realimentação. No caso não linear, não foi possível obter resultados expressivos para a MLP, enquanto as ESNs mostraram alguns resultados promissores. / Abstract: The electric power system is one of the pillars of a nation's technological and industrial development. It is therefore necessary to manage efficiently all the resources needed to produce electrical energy. Water resources are essential because Brazilian generation capacity is predominantly hydroelectric. In this context, forecasting the streamflow series of hydroelectric plants has become a highly relevant field of research for planning energy generation in Brazil. The models used by the electric sector are the so-called Box & Jenkins models, which require pre-processing of the input data because of the seasonality of streamflow throughout the year. This work uses a range of forecasting models and compares their performance on the problem of forecasting monthly average streamflow series, over distinct periods, for the Furnas hydroelectric plant. 
Among the linear models, we propose using a statistical model, the autoregressive moving average (ARMA) model, with its coefficients computed by bio-inspired algorithms: a genetic algorithm and two immune-inspired algorithms, one based on small changes to CLONALG and the other being opt-aiNet. Next, a recurrent linear filter with infinite impulse response (IIR) has its coefficients computed by the same optimization algorithms. For the nonlinear methods, multilayer perceptron (MLP) artificial neural networks are applied, trained with the modified scaled conjugate gradient algorithm. Finally, an echo state network (ESN) is applied to the problem with two training algorithms: the proposal of Ozturk et al. and that of Consolaro. The experimental results show the applicability of the bio-inspired tools and, in many cases, the importance of the feedback loop. In the nonlinear case, it was not possible to obtain significant results with the MLP, while the ESNs showed some promising results. / Mestrado / Automação / Mestre em Engenharia Elétrica Redes neurais (Computação) Computação evolutiva Previsão hidrologica Time-series analysis - Data processing Neural networks (Computer science) Evolutionary computation Forecasting hydrological
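The idea in this abstract of letting a bio-inspired search choose the coefficients of a linear model can be illustrated with a minimal sketch. It fits a single AR(1) coefficient with a toy genetic algorithm (truncation selection, averaging crossover, Gaussian mutation). AR(1) stands in for the full ARMA model, and every name, the population size, the mutation scale and the synthetic series are illustrative assumptions rather than the configuration used in the thesis.

```python
import random


def one_step_mse(a, series):
    """Mean squared one-step-ahead error of the AR(1) predictor y[t] ~ a * y[t-1]."""
    errors = [(series[t] - a * series[t - 1]) ** 2 for t in range(1, len(series))]
    return sum(errors) / len(errors)


def fit_ar1_with_ga(series, pop_size=20, generations=60, sigma=0.05, seed=0):
    """Toy genetic algorithm (hypothetical settings) searching for the AR(1) coefficient."""
    rng = random.Random(seed)
    # Initial population: candidate coefficients drawn uniformly from [-1, 1].
    population = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half, which also preserves the best candidate.
        population.sort(key=lambda a: one_step_mse(a, series))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            # Crossover by averaging two parents, then Gaussian mutation.
            children.append((p + q) / 2 + rng.gauss(0.0, sigma))
        population = parents + children
    return min(population, key=lambda a: one_step_mse(a, series))


# Synthetic noiseless AR(1) series with true coefficient 0.7 (assumed test data).
series = [0.7 ** t for t in range(20)]
best = fit_ar1_with_ga(series)
print(f"estimated coefficient: {best:.3f}")
```

On real streamflow data the fitness function would be evaluated on a deseasonalized series, and each candidate would be a vector of ARMA or IIR coefficients rather than a single number.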
12	Métrica baseada em projeção de modelos para detecção de danos em estruturas / Metric based on models projection for damage detection in structures Genari, Helói Francisco Gentil, 1985- 20 August 2018 (has links) Orientador: Eurípedes Guilherme de Oliveira Nóbrega / Dissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Engenharia Mecânica / Made available in DSpace on 2018-08-20T08:19:57Z (GMT). No. of bitstreams: 1 Genari_HeloiFranciscoGentil_M.pdf: 1362304 bytes, checksum: c4c7788d0595b23a9e9f97011fa6ed8f (MD5) Previous issue date: 2012 / Resumo: Para cumprir requisitos de segurança, aumentar a vida útil de estruturas e reduzir os custos de manutenção, métodos de detecção de danos e de monitoramento da integridade de estruturas (SHM) têm recebido grande atenção da comunidade científica nas últimas décadas. Neste contexto, várias técnicas diferentes para detecção de danos foram propostas, mas algoritmos eficientes e práticos são ainda temas muito pesquisados. Neste trabalho, estudam-se a métrica cepstral e a métrica por subespaços para a detecção de danos. Essas métricas calculam a distância entre dois modelos autorregressivos (AR). A distância entre os modelos AR, derivados a partir das séries temporais dos sinais de vibrações das estruturas com e sem danos utilizando identificação por subespaços, deve ser correlacionada com a informação do dano, incluindo sua severidade e localização. Assim, as distâncias calculadas utilizando-se as métricas são consideradas indicadores de danos. Para validar os dois indicadores, dois experimentos foram realizados. O primeiro consistiu em três vigas similares de alumínio, uma íntegra e duas contendo falhas simuladas que, juntamente com duas massas de 2.5g e 8.5g, simularam quatro danos diferentes. No segundo experimento, foi utilizada uma placa de alumínio retangular e, com o auxílio de massas de 2.5g, 8.5g e 20g, foram simulados cinco danos com diferentes severidades e localizações. 
Os resultados dos experimentos indicaram que o cálculo das distâncias entre os modelos AR é eficiente para detecção, análise de severidade e localização de danos / Abstract: To satisfy safety requirements, extend the service life of structures and reduce maintenance costs, damage detection and structural health monitoring (SHM) methods have received great attention from the scientific community in recent decades. In this context, several different techniques for damage detection have been proposed, but efficient and practical algorithms remain an active research topic. This work studies the cepstral metric and the subspace metric for damage detection. These metrics compute the distance between two autoregressive (AR) models. The distance between the AR models, which are derived by subspace identification from time series of vibration signals of the structures with and without damage, should correlate with information about the damage, including its severity and location. The distances computed with these metrics are therefore treated as damage indicators. To validate both indicators, two experiments were performed. The first used three similar aluminum beams, one intact and two with simulated faults, which, together with two masses of 2.5 g and 8.5 g, simulated four different damage cases. The second used a rectangular aluminum plate and, with the aid of masses of 2.5 g, 8.5 g and 20 g, simulated five damage cases with different severities and locations. The experimental results indicate that computing the distances between AR models is effective for detecting damage and for analyzing its severity and location / Mestrado / Mecanica dos Sólidos e Projeto Mecanico / Mestre em Engenharia Mecânica Identificação de sistemas Localização de falhas (Engenharia) Análise estrutural (Engenharia) Time series analysis - Data processing Identification and systems Troubleshooting (Engineering) Structural analysis (Engineering)
13	Spectral factor model for time series learning Alexander Miranda, Abhilash 24 November 2011 (has links) Today's computerized processes generate massive amounts of streaming data. In many applications, data is collected for modeling the processes. The process model is expected to support objectives such as decision support, data visualization, business intelligence, automation and control, pattern recognition and classification, etc. However, we face significant challenges in data-driven modeling of processes. Apart from the errors, outliers and noise in the data measurements, the main challenge is the large dimensionality, that is, the number of variables each data sample measures. The samples often form a long temporal sequence called a multivariate time series, where any one sample is influenced by the others. We wish to build a model that will ensure robust generation, reviewing, and representation of new multivariate time series that are consistent with the underlying process. In this thesis, we adopt a modeling framework to extract characteristics from multivariate time series that correspond to dynamic variation-covariation common to the measured variables across all the samples. Those characteristics of a multivariate time series are named its 'commonalities' and a suitable measure for them is defined. What makes the multivariate time series model versatile is the assumption regarding the existence of a latent time series of known or presumed characteristics and much lower dimensionality than the measured time series; the result is the well-known 'dynamic factor model'. Original variants of existing methods for estimating the dynamic factor model are developed: the estimation is performed using the frequency-domain equivalent of the dynamic factor model, named the 'spectral factor model'. To estimate the spectral factor model, ideas are drawn from the asymptotic theory of spectral estimates. 
This theory is used to attain a probabilistic formulation, which provides maximum likelihood estimates for the spectral factor model parameters. Then, maximum likelihood parameters are developed with the analysis carried out entirely in the spectral domain, such that the dynamically transformed latent time series inherits the commonalities maximally. The main contribution of this thesis is a learning framework using the spectral factor model. We define learning as the ability of a computational model of a process to robustly characterize the data the process generates for purposes of pattern matching, classification and prediction. Hence, the spectral factor model can be said to have learned a multivariate time series if the latent time series, when dynamically transformed, extracts the commonalities reliably and maximally. The spectral factor model is used for two main multivariate time series learning applications. First, real-world streaming datasets obtained from various processes are classified; in this exercise, human brain magnetoencephalography signals obtained during various cognitive and physical tasks are classified. Second, the commonalities are put to the test by asking for reliable prediction of a multivariate time series given its past evolution; share prices in a portfolio are forecast as part of this challenge. For both spectral factor modeling and learning, an analytical solution as well as an iterative solution are developed. While the analytical solution is based on low-rank approximation of the spectral density function, the iterative solution is based on the expectation-maximization algorithm. For the human brain signal classification exercise, a strategy for comparing similarities between the commonalities for various classes of multivariate time series processes is developed. For the share price prediction problem, a vector autoregressive model whose parameters are enriched with the maximum likelihood commonalities is designed. 
In both these learning problems, the spectral factor model gives commendable performance with respect to competing approaches. Les processus informatisés actuels génèrent des quantités massives de flux de données. Dans nombre d'applications, ces flux de données sont collectées en vue de modéliser les processus. Les modèles de processus obtenus ont pour but la réalisation d'objectifs tels que l'aide à la décision, la visualisation de données, l'informatique décisionnelle, l'automatisation et le contrôle, la reconnaissance de formes et la classification, etc. La modélisation de processus sur la base de données implique cependant de faire face à d’importants défis. Outre les erreurs, les données aberrantes et le bruit, le principal défi provient de la large dimensionnalité, i.e. du nombre de variables dans chaque échantillon de données mesurées. Les échantillons forment souvent une longue séquence temporelle appelée série temporelle multivariée, où chaque échantillon est influencé par les autres. Notre objectif est de construire un modèle robuste qui garantisse la génération, la révision et la représentation de nouvelles séries temporelles multivariées cohérentes avec le processus sous-jacent. Dans cette thèse, nous adoptons un cadre de modélisation capable d’extraire, à partir de séries temporelles multivariées, des caractéristiques correspondant à des variations - covariations dynamiques communes aux variables mesurées dans tous les échantillons. Ces caractéristiques sont appelées «points communs» et une mesure qui leur est appropriée est définie. Ce qui rend le modèle de séries temporelles multivariées polyvalent est l'hypothèse relative à l'existence de séries temporelles latentes de caractéristiques connues ou présumées et de dimensionnalité beaucoup plus faible que les séries temporelles mesurées; le résultat est le bien connu «modèle factoriel dynamique». 
Des variantes originales de méthodes existantes pour estimer le modèle factoriel dynamique sont développées : l'estimation est réalisée en utilisant l'équivalent du modèle factoriel dynamique au niveau du domaine de fréquence, désigné comme le «modèle factoriel spectral». Pour estimer le modèle factoriel spectral, nous nous basons sur des idées relatives à la théorie des estimations spectrales. Cette théorie est utilisée pour aboutir à une formulation probabiliste, qui fournit des estimations de probabilité maximale pour les paramètres du modèle factoriel spectral. Des paramètres de probabilité maximale sont alors développés, en plaçant notre analyse entièrement dans le domaine spectral, de façon à ce que les séries temporelles latentes transformées dynamiquement héritent au maximum des points communs. La principale contribution de cette thèse consiste en un cadre d'apprentissage utilisant le modèle factoriel spectral. Nous désignons par apprentissage la capacité d'un modèle de processus à caractériser de façon robuste les données générées par le processus à des fins de filtrage par motif, classification et prédiction. Dans ce contexte, le modèle factoriel spectral est considéré comme ayant appris une série temporelle multivariée si la série temporelle latente, une fois dynamiquement transformée, permet d'extraire les points communs de façon fiable et maximale. 
Le modèle factoriel spectral sera utilisé principalement pour deux applications d'apprentissage de séries multivariées : en premier lieu, des ensembles de données sous forme de flux venant de différents processus du monde réel doivent être classifiés; lors de cet exercice, la classification porte sur des signaux magnétoencéphalographiques obtenus chez l'homme au cours de différentes tâches physiques et cognitives; en second lieu, les points communs obtenus sont testés en demandant une prédiction fiable d'une série temporelle multivariée étant donnée l'évolution passée; les prix d'un portefeuille d'actions sont prédits dans le cadre de ce défi. À la fois pour la modélisation et pour l'apprentissage factoriel spectral, une solution analytique aussi bien qu'une solution itérative sont développées. Tandis que la solution analytique est basée sur une approximation de rang inférieur de la fonction de densité spectrale, la solution itérative est basée, quant à elle, sur l'algorithme de maximisation des attentes. Pour l'exercice de classification des signaux magnétoencéphalographiques humains, une stratégie de comparaison des similitudes entre les points communs des différentes classes de processus de séries temporelles multivariées est développée. Pour le problème de prédiction des prix des actions, un modèle vectoriel autorégressif dont les paramètres sont enrichis avec les points communs de probabilité maximale est conçu. Dans ces deux problèmes d’apprentissage, le modèle factoriel spectral atteint des performances louables en regard d’approches concurrentes. / Doctorat en Sciences / info:eu-repo/semantics/nonPublished Informatique générale Sciences exactes et naturelles Time-series analysis -- Data processing Multivariate analysis -- Data processing Série chronologique -- Informatique Analyse multivariée -- Informatique Time Series Analysis Machine Learning Spectral Factor Model
14	Application of Distance Covariance to Time Series Modeling and Assessing Goodness-of-Fit Fernandes, Leon January 2024 (has links) The overarching goal of this thesis is to use distance covariance based methods to extend asymptotic results from the i.i.d. case to general time series settings. Accounting for dependence may make already difficult statistical inference all the more challenging. The distance covariance is an increasingly popular measure of dependence between random vectors that goes beyond linear dependence as described by correlation. It is defined by a squared integral norm of the difference between the joint characteristic function and the product of the marginal characteristic functions with respect to a specific weight function. Distance covariance has the advantage of being able to detect dependence even for uncorrelated data. The energy distance is a closely related quantity that measures distance between distributions of random vectors. These statistics can be used to establish asymptotic limit theory for stationary ergodic time series. The asymptotic results are driven by the limit theory for the empirical characteristic functions. In this thesis we apply the distance covariance to three problems in time series modeling: (i) Independent Component Analysis (ICA), (ii) multivariate time series clustering, and (iii) goodness-of-fit using residuals from a fitted model. The underlying statistical procedure for each topic uses the distance covariance function as a measure of dependence. The distance covariance arises in a different way in each of these topics: first as a measure of independence among the components of a vector, second as a measure of similarity of joint distributions, and third as a tool for assessing serial dependence among the fitted residuals. In each of these cases, limit theory is established for the corresponding empirical distance covariance statistics when the data come from a stationary ergodic time series. 
For Topic (i) we consider an ICA framework, which is a popular tool for blind source separation and has found application in fields such as financial time series, signal processing, feature extraction, and brain imaging. The Structural Vector Autoregression (SVAR) model is often the basic model used for modeling macro time series. The residuals in such a model are given by e_t = A S_t, the classical ICA model. In certain applications, one of the components of S_t has infinite variance. This differs from the standard ICA model. Furthermore, the e_t's are not observed directly but are only estimated from the SVAR modeling. Many ICA procedures require the existence of a finite second or even fourth moment. We derive consistency when using the distance covariance for measuring independence of residuals under the infinite variance case. Extensions to the ICA model with noise, which have a direct application to SVAR models when testing independence of residuals based on their estimated counterparts, are also considered. In Topic (ii) we propose a novel methodology for clustering multivariate time series data using energy distance. Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure separation between the finite dimensional distributions for the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is then applied to obtain the dendrogram. This procedure is completely nonparametric as the dissimilarities between stationary distributions are directly calculated without making any model assumptions. In order to justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series. Topic (iii) considers the fundamental and often final step in time series modeling: assessing the quality of fit of a proposed model to the data. 
Since the underlying distribution of the innovations that generate a model is often not prescribed, goodness-of-fit tests typically take the form of testing the fitted residuals for serial independence. However, these fitted residuals are inherently dependent since they are based on the same parameter estimates, and thus standard tests of serial independence, such as those based on the autocorrelation function (ACF) or auto-distance correlation function (ADCF) of the fitted residuals, need to be adjusted. We apply sample splitting in the time series setting to perform tests of serial dependence of fitted residuals using the sample ACF and ADCF. Here the first f_n of the n data points in the time series are used to estimate the parameters of the model. Tests for serial independence are then based on all n residuals. With f_n = n/2, the ACF and ADCF tests of serial independence often have the same limit distributions as though the underlying residuals were indeed i.i.d. That is, if the first half of the data is used to estimate the parameters and the estimated residuals are computed for the entire data set based on these parameter estimates, then the ACF and ADCF can have the same limit distributions as though the residuals were i.i.d. This procedure removes the need for adjustment in the construction of confidence bounds for both the ACF and ADCF, based on the fitted residuals, in goodness-of-fit testing. We also show that if f_n < n/2 then the asymptotic distributions of the tests stochastically dominate the corresponding asymptotic distributions for the true i.i.d. noise; the stochastic order is reversed when f_n > n/2. Statistics Time-series analysis--Data processing Stochastic analysis Goodness-of-fit tests
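The claim that distance covariance detects dependence in uncorrelated data can be checked with a small sketch. It computes the standard sample (V-statistic) squared distance covariance for scalar observations from double-centered pairwise distance matrices. The function names and the toy data are illustrative assumptions, and this is the i.i.d.-style estimator rather than any of the time series procedures developed in the thesis.

```python
def centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    # The distance matrix is symmetric, so row means equal column means.
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Sample squared distance covariance: average of entrywise products A_ij * B_ij."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = centered_distances(x)
    b = centered_distances(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


# y is a deterministic function of x, yet their linear covariance is zero.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
linear_cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / len(x)
dcov_sq = distance_covariance_sq(x, y)
print(linear_cov, dcov_sq)
```

The linear covariance vanishes because x is symmetric about zero, while the squared distance covariance is strictly positive, which is the behavior the abstract describes.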
15	Machine learning strategies for multi-step-ahead time series forecasting Ben Taieb, Souhaib 08 October 2014 (has links) How much electricity is going to be consumed for the next 24 hours? What will be the temperature for the next three days? What will be the number of sales of a certain product for the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series. Historically, time series forecasting has been mainly studied in econometrics and statistics. In the last two decades, machine learning, a field that is concerned with the development of algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in many different applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning. The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms. Multi-step-ahead forecasts can be produced recursively by iterating a one-step-ahead model, or directly using a specific model for each horizon. As a first contribution, we conduct an in-depth study to compare recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into the bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths. 
The results and observations made in this study then guide the development of new forecasting strategies. In particular, we find that choosing between recursive and direct forecasts is not an easy task since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, which we call the rectify strategy, that estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need only small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms that use so-called weak learners. Generating multi-step forecasts using a different model at each horizon provides great modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can increase the forecast variance. The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon. 
In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest. We compare all the proposed strategies with both the recursive and direct strategies. We first carry out a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results demonstrate that it performs better than, or at least close to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts. Lastly, as a fourth contribution, we participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. 
Our team, TinTin, ranked fifth out of 105 participating teams, and we were awarded an IEEE Power & Energy Society award. / Doctorat en sciences, Spécialisation Informatique / info:eu-repo/semantics/nonPublished Informatique générale Sciences exactes et naturelles Machine learning Time-series analysis -- Data processing Apprentissage automatique Série chronologique -- Informatique forecasting competitions load forecasting nearest neighbors neural networks gradient boosting direct forecasts forecasting strategies recursive forecasts machine learning time series forecasting
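The distinction between recursive and direct multi-step forecasts described in this abstract can be sketched in a few lines. The example assumes the simplest possible learner, a one-lag linear model without intercept fitted by least squares; the function names and the toy series are hypothetical, and the thesis itself studies far richer learning algorithms.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of y ~ b * x with no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def recursive_forecast(series, horizon):
    """Fit one one-step-ahead model and iterate it, feeding forecasts back in."""
    a = slope_through_origin(series[:-1], series[1:])
    forecasts, last = [], series[-1]
    for _ in range(horizon):
        last = a * last
        forecasts.append(last)
    return forecasts


def direct_forecast(series, horizon):
    """Fit a separate model for each horizon h, mapping y[t] straight to y[t+h]."""
    forecasts = []
    for h in range(1, horizon + 1):
        b_h = slope_through_origin(series[:-h], series[h:])
        forecasts.append(b_h * series[-1])
    return forecasts


# Noiseless geometric decay (assumed toy data): both strategies should agree here.
series = [64.0, 32.0, 16.0, 8.0, 4.0, 2.0, 1.0]
recursive = recursive_forecast(series, 3)
direct = direct_forecast(series, 3)
print(recursive, direct)
```

On this noiseless series the two strategies coincide. With noisy or nonlinear data they generally differ: the recursive forecast accumulates the one-step model's errors, while each direct model is estimated from fewer usable pairs, which is the bias and variance trade-off the abstract analyzes.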
16	Exploring advanced forecasting methods with applications in aviation Riba, Evans Mogolo 02 1900 (has links) Abstracts in English, Afrikaans and Northern Sotho / More time series forecasting methods were researched and made available in recent years. This is mainly due to the emergence of machine learning methods which also found applicability in time series forecasting. The emergence of a variety of methods and their variants presents a challenge when choosing appropriate forecasting methods. This study explored the performance of four advanced forecasting methods: autoregressive integrated moving averages (ARIMA); artificial neural networks (ANN); support vector machines (SVM) and regression models with ARIMA errors. To improve their performance, bagging was also applied. The performance of the different methods was illustrated using South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). The dissertation discussed the different forecasting methods at length. Characteristics such as strengths and weaknesses and the applicability of the methods were explored. Some of the most popular forecast accuracy measures were discussed in order to understand how they could be used in the performance evaluation of the methods. It was found that the regression model with ARIMA errors outperformed all the other methods, followed by the ARIMA model. These findings are in line with the general findings in the literature. The ANN method is prone to overfitting and this was evident from the results of the training and the test data sets. The bagged models showed mixed results with marginal improvement on some of the methods for some performance measures. It could be concluded that the traditional statistical forecasting methods (ARIMA and the regression model with ARIMA errors) performed better than the machine learning methods (ANN and SVM) on this data set, based on the measures of accuracy used. 
This calls for more research regarding the applicability of the machine learning methods to time series forecasting which will assist in understanding and improving their performance against the traditional statistical methods / Die afgelope tyd is verskeie tydreeksvooruitskattingsmetodes ondersoek as gevolg van die ontwikkeling van masjienleermetodes met toepassings in die vooruitskatting van tydreekse. Die nuwe metodes en hulle variante laat ŉ groot keuse tussen vooruitskattingsmetodes. Hierdie studie ondersoek die werkverrigting van vier gevorderde vooruitskattingsmetodes: outoregressiewe, geïntegreerde bewegende gemiddeldes (ARIMA), kunsmatige neurale netwerke (ANN), steunvektormasjiene (SVM) en regressiemodelle met ARIMA-foute. Skoenlussaamvoeging is gebruik om die prestasie van die metodes te verbeter. Die prestasie van die vier metodes is vergelyk deur hulle toe te pas op Suid-Afrikaanse lugpassasiersdata wat deur die Suid-Afrikaanse Lughawensmaatskappy (ACSA) vir beplanning ingesamel is. Hierdie verhandeling beskryf die verskillende vooruitskattingsmetodes omvattend. Sowel die positiewe as die negatiewe eienskappe en die toepasbaarheid van die metodes is uitgelig. Bekende prestasiemaatstawwe is ondersoek om die prestasie van die metodes te evalueer. Die regressiemodel met ARIMA-foute en die ARIMA-model het die beste van die vier metodes gevaar. Hierdie bevinding strook met dié in die literatuur. Dat die ANN-metode na oormatige passing neig, is deur die resultate van die opleidings- en toetsdatastelle bevestig. Die skoenlussamevoegingsmodelle het gemengde resultate opgelewer en in sommige prestasiemaatstawwe vir party metodes marginaal verbeter. Op grond van die waardes van die prestasiemaatstawwe wat in hierdie studie gebruik is, kan die gevolgtrekking gemaak word dat die tradisionele statistiese vooruitskattingsmetodes (ARIMA en regressie met ARIMA-foute) op die gekose datastel beter as die masjienleermetodes (ANN en SVM) presteer het. 
Dit dui op die behoefte aan verdere navorsing oor die toepaslikheid van tydreeksvooruitskatting met masjienleermetodes om hul prestasie vergeleke met dié van die tradisionele metodes te verbeter. / Go nyakišišitšwe ka ga mekgwa ye mentši ya go akanya ka ga molokoloko wa dinako le go dirwa gore e hwetšagale mo mengwageng ye e sa tšwago go feta. Se ke ka lebaka la go tšwelela ga mekgwa ya go ithuta ya go diriša metšhene yeo le yona e ilego ya dirišwa ka kakanyong ya molokolokong wa dinako. Go tšwelela ga mehutahuta ya mekgwa le go fapafapana ga yona go tšweletša tlhohlo ge go kgethwa mekgwa ya maleba ya go akanya. Dinyakišišo tše di lekodišišitše go šoma ga mekgwa ye mene ya go akanya yeo e gatetšego pele e lego: ditekanyotshepelo tšeo di kopantšwego tša poelomorago ya maitirišo (ARIMA); dinetweke tša maitirelo tša nyurale (ANN); metšhene ya bekthara ya thekgo (SVM); le mekgwa ya poelomorago yeo e nago le diphošo tša ARIMA. Go kaonafatša go šoma ga yona, nepagalo ya go ithuta ka metšhene le yona e dirišitšwe. Go šoma ga mekgwa ye e fepafapanego go laeditšwe ka go šomiša tshedimošo ya banamedi ba difofane ba Afrika Borwa yeo e kgobokeditšwego mabakeng a dipeakanyo ke Khamphani ya Maemafofane ya Afrika Borwa (ACSA). Sengwalwanyakišišo se ahlaahlile mekgwa ya kakanyo ye e fapafapanego ka bophara. Dipharologanyi tša go swana le maatla le bofokodi le go dirišega ga mekgwa di ile tša šomišwa. Magato a mangwe ao a tumilego kudu a kakanyo ye e nepagetšego a ile a ahlaahlwa ka nepo ya go kwešiša ka fao a ka šomišwago ka gona ka tshekatshekong ya go šoma ga mekgwa ye. Go hweditšwe gore mokgwa wa poelomorago wa go ba le diphošo tša ARIMA o phadile mekgwa ye mengwe ka moka, gwa latela mokgwa wa ARIMA. Dikutollo tše di sepelelana le dikutollo ka kakaretšo ka dingwaleng. Mokgwa wa ANN o ka fela o fetišiša gomme se se bonagetše go dipoelo tša tlhahlo le dihlopha tša teko ya tshedimošo. 
Mekgwa ya nepagalo ya go ithuta ka metšhene e bontšhitše dipoelo tšeo di hlakantšwego tšeo di nago le kaonafalo ye kgolo go ye mengwe mekgwa ya go ela go phethagatšwa ga mešomo. Go ka phethwa ka gore mekgwa ya setlwaedi ya go akanya dipalopalo (ARIMA le mokgwa wa poelomorago wa go ba le diphošo tša ARIMA) e šomile bokaone go phala mekgwa ya go ithuta ka metšhene (ANN le SVM) ka mo go sehlopha se sa tshedimošo, go eya ka magato a nepagalo ya magato ao a šomišitšwego. Se se nyaka gore go dirwe dinyakišišo tše dingwe mabapi le go dirišega ga mekgwa ya go ithuta ka metšhene mabapi le go akanya molokoloko wa dinako, e lego seo se tlago thuša go kwešiša le go kaonafatša go šoma ga yona kgahlanong le mekgwa ya setlwaedi ya dipalopalo. / Decision Sciences / M. Sc. (Operations Research) / Time series forecasting; Regression model with ARIMA errors; ARIMA; Artificial neural networks; ANN; Support vector machines; SVM; Bagging; Bootstrap aggregating; Air passengers / 387.70151955 / Time-series analysis -- Data processing
