Global ETD Search

31	Posouzení vybraných ukazatelů mezinárodní společnosti pomocí statistických metod / Assessment of Selected Indicators of an International Company Using Statistical Methods Zouharová, Daniela January 2021 (has links) The thesis deals with the assessment of the economic situation of a selected company using statistical methods. The theoretical part describes the financial indicators, time series and regression analysis. The analytical part contains the calculation of selected indicators, which are subjected to statistical analysis. Based on the statistical analysis, the future development of indicators in the next two years is predicted. The last part contains suggestions that can lead to the improvement of the current situation of the company.
32	Analýza obchodních dat využitím metod rozpoznání vzoru / Analysis of business data using methods of pattern recognition Prišť, Lukáš January 2015 (has links) This project explores basic methods of time series analysis and the decomposition of time series using the additive model. It describes the creation of Python classes for generating and decomposing time series. The project also guides the reader through the creation of a Matlab user interface that is used to generate time series and mark chosen parameters. I also go through the implementation of the time series decomposition functions previously created in Python. I chose seven parameters to track. I also chose general features to represent these parameters, as well as features selected specifically for each parameter. All time series generated by this user interface are then used to train a program that classifies them for semantic description. After training, the resulting model was used to predict the chosen parameters of previously unseen time series.
33	Using Twitter Attribute Information to Predict Stock Prices Karlemstrand, Roderick, Leckström, Ebba January 2021 (has links) Being able to predict stock prices might be the unspoken wish of stock investors. Although stock prices are complicated to predict, there are many theories about what affects their movements, including interest rates, news and social media. With the help of Machine Learning, complex patterns in data can be identified beyond the human intellect. In this thesis, a Machine Learning model for time series forecasting is created and tested to predict stock prices. The model is based on a neural network with several layers of Long Short-Term Memory (LSTM) and fully connected layers. It is trained with historical stock values, technical indicators and Twitter attribute information retrieved, extracted and calculated from posts on the social media platform Twitter. These attributes are sentiment score, favourites, followers, retweets and whether an account is verified. To collect data from Twitter, Twitter’s API is used. Sentiment analysis is conducted with Valence Aware Dictionary and sEntiment Reasoner (VADER). The results show that by adding more Twitter attributes, the Mean Squared Error (MSE) between the predicted prices and the actual prices improved by 3%. With technical analysis taken into account, MSE decreases from 0.1617 to 0.1437, which is an improvement of around 11%. The limitations of this study include the requirement that the selected stock be publicly listed on the stock market and popular on Twitter and among individual investors. In addition, the stock market has limited opening hours, whereas Twitter is constantly available, which may introduce noise into the model. / Att kunna förutspå aktiekurser kan sägas vara aktiespararnas outtalade önskan. Även om aktievärden är komplicerade att förutspå finns det många teorier om vad som påverkar dess rörelser, bland annat räntor, nyheter och sociala medier. 
Med hjälp av maskininlärning kan mönster i data identifieras bortom människans intellekt. I detta examensarbete skapas och testas en modell inom maskininlärning i syfte att beräkna framtida aktiepriser. Modellen baseras på ett neuralt nätverk med flera lager av LSTM och fullt kopplade lager. Den tränas med historiska aktievärden, tekniska indikatorer och Twitter-attributinformation. De är hämtad, extraherad och beräknad från inlägg på den sociala plattformen Twitter. Dessa attribut är sentiment-värde, antal favorit-markeringar, följare, retweets och om kontot är verifierat. För att samla in data från Twitter används Twitters API och sentimentanalys genomförs genom VADER. Resultatet visar att genom att lägga till fler Twitter attribut förbättrade MSE mellan de förutspådda värdena och de faktiska värdena med 3%. Genom att ta teknisk analys i beaktande minskar MSE från 0,1617 till 0,1437, vilket är en förbättring på 11%. Begränsningar i denna studie innefattar bland annat att den utvalda aktien ska vara publikt listad på börsen och populär på Twitter och bland småspararna. Dessutom skiljer sig aktiemarknadens öppettider från Twitter då den är ständigt tillgänglig. Detta kan då introducera brus i modellen. Stock price prediction Machine Learning Deep Learning Time series prediction Twitter Twitter attributes Computer and Information Sciences Data- och informationsvetenskap
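The "around 11%" figure in record 33 follows from the two MSE values quoted in the abstract. A minimal sketch of that arithmetic, assuming the improvement is measured relative to the baseline MSE; the function names `mse` and `relative_improvement` are illustrative and are not taken from the thesis:

```python
def mse(actual, predicted):
    """Mean Squared Error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline, improved):
    """Fractional reduction of an error metric relative to its baseline value."""
    return (baseline - improved) / baseline


# MSE values as reported in the abstract (without and with technical analysis).
gain = relative_improvement(0.1617, 0.1437)
print(f"{gain:.1%}")  # prints "11.1%"
```

A lower MSE is better, so a positive value here means the second model has the smaller error.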
34	Optimization-Based Solutions for Planning and Control / Optimization-based Solutions to Optimal Operation under Uncertainty and Disturbance Rejection Jalanko, Mahir January 2021 (has links) Industrial automation systems normally consist of four hierarchy levels: planning, scheduling, real-time optimization, and control. At the planning level, the goal is to compute an optimal production plan that minimizes the production cost while meeting process constraints. The planning model is typically formulated as a mixed-integer nonlinear program (MINLP), which is hard to solve to global optimality because of its nonconvexity and high dimensionality. Uncertainty in component qualities in gasoline blending, caused by measurement errors and variation in upstream processes, may lead to off-specification products that require re-blending. Uncertainty in product demands may lead to a suboptimal solution and a failure to capture potential profit because of a shortage in product supply. While incorporating process uncertainties is essential to reducing the production cost and increasing profitability, it comes with the disadvantage of increasing the complexity of the MINLP planning model. The key contribution at the planning level is the use of the inventory pinch decomposition method to account for uncertainty in component qualities and product demands, reducing the production cost and increasing the profitability of the gasoline blending application. At the control level, the goal is to ensure the desired operating conditions by meeting process setpoints, to ensure process safety, and to avoid process failures. Model predictive control (MPC) is an advanced control strategy that uses a dynamic model of the process to predict future process behavior over a time horizon. The effectiveness of MPC relies heavily on the availability of a reasonably accurate process model. 
The key contributions at the control level are: (1) an investigation of different system identification methods for developing a dynamic model of a high-purity distillation column, which is a highly nonlinear process; (2) the development of a novel hybrid-based MPC to improve the control of the column and achieve flooding-free control. / Dissertation / Doctor of Philosophy (PhD) / The operation of a chemical process involves many decisions, which are normally distributed into levels referred to as the process automation hierarchy. The process automation hierarchy levels are planning, scheduling, real-time optimization, and control. This thesis addresses two of these levels: planning and control. At the planning level, the objective is to ensure optimal utilization of raw materials and equipment to reduce production cost. At the control level, the objective is to meet and follow the process setpoints determined by the real-time optimization level. The main goals of the thesis are: (1) to develop an efficient algorithm for solving a large-scale planning problem that incorporates uncertainties in component qualities and product demands, in order to reduce the production cost and maximize profit for a gasoline blending application; (2) to develop a novel hybrid-based model predictive control that improves the control strategy of an industrial distillation column that faces flooding issues. Production planning under uncertainty system identification artificial neural network time series prediction Hybrid model Distillation column flooding control
35	Gene Network Inference and Expression Prediction Using Recurrent Neural Networks and Evolutionary Algorithms Chan, Heather Y. 10 December 2010 (has links) (PDF) We demonstrate the success of recurrent neural networks in gene network inference and expression prediction using a hybrid of particle swarm optimization and differential evolution to overcome the classic obstacle of local minima in training recurrent neural networks. We also provide an improved validation framework for the evaluation of genetic network modeling systems that will result in better generalization and long-term prediction capability. Success in the modeling of gene regulation and prediction of gene expression will lead to more rapid discovery and development of therapeutic medicine, earlier diagnosis and treatment of adverse conditions, and vast advancements in life science research. genetic network modeling gene network inference gene expression prediction recurrent networks evolutionary algorithms time series prediction Computer Sciences
36	Application of probabilistic deep learning models to simulate thermal power plant processes Raidoo, Renita Anand 18 April 2023 (has links) (PDF) Deep learning has gained traction in thermal engineering because of its applications to process simulation, the deeper insights it can provide, and its ability to circumvent the shortcomings of classic thermodynamic simulation approaches by capturing complex inter-dependencies. This work sets out to apply probabilistic deep learning to power plant operations using historic plant data. The first case study entails the development of a steady-state mixture density network (MDN) capable of predicting effective heat transfer coefficients (HTC) for the various heat exchanger components inside a utility-scale boiler. Selected directly controllable input features, including the excess air ratio, steam temperatures, flow rates and pressures, are used to predict the HTCs. In the second case study, an encoder-decoder mixture density network (MDN) is developed using recurrent neural networks (RNN) for the prediction of utility-scale air-cooled condenser (ACC) backpressure. The effects of ambient conditions and plant operating parameters, such as extraction flow rate, on ACC performance are investigated. In both case studies, hyperparameter searches are performed to determine the best-performing architectures for these models, and comparisons are drawn between the MDN models and standard model architectures. The HTC predictor model achieved 90% accuracy, which equates to an average error of 4.89 W/(m²·K) across all heat exchangers. The resulting time-series ACC model achieved an average error of 3.14 kPa, which translates into a model accuracy of 82%. Air-cooled condensers Deep learning Mixture density networks Recurrent neural networks
37	Blood Glucose Level Prediction via Seamless Incorporation of Raw Features Using RNNs Mirshekarianbabaki, Sadegh 03 July 2018 (has links) No description available. Computer Science Blood Glucose Level Prediction Time Series Prediction LSTM RNN Neural Networks Physiological Models SVR Diabetes
38	Predição de séries temporais por similaridade / Similarity-based time series prediction Parmezan, Antonio Rafael Sabino 07 April 2016 (has links) Um dos maiores desafios em Mineração de Dados é a integração da informação temporal ao seu processo. Esse fato tem desafiado profissionais de diferentes domínios de aplicação e recebido investimentos consideráveis da comunidade científica e empresarial. No contexto de predição de Séries Temporais, os investimentos se concentram no subsídio de pesquisas destinadas à adaptação dos métodos convencionais de Aprendizado de Máquina para a análise de dados na qual o tempo constitui um fator importante. À vista disso, neste trabalho é proposta uma nova extensão do algoritmo de Aprendizado de Máquina k-Nearest Neighbors (kNN) para predição de Séries Temporais, intitulado de kNN - Time Series Prediction with Invariances (kNN-TSPI ). O algoritmo concebido difere da versão convencional pela incorporação de três técnicas para obtenção de invariância à amplitude e deslocamento, invariância à complexidade e tratamento de casamentos triviais. Como demonstrado ao longo desta dissertação de mestrado, o uso simultâneo dessas técnicas proporciona ao kNN-TSPI uma melhor correspondência entre as subsequências de dados e a consulta de referência. Os resultados de uma das avaliações empíricas mais extensas, imparciais e compreensíveis já conduzidas no tema de predição de Séries Temporais evidenciaram, a partir do confronto de dez métodos de projeção, que o algoritmo kNN-TSPI, além de ser conveniente para a predição automática de dados a curto prazo, é competitivo com os métodos estatísticos estado-da-arte ARIMA e SARIMA. Por mais que o modelo SARIMA tenha atingido uma precisão relativamente superior a do método baseado em similaridade, o kNN-TSPI é consideravelmente mais simples de ajustar. 
A comparação objetiva e subjetiva entre algoritmos estatísticos e de Aprendizado de Máquina para a projeção de dados temporais vem a suprir uma importante lacuna na literatura, a qual foi identificada por meio de uma revisão sistemática seguida de uma meta-análise das publicações selecionadas. Os 95 conjuntos de dados empregados nos experimentos computacionais juntamente com todas as projeções analisadas em termos de Erro Quadrático Médio, coeficiente U de Theil e taxa de acerto Prediction Of Change In Direction encontram-se disponíveis no portal Web ICMC-USP Time Series Prediction Repository. A presente pesquisa abrange também contribuições e resultados significativos em relação às propriedades inerentes à predição baseada em similaridade, sobretudo do ponto de vista prático. Os protocolos experimentais delineados e as diversas conclusões obtidas poderão ser usados como referência para guiar o processo de escolha de modelos, configuração de parâmetros e aplicação dos algoritmos de Inteligência Artificial para predição de Séries Temporais. / One of the major challenges in Data Mining is integrating temporal information into its process. This difficulty has challenged professionals in several application fields and has been the object of considerable investment from the scientific and business communities. In the context of Time Series prediction, these investments consist mostly of grants for research aimed at adapting conventional Machine Learning methods to data analysis problems in which time is an important factor. We propose a novel modification of the k-Nearest Neighbors (kNN) learning algorithm for Time Series prediction, namely the kNN - Time Series Prediction with Invariances (kNN-TSPI). Our proposal differs from the literature by incorporating techniques for amplitude and offset invariance, complexity invariance, and treatment of trivial matches. 
These three modifications allow more meaningful matching between the reference queries and Time Series subsequences, as we discuss in more detail throughout this master's thesis. We have performed one of the most comprehensive empirical evaluations of Time Series prediction, in which we compared the proposed algorithm with ten methods commonly found in the literature. The results show that the kNN-TSPI is appropriate for automated short-term projection and is competitive with the state-of-the-art statistical methods ARIMA and SARIMA. Although in our experiments the SARIMA model reached a slightly higher precision than the similarity-based method, the kNN-TSPI is considerably simpler to adjust. The objective and subjective comparisons of statistical and Machine Learning algorithms for temporal data projection fill a major gap in the literature, which was identified through a systematic review followed by a meta-analysis of selected publications. The 95 data sets used in our computational experiments, as well as all the projections evaluated with respect to Mean Squared Error, Theil's U coefficient and the hit rate Prediction Of Change In Direction, are available online at the ICMC-USP Time Series Prediction Repository. This work also includes contributions and significant results with respect to the properties inherent to similarity-based prediction, especially from the practical point of view. The outlined experimental protocols and our discussion of their use can serve as a guideline for model selection, parameter setting, and the application of Artificial Intelligence algorithms to Time Series prediction. Aprendizado de máquina Data mining Machine learning Métodos baseados em similaridade Mineração de dados Predição de séries temporais Similarity-based methods Time series prediction
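The idea of similarity-based prediction with amplitude and offset invariance, as described in record 38, can be illustrated with a short sketch. This is not the kNN-TSPI algorithm from the thesis: it only z-normalizes each subsequence before comparison and omits complexity invariance and the treatment of trivial matches. The names `znorm` and `knn_forecast` and the parameters `window` and `k` are assumptions made for illustration.

```python
import math


def znorm(window):
    """Return the z-normalized window plus its mean and standard deviation."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        # Constant window: only the offset carries information.
        return [0.0] * len(window), mean, std
    return [(x - mean) / std for x in window], mean, std


def knn_forecast(series, window, k):
    """One-step-ahead forecast from the k subsequences most similar to the last window."""
    if k < 1 or window < 1 or len(series) <= window:
        raise ValueError("need k >= 1, window >= 1 and len(series) > window")
    query, q_mean, q_std = znorm(series[-window:])
    candidates = []
    # Every subsequence that has a known successor is a candidate neighbour.
    for start in range(len(series) - window):
        sub, mean, std = znorm(series[start:start + window])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, sub)))
        # Express the successor in the candidate's own normalized scale.
        nxt = (series[start + window] - mean) / std if std > 0 else 0.0
        candidates.append((dist, nxt))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    avg_next = sum(nxt for _, nxt in nearest) / len(nearest)
    # Map the averaged successor back to the scale and offset of the query.
    return avg_next * q_std + q_mean


# A linear ramp: every window has the same shape at a different offset.
print(knn_forecast(list(range(20)), window=4, k=3))
```

Because the successors are averaged in normalized space and then rescaled with the mean and standard deviation of the query, neighbours found at a different level or amplitude of the series can still contribute a sensible forecast.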
39	Algoritmo kNN para previsão de dados temporais: funções de previsão e critérios de seleção de vizinhos próximos aplicados a variáveis ambientais em limnologia / Time series prediction using a KNN-based algorithm prediction functions and nearest neighbor selection criteria applied to limnological data Ferrero, Carlos Andres 04 March 2009 (has links) A análise de dados contendo informações sequenciais é um problema de crescente interesse devido à grande quantidade de informação que é gerada, entre outros, em processos de monitoramento. As séries temporais são um dos tipos mais comuns de dados sequenciais e consistem em observações ao longo do tempo. O algoritmo k-Nearest Neighbor - Time Series Prediction kNN-TSP é um método de previsão de dados temporais. A principal vantagem do algoritmo é a sua simplicidade, e a sua aplicabilidade na análise de séries temporais não-lineares e na previsão de comportamentos sazonais. Entretanto, ainda que ele frequentemente encontre as melhores previsões para séries temporais parcialmente periódicas, várias questões relacionadas com a determinação de seus parâmetros continuam em aberto. Este trabalho, foca-se em dois desses parâmetros, relacionados com a seleção de vizinhos mais próximos e a função de previsão. Para isso, é proposta uma abordagem simples para selecionar vizinhos mais próximos que considera a similaridade e a distância temporal de modo a selecionar os padrões mais similares e mais recentes. Também é proposta uma função de previsão que tem a propriedade de manter bom desempenho na presença de padrões em níveis diferentes da série temporal. Esses parâmetros foram avaliados empiricamente utilizando várias séries temporais, inclusive caóticas, bem como séries temporais reais referentes a variáveis ambientais do reservatório de Itaipu, disponibilizadas pela Itaipu Binacional. 
Três variáveis limnológicas fortemente correlacionadas são consideradas nos experimentos de previsão: temperatura da água, temperatura do ar e oxigênio dissolvido. Uma análise de correlação é realizada para verificar se os dados previstos mantem a correlação das variáveis. Os resultados mostram que, o critério de seleção de vizinhos próximos e a função de previsão, propostos neste trabalho, são promissores / Treating data that contains sequential information is an important problem that arises during the data mining process. Time series constitute a popular class of sequential data, where records are indexed by time. The k-Nearest Neighbor - Time Series Prediction (kNN-TSP) method is an approximator for time series prediction problems. The main advantage of this approximator is its simplicity, and it is often used in nonlinear time series analysis and for the prediction of seasonal time series. Although kNN-TSP often finds the best fit when forecasting nearly periodic time series, open questions about how to determine its parameters remain. In this work, we focus on two of these parameters: the determination of the nearest neighbours and the prediction function. To this end, we propose a simple approach to select the nearest neighbours, where time is indirectly taken into account by the similarity measure, and a prediction function which is not disturbed by the presence of patterns at different levels of the time series. Both parameters were empirically evaluated on several artificial time series, including chaotic time series, as well as on real time series related to several environmental variables from the Itaipu reservoir, made available by Itaipu Binacional. Three of the most correlated limnological variables were considered in the experiments carried out on the real time series: water temperature, air temperature and dissolved oxygen. 
Correlation analyses were also carried out to verify whether the predicted values maintain correlations similar to those of the original variables. Results show that both proposals, the one related to the determination of the nearest neighbours as well as the one related to the prediction function, are promising. Aprendizado de máquina Dados ambientais Environmental data Funções de previsão Limnologia Limnology Machine learning Nearest neighbor selection Prediction functions Previsão de dados temporais Seleção de vizinhos próximos Time series prediction
40	Prédiction des séries temporelles larges / Prediction of large time series Hmamouche, Youssef 13 December 2018 (has links) De nos jours, les systèmes modernes sont censés stocker et traiter des séries temporelles massives. Comme le nombre de variables observées augmente très rapidement, leur prédiction devient de plus en plus compliquée, et l’utilisation de toutes les variables pose des problèmes pour les modèles classiques. Les modèles de prédiction sans facteurs externes sont parmi les premiers modèles de prédiction. En vue d’améliorer la précision des prédictions, l’utilisation de multiples variables est devenue commune. Ainsi, les modèles qui tiennent en compte des facteurs externes, ou bien les modèles multivariés, apparaissent, et deviennent de plus en plus utilisés car ils prennent en compte plus d’informations. Avec l’augmentation des données liées entre eux, l’application des modèles multivariés devient aussi discutable. Le challenge dans cette situation est de trouver les facteurs les plus pertinents parmi l’ensemble des données disponibles par rapport à une variable cible. Dans cette thèse, nous étudions ce problème en présentant une analyse détaillée des approches proposées dans la littérature. Nous abordons le problème de réduction et de prédiction des données massives. Nous discutons également ces approches dans le contexte du Big Data. Ensuite, nous présentons une méthodologie complète pour la prédiction des séries temporelles larges. Nous étendons également cette méthodologie aux données très larges via le calcul distribué et le parallélisme avec une implémentation du processus de prédiction proposé dans l’environnement Hadoop/Spark. / Nowadays, storage and data processing systems are supposed to store and process large time series. 
As the number of observed variables increases very rapidly, their prediction becomes more and more complicated, and the use of all the variables poses problems for classical prediction models. Univariate prediction models are among the first models of prediction. To improve these models, the use of multiple variables has become common. Thus, models that take external factors into account, or multivariate models, have appeared and are increasingly used because they consider more information. As the volume of interrelated data grows, the application of multivariate models also becomes questionable, because using all available information does not necessarily lead to the best predictions. Therefore, the challenge in this situation is to find the most relevant factors among all available data relative to a target variable. In this thesis, we study this problem by presenting a detailed analysis of the approaches proposed in the literature. We address the problem of prediction and size reduction of massive data. We also discuss these approaches in the context of Big Data. The proposed approaches show promising and very competitive results compared to well-known algorithms, and lead to an improvement in the accuracy of the predictions on the data used. Then, we present our contributions, and propose a complete methodology for the prediction of wide time series. We also extend this methodology to big data via distributed computing and parallelism, with an implementation of the proposed prediction process in the Hadoop / Spark environment. Prédiction des séries temporelles Apprentissage automatique Réduction de dimension Sélection de variables Big Data Time Series Prediction Variable Selection Dimension Reduction Machine Learning 004
