  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Statistical modelling of Bitcoin volatility : Has the sanctions on Russia had any effect on Bitcoin? / En statistisk modellering av Bitcoins volatilitet : Har sanktionerna mot Ryssland haft någon effekt på Bitcoin?

Schönbeck, Mathilda, Salman, Fatima January 2022
This thesis fits and compares several time series models, namely the ARIMA model, conditional heteroscedastic models, and a dynamic regression model with ARIMA errors, on Bitcoin closing price data spanning five consecutive years. The purpose is to evaluate whether the sanctions on Russia had any effect on the cryptocurrency Bitcoin. After a very brief introduction to time series models and the nature of the error term, we describe the models that we want to compare. Early on, we detected autocorrelation and found that the time series was nonstationary. Because we are dealing with financial data, we found that the best alternative was to transform the data into logarithmic returns and then take the first difference. We then detected a very large outlier and replaced the extreme value with the mean of the two adjacent observations, as we suspected it would affect the forecast interval. The first-differenced log-returns were used in the ARIMA model, but they showed no autocorrelation, which indicates that returns on financial assets are uncorrelated across time and therefore unpredictable. The conditional heteroscedastic models, ARCH and GARCH, turned out to be best suited to our data, as an ARCH effect was present. We conclude that the GARCH(1,1) model with a Student's t-distribution had the best fit, with the lowest AIC and the highest log-likelihood. To study the effect of the sanctions on Bitcoin volatility, a dynamic regression model was used, allowing the error term to contain autocorrelation and including an independent dummy variable. The model showed that the Russian invasion of Ukraine did not, surprisingly, have any effect on the Bitcoin closing price.
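The preprocessing steps named in this abstract (log returns, first differencing, and replacing one outlier with the mean of its two neighbors) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the function names and the price values are made up, and the outlier is located here simply as the largest absolute interior value.

```python
import math

def log_returns(prices):
    """Return r_t = ln(P_t / P_{t-1}) for consecutive closing prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

def first_difference(series):
    """Return the differences between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]

def replace_outlier_with_neighbor_mean(series, index):
    """Return a copy in which series[index] is the mean of its two neighbors."""
    if not 0 < index < len(series) - 1:
        raise ValueError("the outlier needs a neighbor on both sides")
    cleaned = list(series)
    cleaned[index] = (series[index - 1] + series[index + 1]) / 2
    return cleaned

# Made-up closing prices, for illustration only.
prices = [100.0, 110.0, 99.0, 120.0, 118.0]
returns = log_returns(prices)
diffed = first_difference(returns)
# Assumption: treat the largest absolute interior value as "the" outlier.
worst = max(range(1, len(diffed) - 1), key=lambda i: abs(diffed[i]))
cleaned = replace_outlier_with_neighbor_mean(diffed, worst)
```

The cleaned series would then be passed to whatever ARIMA or GARCH fitting routine is in use; that step is deliberately left out of the sketch.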
12

The use of temporally aggregated data on detecting a structural change of a time series process

Lee, Bu Hyoung January 2016
A time series process can be influenced by an interruptive event that starts at a certain time point, so a structural break in either the mean or the variance may occur between the periods before and after the event. However, the traditional statistical tests for two independent samples, such as the t-test for a mean difference and the F-test for a variance difference, cannot be used directly to detect structural breaks, because it is almost impossible for two random samples to exist within one time series. As alternative methods, the likelihood ratio (LR) test for a mean change and the cumulative sum (CUSUM) of squares test for a variance change have been widely employed in the literature. Another point of interest is temporal aggregation in a time series. Most published time series data are temporally aggregated from the original observations of a small time unit to the cumulative records of a large time unit. However, temporal aggregation is known to have substantial effects on process properties, because it transforms a high-frequency nonaggregate process into a low-frequency aggregate process. In this research, we investigate the effects of temporal aggregation on the LR test and the CUSUM test through the ARIMA model transformation. First, we derive the proper transformation of ARIMA model orders and parameters when a time series is temporally aggregated. For the LR test for a mean change, the test statistic is associated with model parameters and errors. The parameters and errors in the statistic must change when an AR(p) process transforms, under mth-order temporal aggregation, into an ARMA(P,Q) process. Using this property, we propose a modified LR test for aggregated time series. Through Monte Carlo simulations and empirical examples, we show that aggregation shifts the null distribution of the modified LR test statistic to the left. Hence, the test power increases as the order of aggregation increases.
For the CUSUM test for a variance change, we show that two aggregation terms appear in the test statistic and have negative effects on the test results when an ARIMA(p,d,q) process transforms, under mth-order temporal aggregation, into an ARIMA(P,d,Q) process. We then propose a modified CUSUM test to control these terms, which are interpreted as the aggregation effects. In Monte Carlo simulations and empirical examples, the modified CUSUM test performs better and has higher power in detecting a variance change in an aggregated time series than the original CUSUM test. / Statistics
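To make the two ingredients of this abstract concrete, here is a minimal sketch of mth-order temporal aggregation and of the standard (unmodified) CUSUM of squares statistic, sqrt(T/2) * max_k |C_k/C_T - k/T|, where C_k is the cumulative sum of squared residuals. The modified tests proposed in the thesis are not reproduced; the function names and the toy residual series are assumptions for illustration.

```python
import math

def aggregate(series, m):
    """Non-overlapping m-th order aggregation: sum each full block of m values."""
    usable = len(series) - len(series) % m  # drop an incomplete trailing block
    return [sum(series[i:i + m]) for i in range(0, usable, m)]

def cusum_of_squares(residuals):
    """Return (statistic, k) for the CUSUM of squares test.

    statistic = sqrt(T/2) * max_k |C_k / C_T - k / T|, and k is the position
    at which the maximum is reached (a candidate variance change point).
    """
    T = len(residuals)
    total = sum(a * a for a in residuals)
    running = 0.0
    best_dev, best_k = 0.0, 0
    for k, a in enumerate(residuals, start=1):
        running += a * a
        deviation = abs(running / total - k / T)
        if deviation > best_dev:
            best_dev, best_k = deviation, k
    return math.sqrt(T / 2) * best_dev, best_k

# Toy residuals whose variance jumps after observation 20 (made-up values).
residuals = [1.0, -1.0] * 10 + [3.0, -3.0] * 10
statistic, change_point = cusum_of_squares(residuals)
```

In practice the residuals would come from a fitted ARIMA model, and the statistic would be compared with a critical value for the chosen significance level.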
13

Estimativa de crescimento em altura de Leucena [Leucaena leucocephala (Lam.) de Wit.] por meio do modelo ARIMA

SILVA, Janilson Alves da 31 March 2008
The main objective of this work is to use ARIMA models to fit estimates of height growth of leucaena (Leucaena leucocephala (Lam.) de Wit.) in the Agreste region of Pernambuco. The experiment was conducted at the Experimental Station of the Empresa Pernambucana de Pesquisa Agropecuária (IPA), in the municipality of Caruaru, PE. A total of 544 leucaena trees of the Hawaii variety (cv. K8) were used, divided into 24 treatments with 24 repetitions. The sources of variation studied were levels of phosphate fertilization, organic compost from urban waste, and rhizobium inoculation (NFB 466 and 473), each applied alone. Five years of measurements were considered, and the Chapman-Richards model was used to remove the trend from the series under study. After trend removal, the new series S(t) was modeled using ARIMA (1,1,0), (1,1,1), (1,1,2) and (1,1,3) models. However, the results were not superior to those of the traditional non-linear models often used to model forest growth.
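The detrending step mentioned in this abstract can be illustrated with the usual form of the Chapman-Richards growth curve, H(t) = A (1 - exp(-k t))^c. The sketch below subtracts that curve from observed heights to obtain a residual series; the parameter values, the function names, and the choice of subtraction as the detrending operation are assumptions, since the abstract does not state them.

```python
import math

def chapman_richards(t, asymptote, rate, shape):
    """Chapman-Richards growth curve: A * (1 - exp(-k * t)) ** c."""
    return asymptote * (1.0 - math.exp(-rate * t)) ** shape

def detrend(times, heights, asymptote, rate, shape):
    """Return observed height minus the fitted trend at each time point."""
    return [h - chapman_richards(t, asymptote, rate, shape)
            for t, h in zip(times, heights)]

# Hypothetical parameters and measurements, for illustration only.
times = [0.0, 1.0, 2.0, 3.0]
heights = [0.1, 1.9, 4.2, 5.8]
residual_series = detrend(times, heights, asymptote=10.0, rate=0.5, shape=2.0)
```

The residual series would then be the input to the ARIMA fitting step.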
14

Análise Comparativa da Aplicação de Modelos para Imputação do Volume Médio Diário de Séries Históricas de Volume de Tráfego / COMPARATIVE ANALYSIS OF THE APPLICATION OF MODELS FOR THE IMPUTATION OF AVERAGE DAILY VOLUME OF TRAFFIC VOLUME TIME SERIES

Antonia Fabiana Marques Almeida 29 September 2010
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Improving the road system, in both its infrastructure and its operation, requires studies and planning that seek the best use of existing resources. An important traffic measure used for this purpose is vehicle volume. Traffic data are collected either manually or electronically; however, both methods can fail and miss part of the data. With electronic counting equipment, continuous collection can form a time series, and periods without collection leave gaps in the database that can compromise studies based on this information. This work therefore analyzes methods used to estimate these missing values, seeking the most effective model for the Average Daily Volume variable in data obtained from the continuous counting stations installed on the state highways of Ceará. The estimation models applied in this work are ARIMA time series models and simple models. The simple models are less complex to apply and faster to process, while ARIMA demands more specialized knowledge from the professional who uses it. The most effective method was taken to be the one with the smallest errors after the models were applied. Four permanent counting stations were selected for these applications, according to their percentage of valid data and their location, with the aim of using stations at representative points of the state. The best model found was ARIMA (1,0,1)7 (with an average error of 1.816%); however, one of the simple models, MS2, produced results close to those of ARIMA (an average error of 1.837%) and can also be considered suitable for the imputation of missing values.
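As a rough illustration of what a simple imputation model for daily traffic volume could look like, the sketch below fills a missing day with the mean of the nearest observed values on the same weekday before and after it. This is not the thesis's MS2 model, whose definition the abstract does not give; the function name, the weekly period, and the sample data are assumptions.

```python
def impute_same_weekday(volumes, period=7):
    """Fill None entries using the nearest observed values one or more
    periods earlier and later (same weekday when period is 7)."""
    filled = list(volumes)
    for i, value in enumerate(volumes):
        if value is not None:
            continue
        neighbors = []
        for step in (-period, period):
            j = i + step
            # Skip over days that are themselves missing.
            while 0 <= j < len(volumes) and volumes[j] is None:
                j += step
            if 0 <= j < len(volumes):
                neighbors.append(volumes[j])
        if neighbors:  # leave the gap in place when nothing is available
            filled[i] = sum(neighbors) / len(neighbors)
    return filled

# Three weeks of made-up daily volumes with one missing day.
volumes = [float(day) for day in range(21)]
volumes[10] = None
filled = impute_same_weekday(volumes)
```

To compare such a rule with an ARIMA model, one would hide known values, impute them with each method, and compare the average percentage errors.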
15

台灣失業率的預測-季節性ARIMA與介入模式的比較 / Forecasting Taiwan’s Unemployment Rate –A Comparison Between Seasonal ARIMA and the Intervention Model

胡文傑 Unknown Date
This thesis adopts the ARIMA model, introduced by Box and Jenkins (1976), and the intervention model, developed by Box and Tiao (1975), to fit time series data on the unemployment rate in Taiwan and to compare the resulting forecasts. The results reveal a seasonal pattern in the unemployment rate: the figures are related not only from month to month but also from year to year. When forecasting the level of unemployment, we should therefore examine not only the neighboring months but also the corresponding months of the previous year, so a seasonal ARIMA model is used. Time series are also frequently affected by external events. For the unemployment rate, government policies as well as military threats influence the structure of the series. Forecasting with the intervention model lets us evaluate the effect of such events and obtain more accurate forecasts. Five interventions were included for the unemployment rate series: first, the lifting of Martial Law in February 1987; second, the Six-year National Development Plan launched in June 1991; third, the hiring of foreign labor in Taiwan, which took effect in October 1991; fourth, the threat of missile tests from the PRC in February 1996; and fifth, the ten new construction programs launched in November 2003.
The first four events were found to cause a structural change in the unemployment rate series at the time they occurred. The fifth, the ten new construction programs, did not significantly affect the structure of the series, possibly because the programs did not adequately address the underlying economic and social problems, or because they had not yet been completed and so could not achieve the expected effect. In comparing the two models, the choice may depend on goodness of fit, such as the residual mean square, AIC, or BIC. As the main purpose of this study is to forecast future values, model selection can instead be based on forecast errors. The comparison uses the statistics MPE, MSE, MAE and MAPE. The results indicate that the intervention model outperforms the seasonal ARIMA model.
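The four forecast-error statistics named in this abstract can be computed as in the sketch below. The definitions used are the common textbook ones, with the error taken as actual minus forecast and the percentage measures expressed in percent; the sign convention for MPE varies between sources, so that choice is an assumption, as are the function name and the sample numbers.

```python
def forecast_errors(actual, forecast):
    """Return MPE, MSE, MAE and MAPE for paired actual and forecast values.

    Errors are actual minus forecast; MPE and MAPE are in percent and
    require every actual value to be non-zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty series of equal length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    return {
        "MPE": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MSE": sum(e * e for e in errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }

# Made-up unemployment rates (percent) and forecasts from one model.
scores = forecast_errors(actual=[4.0, 2.0], forecast=[5.0, 1.0])
```

Running the same function on the forecasts of two competing models gives a like-for-like comparison; lower MSE, MAE and MAPE indicate the better forecaster, while MPE mainly reveals systematic bias.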
16

結構性改變ARIMA模式的建立與應用 / Structural Change ARIMA Modeling and Application

曾淑惠, Tseng, Shuhui Unknown Date
Non-linear time series analysis has developed rapidly in recent years, and one family of non-linear models that has drawn particular attention is the threshold model. Much of the literature has shown that even a simple threshold model can describe certain types of time series, such as those with structural change behavior, more faithfully than linear ARMA models. In this thesis, we discuss some problems concerning the threshold model and structural change analysis. For model building, instead of searching for a single change point, we introduce the concept of a change period. We then propose an efficient procedure for constructing structural change ARIMA models. Finally, we use the birth rate of Taiwan as an applied example and compare the forecasting performance of the structural change ARIMA model with that of alternative models, including the traditional threshold TAR model and traditional linear methods.
17

Modelagem matemática para consciência financeira e a bolsa de valores

Sampaio Júnior, Roberto Antônio de Oliveira January 2018
Advisor: Prof. Dr. André Ricardo Oliveira da Fonseca / Dissertation (master's) - Universidade Federal do ABC, Programa de Pós-Graduação em Matemática, Santo André, 2018. / The purpose of this work is to promote the study of financial mathematics with the aim of achieving a social impact, so that low-income students attain greater financial awareness during their schooling and while building their families. This study has a personal motivation and is also justified by students' lack of interest in Algebra, Logic, and Abstraction. Through financial models from mathematical modeling and computational tools, presented in the form of activities for high school, students are expected to become more aware of their financial freedom.
18

Elasticity in IaaS Cloud, Preserving Performance SLAs

Dhingra, Mohit January 2014
Infrastructure-as-a-Service (IaaS), one of the service models of cloud computing, provides resources in the form of Virtual Machines (VMs). Many applications hosted on the IaaS cloud have time-varying workloads. These kinds of applications benefit from the on-demand provisioning characteristic of cloud platforms. Applications with time-varying workloads demand time-varying resources, which requires elastic resource provisioning in IaaS so that their performance stays intact. In current IaaS cloud systems, VMs are static in nature, as their configurations do not change once they are instantiated. Therefore, fluctuation in resource demand is handled in two ways: allocating more VMs to the application (horizontal scaling) or migrating the application to another VM with a different configuration (vertical scaling). This forces customers to characterize their workloads at a coarse-grained level, which potentially leads to under-utilized VM resources or an under-performing application. Furthermore, the current IaaS architecture does not provide performance guarantees to applications, because of two major factors: 1) performance metrics of the application are not used by the IaaS in its resource allocation mechanisms, and 2) current resource allocation mechanisms do not consider virtualization overheads, which can significantly impact the application's performance, especially for I/O workloads. In this work, we develop an Elastic Resource Framework for IaaS, which provides a flexible resource provisioning mechanism and at the same time preserves the performance of applications as specified by the Service Level Agreement (SLA). To identify workloads that need elastic resource allocation, variability is defined as a metric and is associated with the definition of elasticity of a resource allocation system.
We introduce new components, a Forecasting Engine based on a Cost Model and a Resource Manager, in the OpenNebula IaaS cloud, which compute an optimal resource requirement for the next scheduling cycle based on prediction. The scheduler takes this as an input and enables fine-grained resource allocation by dynamically adjusting the size of the VM. Since the prediction may not always be entirely correct, forecast errors may lead to under-allocation or over-allocation of resources. The design of the cost model accounts for both over-allocation of resources and SLA violations caused by under-allocation of resources. Proper resource allocation also requires consideration of the virtualization overhead, which is not captured by current monitoring frameworks. We modify existing monitoring frameworks to monitor virtualization overhead and provide fine-grained monitoring information in the Virtual Machine Monitor (VMM) as well as in the VMs. In our approach, the performance of the application is preserved by 1) binding the application-level performance SLAs to resource allocation, and 2) accounting for virtualization overhead while allocating resources. The proposed framework is implemented using forecasting strategies such as the Seasonal Auto Regressive Integrated Moving Average model (Seasonal ARIMA) and the Gaussian Process model. However, the framework is generic enough to use any other forecasting strategy as well. It is applied to real workloads, namely web server and mail server workloads, obtained through the Supercomputer Education and Research Centre, Indian Institute of Science. The results show that a significant reduction in resource requirements can be obtained while preserving the performance of the application by restricting SLA violations. We further show that more intelligent scaling decisions can be taken using the monitoring information derived from the modified monitoring framework.
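One way to picture a cost model that weighs over-allocation against SLA violations from under-allocation is the asymmetric cost sketch below. It is only an illustration under assumptions, not the framework's actual cost model: the unit costs, the use of past forecast errors as demand scenarios, and all function names are hypothetical.

```python
def allocation_cost(allocated, demand, over_cost=1.0, sla_penalty=5.0):
    """Cost of one scheduling cycle: idle resources are charged at over_cost
    per unit, unmet demand (an SLA violation) at sla_penalty per unit."""
    if allocated >= demand:
        return over_cost * (allocated - demand)
    return sla_penalty * (demand - allocated)

def best_allocation(forecast, past_errors, candidates,
                    over_cost=1.0, sla_penalty=5.0):
    """Pick the candidate allocation with the lowest average cost, treating
    forecast + each past forecast error as an equally likely demand."""
    def expected_cost(candidate):
        total = sum(
            allocation_cost(candidate, forecast + error, over_cost, sla_penalty)
            for error in past_errors
        )
        return total / len(past_errors)
    return min(candidates, key=expected_cost)

# Hypothetical numbers: forecast of 10 resource units, errors of -2, 0 and +2.
choice = best_allocation(10.0, [-2.0, 0.0, 2.0], [8.0, 10.0, 12.0])
```

With a penalty for SLA violations five times the cost of idle resources, the sketch chooses an allocation above the point forecast, which is the kind of trade-off the abstract describes.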
19

Risk Management In Reservoir Operations In The Context Of Undefined Competitive Consumption

Salami, Yunus 01 January 2012
Dams and reservoirs with multiple purposes require effective management to fully realize their purposes and maximize efficiency. For instance, a reservoir intended mainly for flood control and hydropower generation may have primary objectives that conflict with each other. This is because higher hydraulic heads are required for hydropower generation, while relatively lower reservoir levels are required for flood control. Protracted imbalances between the two could increase the susceptibility of the system to risks of water shortage or flood, depending on inflow volumes and the effectiveness of operational policy. The magnitudes of these risks can become even more pronounced when upstream use of the river is unregulated and uncoordinated, so that upstream consumption and releases are arbitrary. As a result, safe operational practices and risk management alternatives must be built on an improved understanding of historical and anticipated inflows, actual and speculative upstream uses, and the overall hydrology of the catchments upstream of the reservoir. One such system, with an almost yearly occurrence of floods and shortages due to both natural and anthropogenic factors, is the dual reservoir system of Kainji and Jebba in Nigeria. To analyze and manage these risks, a methodology that combines stochastic and deterministic approaches was employed. Using methods outlined by Box and Jenkins (1976), autoregressive integrated moving average (ARIMA) models were developed for forecasting Niger River inflows at Kainji reservoir based on twenty-seven years of historical inflow data (1970-1996). These were then validated using seven years of inflow records (1997-2003). The model with the best correlation was a seasonal multiplicative ARIMA (2,1,1)x(2,1,2)12 model.
Supplementary validation of this model was done with discharge rating curves developed for the inlet of the reservoir using in situ inflows and satellite altimetry data. By comparing net inflow volumes with storage deficit, flood and shortage risk factors at the reservoir were determined based on (a) actual inflows, (b) forecasted inflows (up to 2015), and (c) simulated scenarios depicting undefined competitive upstream consumption. Calculated high-risk years matched actual flood years, again suggesting the reliability of the model. Monte Carlo simulations were then used to prescribe safe outflows and storage allocations in order to reduce future risk factors. The theoretical safety levels achieved indicated risk factors below threshold values and showed that this methodology is a powerful tool for estimating and managing flood and shortage risks in reservoirs with undefined competitive upstream consumption.
20

A Statistical Methodology for Classifying Time Series in the Context of Climatic Data

Ramírez Buelvas, Sandra Milena 24 February 2022
[ES] De acuerdo con las regulaciones europeas y muchos estudios científicos, es necesario monitorear y analizar las condiciones microclimáticas en museos o edificios, para preservar las obras de arte en ellos. Con el objetivo de ofrecer herramientas para el monitoreo de las condiciones climáticas en este tipo de edificios, en esta tesis doctoral se propone una nueva metodología estadística para clasificar series temporales de parámetros climáticos como la temperatura y humedad relativa. La metodología consiste en aplicar un método de clasificación usando variables que se computan a partir de las series de tiempos. Los dos primeros métodos de clasificación son versiones conocidas de métodos sparse PLS que no se habían aplicado a datos correlacionados en el tiempo. El tercer método es una nueva propuesta que usa dos algoritmos conocidos. Los métodos de clasificación se basan en diferentes versiones de un método sparse de análisis discriminante de mínimos cuadra- dos parciales PLS (sPLS-DA, SPLSDA y sPLS) y análisis discriminante lineal (LDA). Las variables que los métodos de clasificación usan como input, corresponden a parámetros estimados a partir de distintos modelos, métodos y funciones del área de las series de tiempo, por ejemplo, modelo ARIMA estacional, modelo ARIMA- TGARCH estacional, método estacional Holt-Winters, función de densidad espectral, función de autocorrelación (ACF), función de autocorrelación parcial (PACF), rango móvil (MR), entre otras funciones. También fueron utilizadas algunas variables que se utilizan en el campo de la astronomía para clasificar estrellas. En los casos que a priori no hubo información de los clusters de las series de tiempos, las dos primeras componentes de un análisis de componentes principales (PCA) fueron utilizadas por el algoritmo k- means para identificar posibles clusters de las series de tiempo. Adicionalmente, los resultados del método sPLS-DA fueron comparados con los del algoritmo random forest. 
Tres bases de datos de series de tiempos de humedad relativa o de temperatura fueron analizadas. Los clusters de las series de tiempos se analizaron de acuerdo a diferentes zonas o diferentes niveles de alturas donde fueron instalados sensores para el monitoreo de las condiciones climáticas en los 3 edificios.El algoritmo random forest y las diferentes versiones del método sparse PLS fueron útiles para identificar las variables más importantes en la clasificación de las series de tiempos. Los resultados de sPLS-DA y random forest fueron muy similares cuando se usaron como variables de entrada las calculadas a partir del método Holt-Winters o a partir de funciones aplicadas a las series de tiempo. Aunque los resultados del método random forest fueron levemente mejores que los encontrados por sPLS-DA en cuanto a las tasas de error de clasificación, los resultados de sPLS- DA fueron más fáciles de interpretar. Cuando las diferentes versiones del método sparse PLS utilizaron variables resultantes del método Holt-Winters, los clusters de las series de tiempo fueron mejor discriminados. Entre las diferentes versiones del método sparse PLS, la versión sPLS con LDA obtuvo la mejor discriminación de las series de tiempo, con un menor valor de la tasa de error de clasificación, y utilizando el menor o segundo menor número de variables.En esta tesis doctoral se propone usar una versión sparse de PLS (sPLS-DA, o sPLS con LDA) con variables calculadas a partir de series de tiempo para la clasificación de éstas. Al aplicar la metodología a las distintas bases de datos estudiadas, se encontraron modelos parsimoniosos, con pocas variables, y se obtuvo una discriminación satisfactoria de los diferentes clusters de las series de tiempo con fácil interpretación. 
La metodología propuesta puede ser útil para caracterizar las distintas zonas o alturas en museos o edificios históricos de acuerdo con sus condiciones climáticas, con el objetivo de prevenir problemas de conservación con las obras de arte. / [CA] D'acord amb les regulacions europees i molts estudis científics, és necessari monitorar i analitzar les condiciones microclimàtiques en museus i en edificis similars, per a preservar les obres d'art que s'exposen en ells. Amb l'objectiu d'oferir eines per al monitoratge de les condicions climàtiques en aquesta mena d'edificis, en aquesta tesi es proposa una nova metodologia estadística per a classificar series temporals de paràmetres climàtics com la temperatura i humitat relativa.La metodologia consisteix a aplicar un mètode de classificació usant variables que es computen a partir de les sèries de temps. Els dos primers mètodes de classificació són versions conegudes de mètodes sparse PLS que no s'havien aplicat adades correlacionades en el temps. El tercer mètode és una nova proposta que usados algorismes coneguts. Els mètodes de classificació es basen en diferents versions d'un mètode sparse d'anàlisi discriminant de mínims quadrats parcials PLS (sPLS-DA, SPLSDA i sPLS) i anàlisi discriminant lineal (LDA). Les variables queels mètodes de classificació usen com a input, corresponen a paràmetres estimats a partir de diferents models, mètodes i funcions de l'àrea de les sèries de temps, per exemple, model ARIMA estacional, model ARIMA-TGARCH estacional, mètode estacional Holt-Winters, funció de densitat espectral, funció d'autocorrelació (ACF), funció d'autocorrelació parcial (PACF), rang mòbil (MR), entre altres funcions. També van ser utilitzades algunes variables que s'utilitzen en el camp de l'astronomia per a classificar estreles. 
En els casos que a priori no va haver-hi información dels clústers de les sèries de temps, les dues primeres components d'una anàlisi de components principals (PCA) van ser utilitzades per l'algorisme k-means per a identificar possibles clústers de les sèries de temps. Addicionalment, els resultats del mètode sPLS-DA van ser comparats amb els de l'algorisme random forest.Tres bases de dades de sèries de temps d'humitat relativa o de temperatura varen ser analitzades. Els clústers de les sèries de temps es van analitzar d'acord a diferents zones o diferents nivells d'altures on van ser instal·lats sensors per al monitoratge de les condicions climàtiques en els edificis.L'algorisme random forest i les diferents versions del mètode sparse PLS van ser útils per a identificar les variables més importants en la classificació de les series de temps. Els resultats de sPLS-DA i random forest van ser molt similars quan es van usar com a variables d'entrada les calculades a partir del mètode Holt-winters o a partir de funcions aplicades a les sèries de temps. Encara que els resultats del mètode random forest van ser lleument millors que els trobats per sPLS-DA quant a les taxes d'error de classificació, els resultats de sPLS-DA van ser més fàcils d'interpretar.Quan les diferents versions del mètode sparse PLS van utilitzar variables resultants del mètode Holt-Winters, els clústers de les sèries de temps van ser més ben discriminats. Entre les diferents versions del mètode sparse PLS, la versió sPLS amb LDA va obtindre la millor discriminació de les sèries de temps, amb un menor valor de la taxa d'error de classificació, i utilitzant el menor o segon menor nombre de variables.En aquesta tesi proposem usar una versió sparse de PLS (sPLS-DA, o sPLS amb LDA) amb variables calculades a partir de sèries de temps per a classificar series de temps. 
En aplicar la metodologia a les diferents bases de dades estudiades, es van trobar models parsimoniosos, amb poques variables, i varem obtindre una discriminació satisfactòria dels diferents clústers de les sèries de temps amb fácil interpretació. La metodologia proposada pot ser útil per a caracteritzar les diferents zones o altures en museus o edificis similars d'acord amb les seues condicions climàtiques, amb l'objectiu de previndre problemes amb les obres d'art. / [EN] According to different European Standards and several studies, it is necessary to monitor and analyze the microclimatic conditions in museums and similar buildings, with the goal of preserving artworks. With the aim of offering tools to monitor the climatic conditions, a new statistical methodology for classifying time series of different climatic parameters, such as relative humidity and temperature, is pro- posed in this dissertation.The methodology consists of applying a classification method using variables that are computed from time series. The two first classification methods are ver- sions of known sparse methods which have not been applied to time dependent data. The third method is a new proposal that uses two known algorithms. These classification methods are based on different versions of sparse partial least squares discriminant analysis PLS (sPLS-DA, SPLSDA, and sPLS) and Linear Discriminant Analysis (LDA). The variables that are computed from time series, correspond to parameter estimates from functions, methods, or models commonly found in the area of time series, e.g., seasonal ARIMA model, seasonal ARIMA-TGARCH model, seasonal Holt-Winters method, spectral density function, autocorrelation function (ACF), partial autocorrelation function (PACF), moving range (MR), among others functions. Also, some variables employed in the field of astronomy (for classifying stars) were proposed.The methodology proposed consists of two parts. 
First, different variables are computed by applying the methods, models, or functions mentioned above to the time series. Next, once the variables are calculated, they are used as input for a classification method such as sPLS-DA, SPLSDA, or sPLS with LDA (new proposal). When there was no information about the clusters of the different time series, the first two components from principal component analysis (PCA) were used as input for the k-means method to identify possible clusters of time series. In addition, results from the random forest algorithm were compared with results from sPLS-DA. This study analyzed three sets of time series of relative humidity or temperature, recorded in different buildings (Valencia's Cathedral, the archaeological site of L'Almoina, and the baroque church of Saint Thomas and Saint Philip Neri) in Valencia, Spain. The clusters of the time series were analyzed according to different zones or different levels of the sensor heights, for monitoring the climatic conditions in these buildings. The random forest algorithm and the different versions of sparse PLS helped identify the main variables for classifying the time series. The results from sPLS-DA and random forest were very similar for variables from the seasonal Holt-Winters method and from functions applied to the time series. The results from sPLS-DA were easier to interpret than results from random forest. When the different versions of sparse PLS used variables from the seasonal Holt-Winters method as input, the clusters of the time series were identified effectively. The variables from seasonal Holt-Winters helped to obtain the best, or the second best, results according to the classification error rate. 
Among the different versions of sparse PLS proposed, sPLS with LDA classified the time series using fewer variables and with the lowest classification error rate. We propose using a version of sparse PLS (sPLS-DA, or sPLS with LDA) with variables computed from time series for classifying time series. For the different data sets studied, the methodology helped to produce parsimonious models with few variables, and it achieved a satisfactory, easily interpreted discrimination of the different clusters of the time series. This methodology can be useful for characterizing and monitoring micro-climatic conditions in museums, or similar buildings, for preventing problems with artwork. / I gratefully acknowledge the financial support of Pontificia Universidad Javeriana Cali – PUJ and Instituto Colombiano de Crédito Educativo y Estudios Técnicos en el Exterior – ICETEX who awarded me the scholarships ’Convenio de Capacitación para Docentes O. J. 086/17’ and ’Programa Crédito Pasaporte a la Ciencia ID 3595089 foco-reto salud’ respectively. The scholarships were essential for obtaining the Ph.D. Also, I gratefully acknowledge the financial support of the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 814624. / Ramírez Buelvas, SM. (2022). A Statistical Methodology for Classifying Time Series in the Context of Climatic Data [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181123
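
As a rough illustration of the idea summarized in the abstract above (compute variables from each time series, then look for clusters when none are known a priori), the following is a minimal sketch and not the author's implementation. It assumes only the Python standard library; the function names (`acf`, `mean_moving_range`, `features`, `kmeans`, `synthetic_sensor`) and the synthetic "sensor" data are hypothetical. It feeds the raw variables directly to a plain k-means, whereas the thesis first applies PCA, and it omits the sPLS-DA classification step entirely.

```python
import math
import random


def acf(series, max_lag):
    """Sample autocorrelation function at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def mean_moving_range(series):
    """Average absolute difference between consecutive observations (MR)."""
    return sum(abs(b - a) for a, b in zip(series, series[1:])) / (len(series) - 1)


def features(series, max_lag=2):
    """Variables computed from one series: mean, mean MR, first ACF values."""
    return [sum(series) / len(series), mean_moving_range(series)] + acf(series, max_lag)


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on unstandardized variables (an illustrative choice)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Move each center to the mean of its members (keep it if it has none).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def synthetic_sensor(level, noise, seed, n=240):
    """Hypothetical hourly relative-humidity series with a 24-hour cycle."""
    rng = random.Random(seed)
    return [level + 3 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, noise) for t in range(n)]


# Two made-up zones: four humid, smooth series and four drier, noisier ones.
series_set = [synthetic_sensor(70, 0.5, s) for s in range(4)]
series_set += [synthetic_sensor(45, 2.0, s) for s in range(4, 8)]

feature_rows = [features(s) for s in series_set]
labels = kmeans(feature_rows, k=2)
print(labels)
```

In this sketch the level of each series dominates the distance because the variables are not standardized; a real analysis would probably rescale them, and would then pass the variables and known cluster labels to a sparse classifier to see which variables drive the discrimination.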
