This thesis establishes methods to quantify and explain uncertainty through high-order moments in time series data, along with first-principles-based improvements on the standard autoencoder and variational autoencoder. While the first-principles improvements on the standard variational autoencoder provide additional means of explainability, we ultimately look to non-variational methods for quantifying uncertainty under the autoencoder framework.
We utilize Shannon's differential entropy to accomplish the task of uncertainty quantification in a general nonlinear and non-Gaussian setting. Together with previously established connections between autoencoders and principal component analysis, we motivate the focus on differential entropy as a proper abstraction of principal component analysis to this more general framework, where nonlinear and non-Gaussian characteristics in the data are permitted.
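For reference, the standard definition of differential entropy and its Gaussian special case give one way to see the link to principal component analysis invoked here. This is a sketch only: the notation ($f$, $\Sigma$, $\lambda_i$, $d$) is assumed for illustration and is not taken from the thesis.

```latex
% Differential entropy of a continuous random vector X with density f
h(X) = -\int f(x) \, \log f(x) \, dx

% Special case: X ~ N(\mu, \Sigma) in R^d, where \lambda_1, ..., \lambda_d
% are the eigenvalues of \Sigma (the variances along the principal components)
h(X) = \tfrac{1}{2} \log\!\left( (2\pi e)^d \det \Sigma \right)
     = \tfrac{d}{2} \log(2\pi e) + \tfrac{1}{2} \sum_{i=1}^{d} \log \lambda_i
```

In the Gaussian case, entropy is determined entirely by the covariance eigenvalues, the same quantities that principal component analysis ranks. For non-Gaussian data, entropy depends on the full density, and the Gaussian value is only an upper bound among distributions with the same covariance.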
Furthermore, we establish explicit connections between high-order moments in the data and those in the latent space, which induce a natural latent space decomposition and, by extension, an explanation of the estimated uncertainty. The proposed methods are intended for use in economic and financial factor models in state space form, building on recent developments in the application of neural networks to factor models for financial and economic time series analysis. Finally, we demonstrate the efficacy of the proposed methods on high-frequency hourly foreign exchange rates, macroeconomic signals, and synthetically generated autoregressive data sets.

Master of Science

This thesis establishes methods to quantify and explain uncertainty in time series data, along with improvements on some latent variable neural networks called autoencoders and variational autoencoders. Autoencoders and variational autoencoders are called latent variable neural networks because they can estimate a representation of the data that has lower dimension than the original data. These neural network architectures have a fundamental connection to a classical latent variable method called principal component analysis, which performs a similar task of dimension reduction but under more restrictive assumptions than autoencoders and variational autoencoders. In contrast to principal component analysis, a common ailment of neural networks is the lack of explainability, which accounts for the colloquial term black-box models. While the improvements on the standard autoencoders and variational autoencoders help with the problem of explainability, we ultimately look to alternative probabilistic methods for quantifying uncertainty. To accomplish this task, we focus on Shannon's differential entropy, which is entropy applied to continuous domains such as time series data.
Entropy is intricately connected to the notion of uncertainty, since it depends on the amount of randomness in the data. Together with previously established connections between autoencoders and principal component analysis, we motivate the focus on differential entropy as a proper abstraction of principal component analysis to a general framework that does not require the restrictive assumptions of principal component analysis.
Furthermore, we establish explicit connections between high-order moments in the data and the estimated latent variables (i.e., the reduced-dimension representation of the data). Estimating high-order moments allows for a more accurate estimation of the true distribution of the data. By connecting the estimated high-order moments in the data to the latent variables, we obtain a natural decomposition of the uncertainty surrounding the latent variables, which allows for increased explainability of the proposed autoencoder. The methods introduced in this thesis are intended for use in a class of economic and financial models called factor models, which are frequently used in policy and investment analysis.
A factor model is another type of latent variable model which, in addition to estimating a reduced-dimension representation of the data, provides a means to forecast future observations. Finally, we demonstrate the efficacy of the proposed methods on high-frequency hourly foreign exchange rates, macroeconomic signals, and synthetically generated autoregressive data sets. The results support the superiority of the entropy-based autoencoder over the standard variational autoencoder in both capability and computational expense.
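To make the notion of high-order moments in autoregressive data concrete, the following minimal sketch simulates an AR(1) series and computes its standardized third and fourth moments (skewness and kurtosis). It is an illustration only, not the thesis's estimator: the function names, the AR coefficient, and the Student-t innovations are all assumptions chosen for the example.

```python
import numpy as np


def standardized_moment(x, order):
    """Sample standardized moment: mean(((x - mean) / std) ** order).

    order=3 gives skewness and order=4 gives kurtosis (3 for a Gaussian).
    Uses the population standard deviation (ddof=0).
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean((centered / centered.std()) ** order))


def simulate_ar1(n, phi, rng, heavy_tails=False):
    """Simulate x[t] = phi * x[t-1] + eps[t] with x[0] = 0 (hypothetical setup).

    Innovations are standard normal, or Student-t with 5 degrees of
    freedom when heavy_tails=True.
    """
    eps = rng.standard_t(df=5, size=n) if heavy_tails else rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for heavy in (False, True):
        series = simulate_ar1(20_000, phi=0.5, rng=rng, heavy_tails=heavy)
        print(
            "heavy_tails =", heavy,
            "skewness =", round(standardized_moment(series, 3), 3),
            "kurtosis =", round(standardized_moment(series, 4), 3),
        )
```

With Gaussian innovations the kurtosis should sit near 3, while heavy-tailed innovations push it higher. That kind of departure is invisible to a purely second-order method such as principal component analysis.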
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/111804 |
Date | 12 September 2022 |
Creators | Miller, Dawson Jon |
Contributors | Mathematics, Embree, Mark P., Habibnia, Ali, Hewett, Russell Joseph |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |