1 |
Tackling Non-Stationarity in Reinforcement Learning via Latent Representation : An application to Intraday Foreign Exchange Trading / Att hantera icke-stationaritet i förstärkningsinlärning genom latent representation : En tillämpning på intradagshandel med valuta på Forex-marknaden
Mundo, Adriano, January 2023
Reinforcement Learning has applications in various domains, but it typically assumes a stationary process; when this assumption does not hold, performance may be sub-optimal. Tackling non-stationarity is not trivial because it requires adaptation to changing environments and predictability under varying conditions, as dynamics and rewards may change over time. Meta Reinforcement Learning has been used to handle the non-stationary evolution of the environment when the potential source of noise in the system is known. Our research instead presents a novel method that manages such complexity by learning a suitable latent representation that captures patterns relevant for decision-making, improving the policy optimization procedure. We present a two-step framework that combines the unsupervised training of Deep Variational Auto-encoders, used to extract latent variables, with a state-of-the-art model-free, off-policy Batch Reinforcement Learning algorithm called Fitted Q-Iteration, without relying on any assumptions about the environment dynamics. This framework is named Latent-Variable Fitted Q-Iteration (LV-FQI). To validate its generalization and robustness in exploiting the temporal structure of time-series data and extracting near-optimal policies, we evaluated its performance in empirical experiments on synthetic data generated from classical financial models. We also tested it on Foreign Exchange trading scenarios with various degrees of non-stationarity and low signal-to-noise ratios. The results showed performance improvements compared to existing algorithms, indicating great promise for addressing the long-standing challenges of Continual Reinforcement Learning.
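The second step of the two-step framework described above can be illustrated with a minimal sketch of batch Fitted Q-Iteration run on latent features. Everything here is an assumption made for illustration: the latent vectors `z` are taken as already produced by some pre-trained encoder, the Q-function is one ridge-regularised linear model per action, and the names `fitted_q_iteration` and `greedy_action` are hypothetical. The abstract does not state which regressor or encoder architecture LV-FQI uses.

```python
import numpy as np

def fitted_q_iteration(z, a, r, z_next, n_actions, gamma=0.9, n_iters=50, ridge=1e-3):
    """Batch FQI with one ridge-regularised linear Q-model per action.

    z, z_next: (N, d) latent features (assumed to come from a pre-trained encoder).
    a: (N,) integer actions; r: (N,) rewards.
    Returns a weight matrix W of shape (n_actions, d + 1).
    """
    def phi(x):
        # Append a bias column to the latent features.
        return np.hstack([x, np.ones((x.shape[0], 1))])

    X, X_next = phi(z), phi(z_next)
    W = np.zeros((n_actions, X.shape[1]))
    for _ in range(n_iters):
        # Bellman targets computed from the current Q estimate.
        q_next = X_next @ W.T                      # (N, n_actions)
        y = r + gamma * q_next.max(axis=1)
        W_new = np.zeros_like(W)
        for k in range(n_actions):
            mask = a == k
            if not mask.any():
                continue                           # no data for this action
            Xk = X[mask]
            A = Xk.T @ Xk + ridge * np.eye(Xk.shape[1])
            W_new[k] = np.linalg.solve(A, Xk.T @ y[mask])
        W = W_new
    return W

def greedy_action(W, z):
    """Pick the action with the highest estimated Q-value for one latent vector."""
    x = np.append(z, 1.0)
    return int(np.argmax(W @ x))
```

Because the procedure only consumes a fixed batch of transitions `(z, a, r, z_next)`, it is off-policy and model-free in the sense used in the abstract; swapping the linear model for another regressor changes only the inner fitting step.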
|
2 |
Variational Autoencoder and Sensor Fusion for Robust Myoelectric Controls
Currier, Keith A, 01 January 2023
Myoelectric control schemes aim to use surface electromyography (EMG) signals, the electric potentials measured directly from skeletal muscles, to control wearable robots such as exoskeletons and prostheses. The main challenge of myoelectric control is to increase and preserve signal quality by minimizing the effect of confounding factors such as muscle fatigue or electrode shift. Current myoelectric control schemes are developed to work in ideal laboratory conditions, but there is a persistent need for them to be more robust and to work in real-world environments. Following the manifold hypothesis, complexity in the world can be reduced from a high-dimensional space to a lower-dimensional representation that explains how the higher-dimensional real world operates. On this premise, once the learned representation or manifold is discovered, biological actions and their relevant multimodal signals can be compressed into a form that remains pertinent whether they are performed in laboratory or non-laboratory settings. This thesis outlines a method that applies a contrastive variational autoencoder with an integrated classifier to multimodal sensor data to create a compressed latent space representation that can be used in future myoelectric control schemes.
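To make the idea of a compressed latent space concrete, the following sketch shows the two standard building blocks of a plain variational autoencoder objective: the reparameterization trick and the closed-form KL term against a standard normal prior. It is an illustration only; the contrastive term and the integrated classifier mentioned in the abstract are not modeled, and the function names and the `beta` weight are hypothetical.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), one value per sample."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def negative_elbo(x, x_recon, mu, log_var, beta=1.0):
    """Squared-error reconstruction term plus weighted KL, averaged over the batch.

    The squared error corresponds to a Gaussian likelihood up to scale
    and additive constants.
    """
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return np.mean(recon + beta * kl_to_standard_normal(mu, log_var))
```

In a multimodal setting, `x` would hold the concatenated or fused sensor features for each sample, and `mu` and `log_var` would be the encoder outputs that define the compressed representation.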
|
3 |
Langevinized Ensemble Kalman Filter for Large-Scale Dynamic Systems
Peiyi Zhang (11166777), 26 July 2021
The Ensemble Kalman filter (EnKF) has achieved great success in data assimilation in atmospheric and oceanic sciences, but its failure to converge to the right filtering distribution precludes its use for uncertainty quantification. Other existing methods, such as the particle filter or the sequential importance sampler, do not scale well with the dimension of the system and the sample size of the datasets. In this dissertation, we address these difficulties in a coherent way.

In the first part of the dissertation, we reformulate the EnKF under the framework of Langevin dynamics, which leads to a new particle filtering algorithm, the so-called Langevinized EnKF (LEnKF). The LEnKF algorithm inherits the forecast-analysis procedure from the EnKF and the use of mini-batch data from stochastic gradient Langevin-type algorithms, which makes it scalable with respect to both the dimension and the sample size. We prove that the LEnKF converges to the right filtering distribution in Wasserstein distance under the big data scenario in which the dynamic system consists of a large number of stages and has a large number of samples observed at each stage, and thus it can be used for uncertainty quantification. We reformulate the Bayesian inverse problem as a dynamic state estimation problem based on the techniques of subsampling and the Langevin diffusion process. We illustrate the performance of the LEnKF using a variety of examples, including the Lorenz-96 model, high-dimensional variable selection, Bayesian deep learning, and Long Short-Term Memory (LSTM) network learning with dynamic data.

In the second part of the dissertation, we focus on two extensions of the LEnKF algorithm. Like the EnKF, the LEnKF algorithm was developed for Gaussian dynamic systems containing no unknown parameters. We propose the so-called stochastic approximation LEnKF (SA-LEnKF) for simultaneously estimating the states and parameters of dynamic systems, where the parameters are estimated on the fly, based on the state variables simulated by the LEnKF, under the framework of stochastic approximation. Under mild conditions, we prove the consistency of the resulting parameter estimator and the ergodicity of the SA-LEnKF. For non-Gaussian dynamic systems, we extend the LEnKF algorithm (Extended LEnKF) by introducing a latent Gaussian measurement variable into the dynamic system. Both extensions inherit the scalability of the LEnKF algorithm with respect to the dimension and sample size. The numerical results indicate that they outperform other existing methods in both state and parameter estimation and uncertainty quantification.
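For readers unfamiliar with the forecast-analysis procedure that the LEnKF is said to inherit, here is a minimal sketch of the classical stochastic (perturbed-observation) EnKF analysis step for a linear observation model. This is the textbook EnKF update, not the Langevinized algorithm from the dissertation; the function name `enkf_analysis` and the linear operator `H` are assumptions made for the example.

```python
import numpy as np

def enkf_analysis(ensemble, y, H, R, rng):
    """One stochastic EnKF analysis step.

    ensemble: (m, n) forecast members; y: (p,) observation;
    H: (p, n) linear observation operator; R: (p, p) observation noise covariance.
    Returns the (m, n) analysis ensemble.
    """
    m = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (m - 1)          # sample forecast covariance (n, n)
    S = H @ P @ H.T + R                            # innovation covariance (p, p)
    K = np.linalg.solve(S, H @ P).T                # Kalman gain P H^T S^{-1}, shape (n, p)
    # Each member assimilates its own noisy copy of the observation.
    perturbed = y + rng.multivariate_normal(np.zeros(len(y)), R, size=m)
    innovations = perturbed - ensemble @ H.T       # (m, p)
    return ensemble + innovations @ K.T
```

The forecast step, which propagates each member through the system dynamics before this update, is omitted because it depends on the model being filtered.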
|