11

Feature Selection for Value Function Approximation

Taylor, Gavin January 2011 (has links)
The field of reinforcement learning concerns the question of automated action selection given past experiences. As an agent moves through the state space, it must recognize which state choices are best in terms of allowing it to reach its goal. This is quantified with value functions, which evaluate a state and return the sum of rewards the agent can expect to receive from that state. Given a good value function, the agent can choose the actions which maximize this sum of rewards. Value functions are often chosen from a linear space defined by a set of features; this method offers a concise structure, low computational effort, and resistance to overfitting. However, because the number of features is small, this method depends heavily on these few features being expressive and useful, making the selection of these features a core problem. This document discusses this selection.

Aside from a review of the field, contributions include a new understanding of the role approximate models play in value function approximation, leading to new methods for analyzing feature sets in an intuitive way, both using the linear and the related kernelized approximation architectures. Additionally, we present a new method for automatically choosing features during value function approximation which has a bounded approximation error and produces superior policies, even in extremely noisy domains. / Dissertation
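As a concrete illustration of the linear architecture this abstract describes, the sketch below fits V(s) ≈ φ(s)ᵀw by least squares on a toy chain of states; the features and the simulated returns are invented for illustration and are not the dissertation's actual method or domain.

```python
import numpy as np

# Toy least-squares fit of a linear value function V(s) ~ phi(s)^T w.
# The chain domain, features, and simulated returns are illustrative only.

n_states = 10
rng = np.random.default_rng(0)

def phi(s):
    # Two hand-picked features plus a bias: normalized position and its square.
    x = s / (n_states - 1)
    return np.array([1.0, x, x * x])

# Fake Monte Carlo returns standing in for observed sums of rewards.
states = rng.integers(0, n_states, size=200)
returns = states / (n_states - 1) + rng.normal(0.0, 0.1, size=200)

Phi = np.stack([phi(s) for s in states])           # design matrix
w, *_ = np.linalg.lstsq(Phi, returns, rcond=None)  # least-squares weights

print([round(float(phi(s) @ w), 3) for s in range(n_states)])
```

With only three weights to fit, the approximation cannot overfit the noisy returns, which is the concision/robustness trade-off the abstract attributes to linear architectures.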
12

A Study on Architecture, Algorithms, and Applications of Approximate Dynamic Programming Based Approach to Optimal Control

Lee, Jong Min 12 July 2004 (has links)
This thesis develops approximate dynamic programming (ADP) strategies suitable for process control problems, aimed at overcoming the limitations of MPC: the potentially exorbitant on-line computational requirement and the inability to consider the future interplay between uncertainty and estimation in the optimal control calculation. The suggested approach solves the DP only for the state points visited by closed-loop simulations with judiciously chosen control policies. This helps combat a well-known problem of traditional DP, the 'curse of dimensionality,' while allowing the user to derive an improved control policy from the initial ones. The critical issue of the suggested method is a proper choice and design of the function approximator. A local averager with a penalty term is proposed to guarantee a stably learned control policy as well as acceptable on-line performance. The thesis also demonstrates the versatility of the proposed ADP strategy on difficult process control problems. First, a stochastic adaptive control problem is presented, in which an ADP-based control policy shows an 'active' probing property that reduces uncertainties, leading to better control performance. The second example is a dual-mode controller, a supervisory scheme that actively prevents the progression of abnormal situations under a local controller at their onset. Finally, two ADP strategies for controlling nonlinear processes based on input-output data are suggested: a model-based and a model-free approach. Both have the advantage of conveniently incorporating knowledge of the identification data distribution into the control calculation, improving performance.
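The following toy sketch illustrates the kind of penalized local averager mentioned above, evaluated only at simulated state points; the Gaussian weighting and the penalty form are assumptions for illustration, not the thesis's actual design.

```python
import numpy as np

# Value estimates only at simulated state points, via a penalized local averager.
# Gaussian weights and the distance penalty are illustrative assumptions.

rng = np.random.default_rng(1)
visited = rng.uniform(0.0, 1.0, size=(50, 2))  # states seen in closed-loop runs
costs = visited.sum(axis=1)                    # stand-in cost-to-go samples

def local_averager(x, X, y, width=0.1, penalty=10.0):
    """Distance-weighted average of nearby samples; the penalty inflates the
    estimate far from data, steering the policy away from unexplored regions."""
    d = np.linalg.norm(X - x, axis=1)
    w = np.exp(-(d / width) ** 2)
    if w.sum() < 1e-12:            # query far from every visited state
        return float(y.max() + penalty)
    return float(w @ y / w.sum() + penalty * d.min())

print(local_averager(np.array([0.5, 0.5]), visited, costs))
```

Penalizing queries far from the data is one simple way to keep a learned policy inside the region where the simulations actually support the value estimates.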
13

Example Based Processing For Image And Video Synthesis

Haro, Antonio 25 November 2003 (has links)
The example based processing problem can be expressed as: "Given an example of an image or video before and after processing, apply similar processing to a new image or video." Our thesis is that there are some problems where a single general algorithm can be used to create a variety of outputs, solely by presenting examples of what is desired to the algorithm. This is valuable when the algorithm to produce the output is non-obvious, e.g. an algorithm to emulate an example painting's style. We limit our investigations to example based processing of images, video, and 3D models, as these data types are easy to acquire and experiment with. We represent this problem first as a texture-synthesis-influenced sampling problem, where the idea is to form feature vectors representative of the data and then sample them coherently to synthesize a plausible output for the new image or video. Grounding the problem in this manner is useful, as both problems involve learning the structure of training data under some assumptions in order to sample it properly. We then reduce the problem to a labeling problem, which allows example based processing to be performed in a more generalized and principled manner than earlier techniques, estimating what the output should be by approximating the optimal (and possibly unknown) solution.
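A minimal sketch of the feature-vector sampling view described above: for each pixel of a new image B, find the closest neighborhood in example A and copy the corresponding value from the processed example A'. The patch size, images, and toy "filter" are all invented for illustration.

```python
import numpy as np

# Toy example-based processing by nearest-neighbor patch matching.
# A -> Ap is the example pair; B is the new image to process.

rng = np.random.default_rng(2)
A = rng.uniform(size=(32, 32))      # example input
Ap = 1.0 - A                        # example "processed" output (toy filter)
B = rng.uniform(size=(32, 32))      # new image to process

def patch(img, i, j, r=1):
    # 3x3 neighborhood as a feature vector.
    return img[i - r:i + r + 1, j - r:j + r + 1].ravel()

coords = [(i, j) for i in range(1, 31) for j in range(1, 31)]
bank = np.stack([patch(A, i, j) for i, j in coords])   # feature vectors of A

Bp = np.zeros_like(B)
for i in range(1, 31):
    for j in range(1, 31):
        k = np.argmin(((bank - patch(B, i, j)) ** 2).sum(axis=1))
        Bp[i, j] = Ap[coords[k]]    # copy processed value of the best match
```

In this toy case the match recovers the inversion filter almost exactly; real example-based methods add coherence terms so neighboring output pixels come from coherent source regions.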
14

Gradient Temporal-Difference Learning Algorithms

Maei, Hamid Reza Unknown Date
No description available.
15

Aprendizado por reforço utilizando tile coding em cenários multiagente / Reinforcement learning using tile coding in multiagent scenarios

Waskow, Samuel Justo January 2010 (has links)
Researchers in artificial intelligence are currently seeking methods to solve reinforcement learning (RL) problems that demand large amounts of computational resources. RL is an efficient, widely used machine learning technique in single-agent problems; in multiagent scenarios, where the state and action spaces have high dimensionality, traditional reinforcement learning approaches are inadequate. As an alternative, there are state-space generalization techniques that extend learning capability through abstraction. The main focus of this work is therefore to apply existing reinforcement learning techniques with function approximation via tile coding to the following scenarios: predator-prey, urban vehicular traffic control, and coordination games. The experimental results show that the tile coding state representation outperforms the tabular one.
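For readers unfamiliar with tile coding, the sketch below shows the standard idea on a 2-D continuous state: several offset tilings each activate one tile, and a linear value is the sum of the active tiles' weights. The tiling counts and offsets here are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

# Minimal tile coding for a 2-D continuous state in [0, 1]^2.
# n_tilings offset grids each contribute one active binary feature.

n_tilings, tiles_per_dim = 4, 8

def tile_features(state, lo=0.0, hi=1.0):
    """Return the index of the active tile in each offset tiling."""
    active = []
    scaled = (np.asarray(state) - lo) / (hi - lo) * tiles_per_dim
    for t in range(n_tilings):
        offset = t / n_tilings                       # shift each tiling slightly
        idx = np.floor(scaled + offset).astype(int) % tiles_per_dim
        active.append(t * tiles_per_dim**2 + idx[0] * tiles_per_dim + idx[1])
    return active

# Linear value over binary tile features: sum the weights of the active tiles.
weights = np.zeros(n_tilings * tiles_per_dim**2)
v = sum(weights[i] for i in tile_features([0.3, 0.7]))
```

Because each update touches only n_tilings weights, learning is cheap, and nearby states share tiles, which is exactly the generalization the abstract credits over a tabular representation.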
17

Análise do efeito do jitter de fase na operação de malhas de sincronismo de fase. / Analysis of phase-jitter effect in the operation of phase-locked loops.

Elisa Yoshiko Takada 12 April 2006 (has links)
Phase jitter, or timing jitter, is a phenomenon inherent to electrical systems. The growing interest in jitter is due to the degradation it causes in high-speed transmission systems: its effects are felt in the data-recovery process, increasing the bit error rate. In this work, jitter is modeled as a periodic perturbation and its effect on the operation of PLLs is analyzed. We derive a formula for the jitter amplitude involving only the PLL and jitter parameters, and we identify the regions of parameter space corresponding to the PLL's dynamical behaviors.
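As a rough illustration of the model described above (not the thesis's actual equations or parameters), the sketch below integrates a first-order PLL phase-error equation driven by a periodic jitter term; the gain, amplitude, and frequency are made-up values.

```python
import numpy as np

# First-order PLL phase-error dynamics with a periodic jitter perturbation.
# K, A_j, w_j are illustration values, not fitted parameters.

K, A_j, w_j = 2.0, 0.3, 5.0   # loop gain, jitter amplitude, jitter frequency
dt, T = 1e-3, 10.0

t = np.arange(0.0, T, dt)
phi = np.zeros_like(t)        # phase error over time
for n in range(1, len(t)):
    jitter = A_j * np.sin(w_j * t[n])         # periodic phase perturbation
    dphi = -K * np.sin(phi[n - 1]) + jitter   # loop pulls the error toward zero
    phi[n] = phi[n - 1] + dphi * dt

print("steady-state phase-error swing:", np.ptp(phi[len(t) // 2:]))
```

Sweeping K, A_j, and w_j in such a simulation is one way to map out regions of parameter space with qualitatively different loop behavior, in the spirit of the analysis above.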
19

Evaluating the lifting capacity in a mobile crane simulation

Roysson, Simon January 2020 (has links)
The work environment of a mobile crane is hazardous: accidents can cause serious injuries or death for workers and non-workers alike, so the risk of these accidents should be avoided when possible. One way to avoid the potential accidents is to use mobile crane simulations instead, which removes the risk. Because of this, simulations have been developed to train operators and plan future operations. Mobile crane simulations can also be used to perform research related to mobile cranes, but for the results to be applicable to real-world settings the simulation has to be realistic enough. This thesis therefore evaluated one aspect of realism, the lifting capacity of a mobile crane. This was done by training an artificial neural network on values from the load charts of a real crane and then using it to predict lifting capacities from the boom length and load radius of the virtual crane. An experiment conducted in the simulation collected the predicted lifting capacities, which were then compared to the lifting capacities in the load charts of a real crane. The results showed that the lifting capacities could be predicted with little to no deviation except in a few cases. When conducting the experiment, it was found that the virtual mobile crane could not reach all load radii documented in the real load charts. The predicted lifting capacities are concluded to be realistic enough for crane-related research, but should be refined if the lifting capacity plays a key role in the research. Future work, such as improving and generalizing the artificial network and performing the evaluation with user tests, is suggested.
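A hedged sketch of the approach described above: fit a small neural network to (boom length, load radius) → lifting capacity pairs. The chart values below are invented placeholders, not a real crane's load chart, and the network size is an assumption.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy regression from (boom length, load radius) to lifting capacity.
# The training pairs are placeholder values, not a real load chart.

X = np.array([[10.0, 3.0], [10.0, 6.0], [20.0, 6.0],
              [20.0, 12.0], [30.0, 10.0], [30.0, 20.0]])  # boom (m), radius (m)
y = np.array([40.0, 25.0, 18.0, 9.0, 10.0, 4.0])          # capacity (t)

net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
net.fit(X, y)

print("predicted capacity at boom 20 m, radius 9 m:",
      round(float(net.predict([[20.0, 9.0]])[0]), 2), "t")
```

The appeal of this interpolation is that load charts are tabulated at discrete boom/radius combinations, while a virtual crane needs a capacity at any configuration it can reach.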
20

Optimal Sampling for Linear Function Approximation and High-Order Finite Difference Methods over Complex Regions

January 2019 (has links)
abstract: I focus on algorithms that generate good sampling points for function approximation. In 1D, it is well known that polynomial interpolation using equispaced points is unstable. On the other hand, using Chebyshev nodes provides both stable and highly accurate points for polynomial interpolation. In higher dimensional complex regions, optimal sampling points are not known explicitly. This work presents robust algorithms that find good sampling points in complex regions for polynomial interpolation, least-squares, and radial basis function (RBF) methods. The quality of these nodes is measured using the Lebesgue constant. I will also consider optimal sampling for constrained optimization, used to solve PDEs, where boundary conditions must be imposed. Furthermore, I extend the scope of the problem to include finding near-optimal sampling points for high-order finite difference methods. These high-order finite difference methods can be implemented using either piecewise polynomials or RBFs. / Dissertation/Thesis / Doctoral Dissertation Mathematics 2019
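The 1-D claim in this abstract, that equispaced interpolation is unstable while Chebyshev nodes are well conditioned, can be checked numerically via the Lebesgue constant; the grid-based estimate below is a sketch, not the dissertation's algorithm, and the grid resolution is an assumption.

```python
import numpy as np

# Estimate the Lebesgue constant (max of the Lebesgue function) on [-1, 1]
# for equispaced vs. Chebyshev interpolation nodes.

def lebesgue_constant(nodes, n_eval=2000):
    x = np.linspace(-1, 1, n_eval)
    L = np.zeros_like(x)
    for j in range(len(nodes)):
        lj = np.ones_like(x)          # Lagrange basis polynomial l_j on the grid
        for k in range(len(nodes)):
            if k != j:
                lj *= (x - nodes[k]) / (nodes[j] - nodes[k])
        L += np.abs(lj)
    return L.max()

n = 15
equi = np.linspace(-1, 1, n)
cheb = np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))   # Chebyshev nodes

print("equispaced:", round(lebesgue_constant(equi), 1))   # grows exponentially in n
print("Chebyshev :", round(lebesgue_constant(cheb), 2))   # grows only like log(n)
```

Maximizing this same quantity over candidate node sets is one natural way to measure node quality in the higher-dimensional regions the abstract targets.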
