Global ETD Search

1	Uma arquitetura híbrida aplicada em problemas de aprendizagem por reforço / A hybrid architecture to address reinforcement learning problems Arruda, Rodrigo Lopes Setti de 02 July 2012 Advisor: Fernando José Von Zuben / Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação / Made available in DSpace on 2018-08-20T00:09:41Z (GMT). No. of bitstreams: 1 Arruda_RodrigoLopesSettide_M.pdf: 2295891 bytes, checksum: 4f5f4bc8f219b0c3c27239520027d496 (MD5) Previous issue date: 2012 / Abstract: With the ever-growing use of cognitive systems in a widening range of applications, expectations and demand have risen for machines that are increasingly autonomous, intelligent, and creative in solving real-world problems. In many cases, the challenges call for a capacity to learn and adapt. This work deals with the concepts of reinforcement learning and discusses the main solution approaches and problem variations. It then builds a hybrid proposal that incorporates other machine learning ideas and validates it with simulated experiments. The experiments highlight the main advantages of the proposed methodology, which rest on its capability to handle continuous-space environments and to learn an optimal policy while following a different, exploratory one. The proposed architecture is hybrid in the sense that it is based on a multi-layer perceptron neural network coupled with a function approximator called wire-fitting. This architecture is coordinated by a dynamic and adaptive algorithm that merges concepts from dynamic programming, Monte Carlo analysis, temporal difference learning, and eligibility. The proposed model is used to solve optimal control problems, by means of reinforcement learning, in scenarios with continuous variables and nonlinear development. Two different instances of control problems, well known in the pertinent literature, are presented and tested with the same architecture / Master's / Computer Engineering / Master in Electrical Engineering / Artificial intelligence; Machine learning; Theory of automata; Robotics; Mobile robots
2	Multioperator Weighted Monadic Datalog Stüber, Torsten 10 February 2011 In this thesis we introduce multioperator weighted monadic datalog (mwmd), a formal model for specifying tree series, tree transformations, and tree languages. This model combines aspects of multioperator weighted tree automata (wmta), weighted monadic datalog (wmd), and monadic datalog tree transducers (mdtt). To develop a rich theory, we define multiple versions of semantics for mwmd and compare their expressiveness. We study normal forms and decidability results for mwmd and show, by employing particular semantic domains, that the theory of mwmd subsumes the theories of both wmd and mdtt. We conclude the thesis by showing that mwmd even contains wmta as a syntactic subclass, and we present results concerning this subclass. info:eu-repo/classification/ddc/004 ddc:004
3	Strukturované multisystémy a multiautomaty indukované časovými procesy / Structured Multisystems and Multiautomata Induced by Times Processes Křehlík, Štěpán January 2015 In the thesis we discuss binary hyperstructures of second-order linear differential operators, both in general and (inspired by models of specific time processes) in the special case of the Jacobi form. We also study binary hyperstructures constructed from distributive lattices and suggest how this construction can be transferred to n-ary hyperstructures. We use these hyperstructures to construct multiautomata and quasi-multiautomata. The input sets of all these automata structures are constructed so as to facilitate the transfer of information for certain specific modeling time functions. For this reason we use smooth positive functions, or vectors whose components are real numbers or smooth positive functions. The above hyperstructures are the state sets of these automata structures. Finally, we investigate various types of compositions of the above multiautomata and quasi-multiautomata. To do this, we have to generalize the classical definitions of Dörfler. While some of the concepts can be transferred to the hyperstructure context rather easily, the attempt to generalize the Cartesian composition leads to some interesting results.
