1 |
Aprendizado por reforço em ambientes não-estacionários [Reinforcement learning in non-stationary environments]. Silva, Bruno Castro da, January 2007 (has links)
In this work we present RL-CD (Reinforcement Learning with Context Detection), a method developed to deal with the problem of reinforcement learning (RL) in non-stationary environments. Although existing RL methods can often overcome non-stationarity, they do so at the cost of having to relearn policies that had already been computed, which implies a performance loss during readaptation periods. The proposed method is based on a general mechanism through which one among several partial models and policies is created, updated, and selected. The partial models of the environment are built incrementally according to the system's ability to make effective predictions. This measure of effectiveness is determined by computing a global quality for each model, which reflects the total adjustment needed to make that model consistent with the actual experiences. After presenting the theoretical foundations of RL-CD and its equations, we propose and discuss a set of experiments that demonstrate its efficiency, both against classical RL strategies and against algorithms specifically designed to cope with non-stationary scenarios. RL-CD is compared with well-established reinforcement learning methods and also with multi-model RL strategies. The results suggest that RL-CD is an efficient approach for handling a subclass of non-stationary environments, namely those whose dynamics are correctly represented by a finite set of stationary Markov models. Finally, we present a theoretical analysis of one of RL-CD's most important parameters, made possible by the empirical approximation of probability distributions via Monte Carlo methods. This analysis allows the optimal values of this parameter to be computed, making its tuning independent of the specific application under study.
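To make the context-detection mechanism concrete, here is a minimal sketch of the model-creation and model-selection loop described above. It is illustrative only, not the thesis's implementation: the quality-update rule, the spawning threshold QUALITY_MIN and the trace rate RHO are hypothetical choices.

```python
from collections import defaultdict

NUM_STATES = 10
QUALITY_MIN = -0.5   # hypothetical threshold below which no model "explains" the data
RHO = 0.1            # hypothetical rate for the smoothed quality trace

class PartialModel:
    """A partial model of the environment: transition counts plus a quality score."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': count}
        self.quality = 0.0

    def predict_prob(self, s, a, s_next):
        total = sum(self.counts[(s, a)].values())
        if total == 0:
            return 1.0 / NUM_STATES          # uniform prior before any observation
        return self.counts[(s, a)][s_next] / total

    def update(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1

def observe(models, s, a, s_next):
    """Score every model on the new transition, spawn a new one if all fail,
    and return the index of the model selected as the current context."""
    for m in models:
        instant = m.predict_prob(s, a, s_next) - 1.0   # 0 = perfect, -1 = worst
        m.quality += RHO * (instant - m.quality)       # smoothed quality trace
    if all(m.quality < QUALITY_MIN for m in models):
        models.append(PartialModel())                  # new context detected
    active = max(range(len(models)), key=lambda i: models[i].quality)
    models[active].update(s, a, s_next)                # only the active model learns
    return active

models = [PartialModel()]
print(observe(models, s=0, a=1, s_next=2))  # 0: the initial model stays active
```

In the full method, each partial model would also carry its own policy, and the quality measure reflects the total adjustment needed to reconcile the model with real experience; the smoothed trace above is only a stand-in for that measure.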
|
2 |
Exploiting variable impedance in domains with contacts. Radulescu, Andreea, January 2016 (has links)
The control of complex robotic platforms is a challenging task, especially in designs with high levels of kinematic redundancy. Novel variable impedance actuators (VIAs) have recently demonstrated that, by modulating output torque and impedance simultaneously, one can achieve more energy-efficient and safer behaviour. However, this adds further levels of actuation redundancy, making planning and control of such systems even more complicated. VIAs are designed with the ability to mechanically modulate impedance during movement. Recent work from our group, employing the optimal control (OC) formulation to generate impedance policies, has shown the potential benefit of VIAs in tasks requiring energy storage, exploitation of the natural dynamics, and robustness against perturbation. These approaches were, however, restricted to systems with smooth, continuous dynamics, performing tasks over a predefined time horizon. When considering tasks involving multiple phases of movement, including switching dynamics with discrete state transitions (resulting from interactions with the environment), traditional approaches such as independent phase optimisation would result in potentially suboptimal behaviour. Our work addresses these issues by extending the OC formulation to a multiphase scenario and incorporating temporal optimisation capabilities (for robotic systems with VIAs). Given a predefined switching sequence, the developed methodology computes the optimal torque and impedance profile, alongside the optimal switching times and total movement duration. The resultant solution minimises the control effort by exploiting the actuation redundancy and modulating the natural dynamics of the system to match those of the desired movement. We use a monopod hopper and a brachiation system in numerical simulations, and a hardware implementation of the latter, to demonstrate the effectiveness and robustness of our approach on a variety of dynamic tasks. The performance of model-based control relies on the accuracy of the dynamics model. This can deteriorate significantly due to elements that cannot be fully captured by analytic dynamics functions and/or due to changes in the dynamics. To circumvent these issues, we improve the performance of the developed framework by incorporating an adaptive learning algorithm. This performs continuous data-driven adjustments to the dynamics model while re-planning optimal policies that reflect this adaptation. The results presented show that the augmented approach is able to handle a range of model discrepancies, in both simulation and hardware experiments using the developed robotic brachiation system.
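The joint optimisation of control and timing variables can be illustrated on a toy 1-DoF system with two dynamics phases. The sketch below is not the thesis's OC formulation (which handles full multi-phase robot dynamics); it is a minimal stand-in in which a per-phase control value, the switching time and the total duration are optimised together, with all constants chosen purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def simulate(theta, t_switch, t_final, n=200):
    """Integrate q'' = u - k(phase) * q with one control value per phase.
    The phase switch (e.g., contact onset) changes the effective stiffness."""
    dt = t_final / n
    q, dq, effort = 0.0, 0.0, 0.0
    for i in range(n):
        t = i * dt
        k = 5.0 if t < t_switch else 20.0           # hypothetical per-phase stiffness
        u = theta[0] if t < t_switch else theta[1]  # piecewise-constant control
        dq += (u - k * q) * dt
        q += dq * dt
        effort += u ** 2 * dt
    return q, dq, effort

def cost(x):
    theta, t_switch, t_final = x[:2], x[2], x[3]
    q, dq, effort = simulate(theta, t_switch, t_final)
    # reach q = 1 at rest with minimal control effort (weights are illustrative)
    return effort + 100.0 * (q - 1.0) ** 2 + 10.0 * dq ** 2

x0 = np.array([1.0, 1.0, 0.5, 1.0])  # [u_phase1, u_phase2, t_switch, t_final]
bounds = [(-10, 10), (-10, 10), (0.1, 0.9), (0.5, 2.0)]
res = minimize(cost, x0, bounds=bounds, method="Powell")  # derivative-free
print("switch time %.3f s, total duration %.3f s" % (res.x[2], res.x[3]))
```

A derivative-free method is used here because the switching time enters the simulation through a discretised branch, which makes finite-difference gradients unreliable at small step sizes.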
|
3 |
Enhancing the Verification-Driven Learning Model for Data Structures with Visualization. Kondeti, Yashwanth Reddy, 04 August 2011
The thesis aims at teaching various data structure algorithms using the Visualization Learning tool. The main objective of the work is to provide a learning opportunity for novice computer science students to gain broader exposure to data structure programming. The visualization learning tool is based on the Verification-Driven Learning model developed for software engineering. The tool serves as a platform for demonstrating visualizations of various data structure algorithms. All the visualizations are designed to emphasize the important operational features of the data structures. The learning tool draws students into learning data structures through Learning Cases, which have been carefully designed to systematically implant bugs in a properly functioning visualization. Students are assigned the task of analyzing the code and identifying the bugs through quizzing. This provides students with a challenging hands-on learning experience that complements their textbook knowledge. It also serves as a significant foundation for pursuing future courses in data structures.
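For illustration, a Learning Case in the spirit described above might present an otherwise correct routine with one implanted bug for students to find. The example below is hypothetical, not taken from the tool.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def insert_sorted(head, value):
    """Insert `value` into a sorted singly linked list; returns the new head.
    One bug is implanted below, in the spirit of a Learning Case."""
    new = Node(value)
    if head is None or value < head.value:
        new.next = head
        return new
    cur = head
    while cur.next is not None and cur.next.value < value:
        cur = cur.next
    cur.next = new        # BUG: the old cur.next is overwritten before it is saved...
    new.next = cur.next   # ...so new.next now points at `new` itself, creating a cycle
    return head
    # Fix: assign new.next = cur.next BEFORE cur.next = new.
```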
|
4 |
Cross-class transfer learning for visual data. Kodirov, Elyor, January 2017 (has links)
Automatic analysis of visual data is a key objective of computer vision research, and visual recognition of objects from images is one of the most important steps towards understanding and gaining insights into visual data. Most existing approaches to visual recognition in the literature are based on a supervised learning paradigm. Unfortunately, they require a large amount of labelled training data, which severely limits their scalability. On the other hand, recognition is instantaneous and effortless for humans: they can recognise a new object without seeing any visual samples, just from a description of it, by leveraging similarities between the description of the new object and previously learned concepts. Motivated by this human recognition ability, this thesis proposes novel approaches to the cross-class transfer learning (cross-class recognition) problem, whose goal is to learn a model from seen classes (those with labelled training samples) that can generalise to unseen classes (those with only labelled testing samples and no training data), i.e., seen and unseen classes are disjoint. Specifically, the thesis studies and develops new methods for addressing three variants of cross-class transfer learning.
Chapter 3. The first variant is transductive cross-class transfer learning, meaning that a labelled training set and an unlabelled test set are available for model learning. Considering the training set as the source domain and the test set as the target domain, a typical cross-class transfer learning approach assumes that the source and target domains share a common semantic space, into which a visual feature vector extracted from an image can be mapped by an embedding function. Existing approaches learn this function from the source domain and apply it without adaptation to the target one. They are therefore prone to the projection domain shift problem: the embedding function is learned only to predict the semantic representations of the seen training classes, so it may underperform when applied to the test data. In this thesis, a novel cross-class transfer learning (CCTL) method is proposed based on unsupervised domain adaptation. Specifically, a novel regularised dictionary learning framework is formulated in which the target class labels are used to regularise the learned target-domain embeddings, thus effectively overcoming the projection domain shift problem.
Chapter 4. The second variant is inductive cross-class transfer learning, in which only the training set is assumed to be available during model learning, resulting in a harder challenge than the previous one. Nevertheless, this setting reflects the real-world situation in which test data only become available after model learning. The main problem remains the same: domain shift occurs when a model learned only from the training set is applied to the test set without adaptation. In this thesis, a semantic autoencoder (SAE) is proposed, building on an encoder-decoder paradigm. Specifically, a semantic space is first defined so that knowledge transfer from the seen classes to the unseen classes is possible. An encoder then embeds/projects a visual feature vector into this semantic space, while the decoder imposes a generative task: the projection must be able to reconstruct the original visual features. The generative task forces the encoder to preserve richer information, so an encoder learned from the seen classes generalises better to the new unseen classes.
Chapter 5. The third variant is unsupervised cross-class transfer learning. In this variant, no supervision is available for model learning, i.e., only unlabelled training data are available, making this the hardest setting of the three. The goal, however, is the same: to learn from the training data some knowledge that can be transferred to test data whose labels are completely different from those of the training data. The thesis proposes a novel approach which requires no labelled training data yet is able to capture discriminative information. The proposed model is based on a new graph-regularised dictionary learning algorithm. By introducing an l1-norm graph regularisation term in place of the conventional squared l2-norm, the model is made robust against the outliers and noise typical of visual data. Importantly, the graph and the representation are learned jointly, further alleviating the effect of data outliers. As an application of this variant, the thesis considers person re-identification.
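The encoder-decoder idea of Chapter 4 has a compact linear instantiation that is worth sketching. Assuming the common linear formulation min_W ||X - W'S||^2 + lambda * ||WX - S||^2 with tied encoder/decoder weights, the optimum satisfies a Sylvester equation. The sketch below is a reconstruction under that assumption; the dimensions, data, and lambda value are placeholders.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def train_sae(X, S, lam=0.2):
    """Linear semantic autoencoder with tied weights.

    X: d x n visual features; S: k x n semantic vectors (one column per sample).
    Minimising ||X - W.T @ S||^2 + lam * ||W @ X - S||^2 over W leads to the
    Sylvester equation (S @ S.T) @ W + W @ (lam * X @ X.T) = (1 + lam) * S @ X.T.
    """
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1 + lam) * (S @ X.T)
    return solve_sylvester(A, B, C)   # k x d encoder: visual -> semantic space

def classify(W, x, prototypes):
    """Zero-shot prediction: project a visual feature into the semantic space
    and return the index of the nearest unseen-class prototype (cosine)."""
    s = W @ x
    sims = (prototypes @ s) / (np.linalg.norm(prototypes, axis=1) * np.linalg.norm(s))
    return int(np.argmax(sims))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))   # 500 training samples, 100-d features
S = rng.standard_normal((20, 500))    # 20-d semantic (e.g., attribute) vectors
W = train_sae(X, S)
print(W.shape)                        # (20, 100)
```

The tied-weight constraint is what couples the encoding and reconstruction objectives into a single linear system, which is why the solution reduces to one Sylvester solve rather than an iterative optimisation.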
|
5 |
Structural Model Discovery in Temporal Event Data Streams. Miller, Chreston, 23 April 2013
This dissertation presents a unique approach to human behavior analysis based on expert guidance and intervention through interactive construction and modification of behavior models. Our focus is to introduce the research area of behavior analysis, the challenges faced by this field, and the current approaches available, and to present a new analysis approach: Interactive Relevance Search and Modeling (IRSM).
More intelligent ways of conducting data analysis have been explored in recent years. Machine learning and data mining systems that utilize pattern classification and discovery in non-textual data promise to bring new generations of powerful "crawlers" for knowledge discovery, e.g., face detection and crowd surveillance. Many aspects of data can be captured by such systems, e.g., temporal information and extractable visual information (color, contrast, shape, etc.). However, these captured aspects may not uncover all salient information in the data or provide adequate models/patterns of phenomena of interest. This is a challenging problem for social scientists who are trying to identify high-level, conceptual patterns of human behavior from observational data (e.g., media streams).
The presented research addresses how social scientists may derive patterns of human behavior captured in media streams. Currently, media streams are being segmented into sequences of events describing the actions captured in the streams, such as the interactions among humans. This segmentation creates a challenging data space to search, characterized by non-numerical, temporal, descriptive data, e.g., Person A walks up to Person B at time T. This dissertation will present an approach that allows one to interactively search, identify, and discover temporal behavior patterns within such a data space.
Therefore, this research addresses supporting exploration and discovery in behavior analysis through a formalized method of assisted exploration. The model evolution presented supports the refining of the observer's behavior models into representations of their understanding. The benefit of the new approach is shown through experimentation on its identification accuracy and working with fellow researchers to verify the approach's legitimacy in analysis of their data. / Ph. D.
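The kind of data space just described can be made concrete with a small sketch. The following is a hypothetical illustration of searching a stream of timestamped, descriptive events for an ordered temporal pattern; it is not the IRSM implementation, and the event schema and the max_gap constraint are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Event:
    time: float      # seconds into the media stream
    actor: str
    action: str
    target: str = ""

def find_pattern(events, pattern, max_gap=5.0):
    """Return start indices where the action sequence `pattern` occurs in order,
    with at most `max_gap` seconds between consecutive matched events."""
    matches = []
    events = sorted(events, key=lambda e: e.time)
    for i, first in enumerate(events):
        if first.action != pattern[0]:
            continue
        last_t, j = first.time, 1
        for nxt in events[i + 1:]:
            if j == len(pattern):
                break
            if nxt.action == pattern[j] and nxt.time - last_t <= max_gap:
                last_t, j = nxt.time, j + 1
        if j == len(pattern):
            matches.append(i)
    return matches

stream = [Event(1.0, "A", "approach", "B"),
          Event(2.5, "A", "greet", "B"),
          Event(4.0, "B", "greet", "A")]
print(find_pattern(stream, ["approach", "greet", "greet"]))  # [0]
```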
|
6 |
Self-organisation of internal models in autonomous robots. Smith Bize, Simon Cristobal, January 2016 (has links)
Internal Models (IMs) play a significant role in autonomous robotics. They are mechanisms able to represent the input-output characteristics of the sensorimotor loop. In developmental robotics, open-ended learning of skills and knowledge serves to react to unexpected inputs, to explore the environment, and to acquire new behaviours. The robot's development includes self-exploration of the state-action space and learning of the environmental dynamics. In this dissertation, we explore the properties and benefits of the self-organisation of robot behaviour based on the homeokinetic learning paradigm. A homeokinetic robot explores the environment in a coherent way without prior knowledge of its own configuration or of the environment itself. First, we propose a novel approach to self-organisation of behaviour by artificial curiosity in the sensorimotor loop. Second, we study how different forward model settings alter the behaviour of both exploratory and goal-oriented robots. Models of diverse complexity, size and learning rules are compared to assess their importance for the robot's exploratory behaviour. We define the performance of self-organised behaviour in terms of simultaneous environment coverage and best prediction of future sensory inputs. We found that models with a fast response and a minimisation of the prediction error by local gradients achieve the best performance. Third, we study how self-organisation of behaviour can be exploited to learn IMs for goal-oriented tasks. An IM acquires coherent self-organised behaviours that are then used to achieve high-level goals by reinforcement learning (RL). Our results demonstrate that learning an inverse model in this context yields faster reward maximisation and a higher final reward. We show that an initial, goal-less yet coherent exploration of the environment improves learning. In the same context, we analyse the self-organisation of central pattern generators (CPGs) by reward maximisation. Our results show that CPGs can learn reward-favourable behaviour on high-dimensional robots using the self-organised interaction between degrees of freedom. Finally, we examine an on-line dual control architecture in which we combine an actor-critic RL algorithm with the homeokinetic controller. In this configuration, the probing signal is generated from the robot's embodied experience of interaction with the environment. This set-up solves the problem of designing task-dependent probing signals through the emergence of intrinsically motivated, comprehensible behaviour. Faster improvement of the reward signal compared to classic RL is achievable with this configuration.
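The finding that fast, locally gradient-trained forward models perform best can be illustrated with a minimal online learner. The sketch below is illustrative only: it fits a linear forward model x_next ~ A x + B a with one local gradient step per sensorimotor sample, and all sizes, rates and the toy dynamics are arbitrary choices.

```python
import numpy as np

class OnlineForwardModel:
    """Linear forward model x_next ~ A x + B a, trained by one local
    gradient step per sample (fast response, no replay buffer)."""
    def __init__(self, n_sensors, n_motors, lr=0.05):
        self.A = np.zeros((n_sensors, n_sensors))
        self.B = np.zeros((n_sensors, n_motors))
        self.lr = lr

    def predict(self, x, a):
        return self.A @ x + self.B @ a

    def update(self, x, a, x_next):
        err = self.predict(x, a) - x_next       # prediction error
        self.A -= self.lr * np.outer(err, x)    # local gradient of 0.5 * ||err||^2
        self.B -= self.lr * np.outer(err, a)
        return float(err @ err)

# Toy sensorimotor loop: the "true" dynamics are unknown to the model.
rng = np.random.default_rng(1)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.5], [0.2]])
model = OnlineForwardModel(2, 1)
x = rng.standard_normal(2)
for t in range(2000):
    a = rng.standard_normal(1)                  # random motor babbling
    x_next = A_true @ x + B_true @ a + 0.01 * rng.standard_normal(2)
    sq_err = model.update(x, a, x_next)
    x = x_next
print("final squared prediction error: %.4f" % sq_err)
```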
|
7 |
Achieving Practical Functional Electrical Stimulation-Driven Reaching Motions in an Individual with Tetraplegia. Wolf, Derek N., 10 December 2020
No description available.
|
8 |
Didattica e teoria dell'apprendimento nelle opere di Renzo Titone [Didactics and learning theory in the works of Renzo Titone]. Guardincerri, Patrizia, 05 March 2012
This research investigates the didactic thought of Professor Renzo Titone, a scholar of international standing who worked most extensively in the United States and Italy.
Titone's best-known work lies in the vast field of linguistics, understood in the broadest possible sense. Yet his originality transcends this discipline, even while being applied to it: he sought a model that could work universally in the teaching-learning process, a universal model that would explain how every individual learns and thereby ground the practice of teaching, a mathetic model on which didactics could be founded, a model at once analytic (diagnostic in kind) and operational, applicable to classroom practice. / The objective of this research is to investigate Renzo Titone's didactic theory. Titone was an internationally renowned professor whose career spanned from 1960 to 2000.
One of Titone's fundamental contributions, which this research seeks to illuminate, lies in general didactics; at present Titone is better known for his psycholinguistic studies, but he also worked extensively in the field of general didactics and education.
The results shed light on a scholar who renewed the didactic perspective in the past and whose work can still renew it today.
|