1.
The Role of Task and Environment in Biologically Inspired Artificial Intelligence: Learning as an Active, Sensorimotor Process. Clay, Viviane, 22 April 2022.
The fields of biologically inspired artificial intelligence, neuroscience, and psychology have had exciting influences on each other over the past decades. Especially recently, with the increased popularity and success of artificial neural networks (ANNs), ANNs have enjoyed frequent use as models for brain function. However, there are still many disparities between the implementation, algorithms, and learning environment used for deep learning and those employed by the brain, which is reflected in their differing abilities. I first briefly introduce ANNs and survey the differences and similarities between them and the brain. I then make a case for designing the learning environment of ANNs to be more similar to that in which brains learn, namely by allowing them to actively interact with the world and decreasing the amount of external supervision. To implement this sensorimotor learning in an artificial agent, I use deep reinforcement learning, which I will also briefly introduce and compare to learning in the brain.
In the research presented in this dissertation, I focus on testing the hypothesis that the learning environment matters and that learning in an embodied way leads to acquiring different representations of the world. We first tested this on human subjects, comparing spatial knowledge acquisition in virtual reality to learning from an interactive map. The corresponding two publications are complemented by a methods paper describing eye tracking in virtual reality as a helpful tool in this type of research. After demonstrating that subjects do indeed acquire different spatial knowledge in the two conditions, we test whether this finding transfers to artificial agents. Two further publications show that an ANN learning through interaction acquires significantly different representations of the sensory input than ANNs that learn without interaction. We also demonstrate that through end-to-end sensorimotor learning, an ANN can learn visually guided motor control and navigation behavior in a complex 3D maze environment without any external supervision, using curiosity as an intrinsic reward signal. The learned representations are sparse, encode meaningful, action-oriented information about the environment, and support few-shot object recognition despite the network never having seen labeled data. Overall, I make a case for increasing the realism of the computational tasks ANNs need to solve (largely self-supervised, sensorimotor learning) to address some of their shortcomings and make them better models of the brain.
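The curiosity signal mentioned above can be illustrated with a minimal sketch: intrinsic reward as the prediction error of a learned forward model of the agent's own sensorimotor loop. The linear model, dimensions, and toy dynamics below are illustrative assumptions, not the architecture used in the dissertation:

```python
import numpy as np

# A minimal sketch of curiosity as an intrinsic reward: the agent keeps
# a forward model predicting the next observation from the current
# observation and action, and rewards itself with the model's prediction
# error, so familiar transitions become boring over time.
class ForwardModel:
    def __init__(self, obs_dim, act_dim, lr=0.1):
        self.W = np.zeros((obs_dim, obs_dim + act_dim))
        self.lr = lr

    def update(self, obs, act, next_obs):
        x = np.concatenate([obs, act])
        err = next_obs - self.W @ x            # prediction error
        self.W += self.lr * np.outer(err, x)   # gradient step on squared error
        return float(err @ err)                # intrinsic (curiosity) reward

rng = np.random.default_rng(0)
model = ForwardModel(obs_dim=4, act_dim=2)
obs, act = rng.normal(size=4), rng.normal(size=2)
next_obs = obs + 0.1 * np.pad(act, (0, 2))     # toy deterministic dynamics

# Repeatedly experiencing the same transition makes it predictable,
# so the curiosity reward decays toward zero.
rewards = [model.update(obs, act, next_obs) for _ in range(50)]
assert rewards[-1] < 0.5 * rewards[0]
```

In a full agent this reward would drive the action-selection policy, so the agent seeks out transitions its forward model cannot yet predict.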
2.
Reliable On-line Machine Learning for Regression Tasks in Presence of Uncertainties. Buschermöhle, Andreas, 15 October 2014.
Machine learning plays an increasingly important role in modern systems. The ability to learn from data enhances or enables many applications. Recently, quick in-stream processing of huge or even unbounded amounts of data has gained attention. This thesis deals with such on-line learning systems for regression, which learn from every example incrementally and remain reliable even in the presence of uncertainties. A new learning approach, called IRMA, is introduced that directly incorporates knowledge about the model structure into its parameter update. It is aggressive in that it incorporates a new example locally as much as possible, and at the same time passive in that it changes the global output as little as possible. It can be applied to any model structure that is linear in its parameters and is proven to minimize the worst-case prediction error in each step. Hence, IRMA is reliable in every situation, and the investigations show that poor performance is prevented in every case by inherently averting overfitting, even for complex model structures and in high dimensions. An extension of such on-line learning systems monitors the learning process with regard to conflict and ignorance, and estimates the trustworthiness of the learned hypothesis by means of trust management. This provides insight into the learning system at every step, and the designer can adjust its setup if necessary. Additionally, the trust estimation makes it possible to assign a trustworthiness to each individual prediction the learning system makes. This way, the overall system can react to uncertain predictions at a higher level and increase its safety, e.g. by reverting to a fallback. Furthermore, the uncertainties are explicitly incorporated into the learning process. The uncertainty of the hypothesis is reflected by allowing less change in regions of the learned system that are more certain. This way, good learned knowledge is protected and a higher robustness to disturbances is achieved.
The uncertainty of each example used for learning is reflected by adapting less to uncertain examples. Thereby, the learning system becomes more robust to training examples that are known to be uncertain. All approaches are formally analyzed, and their characteristic properties are demonstrated in empirical investigations. In addition, a real-world application, forecasting electricity loads, shows the benefits of the approaches.
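The aggressive-yet-passive idea can be sketched with a classical passive-aggressive regression update for a model that is linear in its parameters. This is an illustrative stand-in under that assumption, not the actual IRMA update or its worst-case analysis:

```python
import numpy as np

def pa_regression_update(w, x, y, eps=0.1):
    """One passive-aggressive regression step (illustrative, not the
    thesis's IRMA update): fit the new example to within tolerance eps
    while moving the weight vector as little as possible."""
    loss = max(0.0, abs(w @ x - y) - eps)       # eps-insensitive loss
    if loss == 0.0:
        return w                                 # passive: example already fit
    tau = loss / (x @ x)                         # smallest step that fits it
    return w + tau * np.sign(y - w @ x) * x      # aggressive local correction

# Stream of noisy examples from a linear target function.
rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0, 0.5])
w = np.zeros(3)
for _ in range(500):
    x = rng.normal(size=3)
    y = w_true @ x + rng.normal(scale=0.05)
    w = pa_regression_update(w, x, y)

assert np.allclose(w, w_true, atol=0.2)
```

The update direction is along the example's input vector, so the correction is local to the new example while the global model moves minimally, mirroring the aggressive/passive trade-off described above.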
3.
Algorithms for Scalable On-line Machine Learning on Regression Tasks. Schoenke, Jan H., 25 April 2019.
With ever-increasing data volume and traffic, processing data as a stream is key to building flexible and scalable data processing engines. On-line machine learning provides powerful algorithms for extracting predictive models from such data streams, even if the modeled relation is time-variant in nature. Modeling real-valued data in on-line regression tasks is especially important, as it connects to modeling and system identification tasks in engineering domains and bridges to other fields of machine learning such as classification and reinforcement learning. This thesis therefore considers the problem of on-line regression on time-variant data streams and introduces a new multi-resolution perspective for tackling it.
The proposed incremental learning system, called AS-MRA, comprises a new interpolation scheme for symmetric simplicial input segmentations, a layered approximation structure of sequential local refinement layers, and a learning architecture for efficiently training the layer structure. A key concept for making these components work together is a differential parameter encoding between subsequent refinement layers, which allows the target function to be decomposed into independent additive components, each represented as an individual refinement layer. The whole AS-MRA approach is designed to form a smooth approximation whose computational demands scale linearly in the input dimension, while its overall expressiveness, and therefore its potential storage demands, scale exponentially in the input dimension.
The AS-MRA has no mandatory design parameters, but it lets the user state tolerance parameters for the expected prediction performance, which automatically and adaptively shape the resulting layer structure during the learning process. Other optional design parameters allow restricting resource consumption with respect to computational and memory demands. The effect of these parameters and the learning behavior of the AS-MRA as such are investigated with respect to various learning issues and compared to related on-line learning approaches. The merits and contributions of the AS-MRA are shown experimentally and linked to general considerations about the relation between key concepts of the AS-MRA and fundamental results in machine learning.
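The differential encoding between refinement layers can be illustrated in one dimension: each finer layer stores only the residual left over by the coarser layers, so the layers add up to the approximation. The dyadic grids, the use of `np.interp`, and the fitting rule are illustrative assumptions, much simpler than the simplicial scheme of the AS-MRA:

```python
import numpy as np

# Illustrative sketch of layered residual refinement (not the AS-MRA):
# each layer holds values on a finer grid and encodes only the residual
# relative to the sum of all coarser layers.
def fit_layers(xs, ys, n_layers=4):
    layers, residual = [], ys.copy()
    for level in range(n_layers):
        grid = np.linspace(0.0, 1.0, 2 ** level + 1)   # finer grid per layer
        values = np.interp(grid, xs, residual)          # this layer's parameters
        layers.append((grid, values))
        residual = residual - np.interp(xs, grid, values)
    return layers

def predict(layers, x):
    # The layers are independent additive components of the target function.
    return sum(np.interp(x, grid, values) for grid, values in layers)

xs = np.linspace(0.0, 1.0, 200)
ys = np.sin(2 * np.pi * xs)
layers = fit_layers(xs, ys, n_layers=6)
err = np.max(np.abs(predict(layers, xs) - ys))
assert err < 0.05   # finer layers shrink the residual
```

Because each layer only corrects what the coarser layers missed, refinement layers can be added or pruned without retraining the rest, which is the property the differential parameter encoding is after.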
4.
Reinforcement Learning with History Lists. Timmer, Stephan, 13 March 2009.
A very general framework for modeling uncertainty in learning environments is given by Partially Observable Markov Decision Processes (POMDPs). In a POMDP setting, the learning agent infers a policy for acting optimally in all possible states of the environment while receiving only observations of these states. The basic idea for coping with partial observability is to include memory in the representation of the policy. Perfect memory is provided by the belief space, i.e. the space of probability distributions over environmental states. However, computing policies defined on the belief space requires a considerable amount of prior knowledge about the learning problem and is expensive in terms of computation time. In this thesis, we present a reinforcement learning algorithm for solving deterministic POMDPs based on short-term memory. Short-term memory is implemented by sequences of past observations and actions, which are called history lists. In contrast to belief states, history lists are not capable of representing optimal policies, but they are far more practical and require no prior knowledge about the learning problem. The presented algorithm learns policies consisting of two separate phases. During the first phase, the learning agent collects information by actively establishing a history list that identifies the current state. This phase is called the efficient identification strategy. After the current state has been determined, the Q-Learning algorithm is used to learn a near-optimal policy. We show that such a procedure can also be used to solve large Markov Decision Processes (MDPs). Solving MDPs with continuous, multi-dimensional state spaces requires some form of abstraction over states. One particular way of establishing such abstraction is to ignore the original state information and consider only features of states. This form of state abstraction is closely related to POMDPs, since features of states can be interpreted as observations of states.
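The history-list idea can be sketched with tabular Q-learning whose "state" is a short list of recent actions and observations. The corridor environment, history length, and learning constants below are illustrative assumptions; the thesis's two-phase algorithm with its efficient identification strategy is considerably more refined:

```python
import random
from collections import defaultdict

# Deterministic, partially observable corridor with cells 0..3.
# The agent only observes whether it is at an end of the corridor,
# so a single observation is ambiguous in the middle cells; a short
# history list of past actions and observations disambiguates them.
def observe(cell):
    return {0: "left_end", 3: "right_end"}.get(cell, "middle")

def step(cell, action):                # action: -1 = left, +1 = right
    nxt = min(3, max(0, cell + action))
    return nxt, (1.0 if nxt == 3 else 0.0)   # reward for reaching the goal

random.seed(0)
Q = defaultdict(float)                 # Q-table keyed by (history, action)
actions = (-1, 1)
k = 3                                  # history list length
for episode in range(3000):
    cell, history = 0, (observe(0),)
    for t in range(10):
        if random.random() < 0.3:      # epsilon-greedy exploration
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: Q[(history, a)])
        nxt, reward = step(cell, action)
        nxt_history = (history + (action, observe(nxt)))[-k:]
        target = reward + 0.9 * max(Q[(nxt_history, a)] for a in actions)
        Q[(history, action)] += 0.5 * (target - Q[(history, action)])
        cell, history = nxt, nxt_history
        if cell == 3:
            break

# From the start, the greedy policy should prefer moving right.
start_hist = (observe(0),)
assert Q[(start_hist, 1)] > Q[(start_hist, -1)]
```

As the abstract notes, such history lists cannot represent optimal policies in general, but in this deterministic toy problem a length-3 list is enough to identify the hidden cell.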
5.
Dateneffiziente selbstlernende neuronale Regler (Data-Efficient Self-Learning Neural Controllers). Hafner, Roland, 4 December 2009.
This thesis investigates the design and application of self-learning controllers as intelligent controller components in the loop of a control system for control engineering applications. The costly process of analyzing the dynamic system and synthesizing the controller, which the classical design patterns of control engineering require, is replaced by a learning controller component. This component can be deployed with very little knowledge about the process to be controlled and learns, directly through interaction, a precise control policy for externally given setpoints. The learning procedure is based on an optimization process using a powerful batch reinforcement learning method, Neural Fitted Q-Iteration (NFQ). This method is examined with regard to its use as a self-learning controller. For the shortcomings identified in these investigations concerning the required precise, time-optimal control, improved procedures are developed, which are likewise evaluated for their performance. For typical control engineering problems, the discrete actions of the NFQ method are not sufficient to produce precise control toward arbitrary setpoints. By developing an extension of NFQ to continuous actions, the accuracy and performance of the self-learning controllers are drastically increased without increasing the required interaction time with the process. The performance of the developed methods is empirically evaluated on selected linear and nonlinear control problems. The results show that the self-learning controllers developed here learn a precise control strategy for arbitrary external setpoints within a few minutes of interaction with a process, without any expert knowledge of the process being available.
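The batch scheme at the core of the thesis, fitted Q-iteration, can be sketched in miniature. Here a linear least-squares model on quadratic features stands in for NFQ's neural network, and the toy plant, reward, and feature choices are illustrative assumptions:

```python
import numpy as np

# Minimal fitted Q-iteration sketch in the spirit of NFQ: regulate a
# 1-D plant state toward the setpoint 0 with a discrete action set,
# learning from a fixed batch of recorded transitions.
rng = np.random.default_rng(2)
actions = np.array([-1.0, 0.0, 1.0])

def features(s):                      # quadratic features of the state
    s = np.atleast_1d(s)
    return np.stack([np.ones_like(s), s, s ** 2], axis=-1)

def dynamics(s, a):
    return s + 0.1 * a

# Collect a batch of random transitions (s, a, r, s').
S = rng.uniform(-1.0, 1.0, size=1000)
A = rng.choice(actions, size=1000)
S2 = dynamics(S, A)
R = -S2 ** 2                          # cost of deviating from the setpoint

W = np.zeros((len(actions), 3))       # one linear Q-model per action
for _ in range(50):                   # fitted Q-iteration sweeps
    Qnext = np.max([features(S2) @ w for w in W], axis=0)
    for i, a in enumerate(actions):
        mask = A == a
        targets = R[mask] + 0.9 * Qnext[mask]
        W[i], *_ = np.linalg.lstsq(features(S[mask]), targets, rcond=None)

def policy(s):
    return actions[int(np.argmax([features(s) @ w for w in W]))]

# The greedy controller should push the state toward 0 from both sides.
assert policy(0.8) == -1.0 and policy(-0.8) == 1.0
```

The batch character is what makes the approach data-efficient: the same recorded transitions are reused in every sweep, so only a few minutes of interaction with the real process are needed to gather them.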
6.
Inducing Conceptual User Models. Müller, Martin Eric, 29 April 2002.
User modeling and machine learning for user modeling have both become important research topics and key techniques in recent adaptive systems. One of the most intriguing problems of the 'information age' is how to filter relevant information from the huge amount of available data. This problem is tackled by using models of the user's interest in order to increase precision and discriminate interesting information from uninteresting data. However, any user modeling approach suffers from several major drawbacks: First, user models built by the system need to be inspectable and understandable by the user himself. Secondly, users in general are not willing to give feedback concerning their satisfaction with the delivered results. Without any evidence of the user's interest, it is hard to induce a hypothetical user model at all. Finally, most current systems do not draw a line of distinction between domain knowledge and the user model, which makes the adequacy of a user model hard to determine. This thesis presents the novel approach of conceptual user models. Conceptual user models are easy to inspect and understand and allow the system to explain its actions to the user. It is shown that inductive logic programming (ILP) can be applied to the task of inducing user models from feedback, and a method for using mutual feedback for sample enlargement is introduced. Results are evaluated independently of domain knowledge within a clear machine learning problem definition. The whole concept presented is realized in a meta web search engine called OySTER.
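The spirit of an inspectable, conceptual user model can be shown with a far simpler attribute-value sketch: a Find-S-style generalization over positive feedback examples. The attributes and documents are invented for illustration, and this is not the thesis's ILP method, but the induced model shares its key property of being a readable rule rather than an opaque weight vector:

```python
# Inducing a simple conceptual user model from relevance feedback:
# the most specific conjunction covering all positively rated items,
# with '?' marking an attribute that was generalized away.
def induce(positives):
    model = list(positives[0])
    for example in positives[1:]:
        model = [m if m == e else "?" for m, e in zip(model, example)]
    return tuple(model)

def matches(model, doc):
    return all(m in ("?", d) for m, d in zip(model, doc))

# Documents described as (topic, language, length) attribute tuples.
liked = [("ai", "en", "long"), ("ai", "en", "short"), ("ai", "de", "long")]
model = induce(liked)
assert model == ("ai", "?", "?")          # readable: "user is interested in AI"
assert matches(model, ("ai", "fr", "short"))
assert not matches(model, ("sports", "en", "long"))
```

Because the induced model is a symbolic rule, the system can show it to the user and explain why a document was retained or filtered out, which is exactly the inspectability argument made above.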