331

Serotonergic modulation of cognition

Skandali, Nikolina January 2018 (has links)
Action control arises from the interaction of two anatomically distinct decision-making systems, namely goal-directed and habitual behaviour. Goal-directed behaviour is characterized by consideration of future choices and their respective outcomes, whereas habitual responding is driven by stimulus-response associations. Response inhibition is essential for goal-directed behaviour, and deficits in it are a hallmark of impulsivity. We administered an acute, clinically relevant dose of the commonly used serotonin reuptake inhibitor escitalopram to sixty-six healthy volunteers in a double-blind, randomized, placebo-controlled design, together with a large task battery, in order to study the effect of escitalopram on several cognitive functions including response inhibition, learning and affective processing. We found dissociable effects on these cognitive functions, possibly mediated by distinct cortico-striatal loops. Acute escitalopram administration had a beneficial effect on action cancellation, one aspect of inhibitory control, without any effect on action restraint or waiting impulsivity. The treatment impaired performance in a probabilistic reversal-learning task and increased sensitivity to misleading feedback, leading to maladaptive responding. An extra-dimensional set-shift impairment during an attentional set-shifting task and a tendency towards impaired instrumental learning discrimination were also observed in the escitalopram group. Our results are discussed in the context of well-documented effects of the dopaminergic system and suggested opponent interactions between serotonin and dopamine.
332

Formal methods paradigms for estimation and machine learning in dynamical systems

Jones, Austin 08 April 2016 (has links)
Formal methods are widely used in engineering to determine whether a system exhibits a certain property (verification) or to design controllers that are guaranteed to drive the system to achieve a certain property (synthesis). Most existing techniques require a large amount of accurate information about the system in order to be successful. The methods presented in this work can operate with significantly less prior information. In the domain of formal synthesis for robotics, the assumptions of perfect sensing and perfect knowledge of system dynamics are unrealistic. To address this issue, we present control algorithms that use active estimation and reinforcement learning to mitigate the effects of uncertainty. In the domain of cyber-physical system analysis, we relax the assumption that the system model is known and identify system properties automatically from execution data. First, we address the problem of planning the path of a robot under temporal logic constraints (e.g. "avoid obstacles and periodically visit a recharging station") while simultaneously minimizing the uncertainty about the state of an unknown feature of the environment (e.g. locations of fires after a natural disaster). We present synthesis algorithms and evaluate them via simulation and experiments with aerial robots. Second, we develop a new specification language for tasks that require gathering information about and interacting with a partially observable environment, e.g. "Maintain localization error below a certain level while also avoiding obstacles." Third, we consider learning temporal logic properties of a dynamical system from a finite set of system outputs. For example, given maritime surveillance data we wish to find the specification that corresponds only to those vessels that are deemed law-abiding. Algorithms for performing off-line supervised and unsupervised learning and on-line supervised learning are presented. Finally, we consider the case in which we want to steer a system with unknown dynamics to satisfy a given temporal logic specification. We present a novel reinforcement learning paradigm to solve this problem. Our procedure gives "partial credit" for executions that almost satisfy the specification, which can lead to faster convergence rates and produce better solutions when the specification is not satisfiable.
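The "partial credit" idea is usually realized through a quantitative (robustness) semantics for temporal logic, in which an execution receives a signed score instead of a pass/fail verdict. Below is a minimal sketch under that assumption, with a hypothetical reach-avoid specification and trajectory rather than the thesis's exact formulation:

```python
import numpy as np

def robustness(traj, goal, obstacle, radius=0.5):
    """Signed score for: 'eventually reach goal AND always avoid obstacle'.

    Positive -> satisfied with margin; negative -> violated, but the
    magnitude still says how close the run came (the 'partial credit').
    """
    d_goal = np.linalg.norm(traj - goal, axis=1)
    d_obs = np.linalg.norm(traj - obstacle, axis=1)
    eventually_reach = np.max(radius - d_goal)  # best progress toward the goal
    always_avoid = np.min(d_obs - radius)       # worst-case obstacle margin
    return min(eventually_reach, always_avoid)  # conjunction = min of scores

# Hypothetical 2-D trajectory that skirts the obstacle and nears the goal.
traj = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 0.4], [2.9, 0.5]])
print(robustness(traj, goal=np.array([3.0, 0.5]), obstacle=np.array([1.5, -1.0])))
```

A reinforcement learner can then use this score directly as a shaped reward, so runs that nearly satisfy the specification are still preferred over runs that fail badly.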
333

Extension on Adaptive MAC Protocol for Space Communications

Li, Max Hongming 06 December 2018 (has links)
This work devises a novel approach for mitigating the effects of Catastrophic Forgetting in Deep Reinforcement Learning-based cognitive radio engines employed in space communication applications. Previous implementations of cognitive radio space communication systems utilized a moving-window-based online learning method, which discards part of its understanding of the environment each time the window is moved; this act of discarding is called Catastrophic Forgetting. This work investigated ways to control the forgetting process more systematically, through both a recursive training technique that implements forgetting in a more controlled manner and an ensemble learning technique in which each member of the ensemble represents the engine's understanding over a certain period of time. Both techniques were integrated into a cognitive radio engine proof-of-concept and delivered to the SDR platform on the International Space Station, and the results were compared to those of the original proof-of-concept. In this comparison, the ensemble learning technique showed promise, sustaining performance across different communication channel contexts.
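As a rough illustration of the ensemble idea, here is a minimal sketch in which each member is trained on one time window of data and then frozen, so earlier understanding is retained rather than overwritten; the data, model choice and averaging rule are assumptions for illustration, not the flight implementation:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class TemporalEnsemble:
    """Each member is trained on one time window and then frozen, so old
    knowledge is kept instead of being discarded with the moving window."""

    def __init__(self):
        self.members = []

    def observe_window(self, X, y):
        m = DecisionTreeRegressor(max_depth=5).fit(X, y)
        self.members.append(m)  # freeze: past members are never retrained

    def predict(self, X):
        # Average the members' views of the environment.
        return np.mean([m.predict(X) for m in self.members], axis=0)

# Hypothetical stream: the channel context drifts between windows.
rng = np.random.default_rng(1)
ens = TemporalEnsemble()
for drift in (0.0, 0.5, 1.0):
    X = rng.uniform(0, 1, (200, 3))                      # radio parameters
    y = X.sum(axis=1) + drift + rng.normal(0, 0.1, 200)  # observed throughput
    ens.observe_window(X, y)
print(ens.predict(rng.uniform(0, 1, (2, 3))))
```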
334

Machine learning for materials science

Rouet-Leduc, Bertrand January 2017 (has links)
Machine learning is a branch of artificial intelligence that uses data to automatically build inferences and models designed to generalise and make predictions. In this thesis, the use of machine learning in materials science is explored for two different problems: the optimisation of gallium nitride optoelectronic devices, and the prediction of material failure in the setting of laboratory earthquakes. Light emitting diodes based on III-nitride quantum wells have become ubiquitous as a light source, owing to their direct band-gap, which covers UV, visible and infra-red light, and their very high quantum efficiency, which originates from most electronic transitions across the band-gap leading to the emission of a photon. At high currents, however, this efficiency sharply drops. In chapters 3 and 4, simulations are shown to provide an explanation for experimental results, shedding new light on this drop in efficiency. Chapter 3 provides a simple yet accurate model that explains the experimentally observed beneficial effect of silicon doping on light emitting diodes. Chapter 4 provides a model for the experimentally observed detrimental effect that certain V-shaped defects have on light emitting diodes. These results pave the way for combining simulations with detailed multi-microscopy. In chapters 5 to 7, it is shown that machine learning can leverage device simulations by replacing, in a targeted and efficient way, the very labour-intensive tasks of ensuring that the numerical parameters of the simulations lead to convergence and that the physical parameters reproduce experimental results. It is then shown that machine learning coupled with simulations can find optimal light emitting diode structures with greatly enhanced theoretical efficiency. These results demonstrate the power of machine learning for automating the exploration of device structures in simulations. Material failure is a very broad problem encountered in a variety of fields, ranging from engineering to Earth sciences. The phenomenon stems from complex, multi-scale physics, and failure experiments can provide a wealth of data that machine learning can exploit. In chapter 8, it is shown that by recording the acoustic waves emitted during the failure of a laboratory fault, an accurate predictive model can be built. The machine learning algorithm used retains the link with the physics of the experiment, and a new signal is thus discovered in the sound emitted by the fault. This new signal announces an upcoming laboratory earthquake and is a signature of the stress state of the material. These results show that machine learning can help discover new signals in experiments that produce very large amounts of data, and they demonstrate a new method for the prediction of material failure.
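For the laboratory-earthquake result, the approach learns time-to-failure from statistical features of short windows of the continuous acoustic signal. A hedged sketch along those lines, with synthetic data and illustrative features standing in for the real acoustic record:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the laboratory acoustic record: amplitude grows as
# the fault approaches failure, and the label is the remaining time.
rng = np.random.default_rng(2)
n_windows, win = 500, 1024
signal = rng.normal(0, 1 + np.linspace(0, 3, n_windows)[:, None], (n_windows, win))
time_to_failure = np.linspace(10.0, 0.0, n_windows)  # synthetic label

# Simple, physically interpretable features per window (signal variance was
# the key statistic reported for laboratory faults).
X = np.column_stack([
    signal.var(axis=1),                                   # acoustic power
    np.abs(signal).max(axis=1),                           # peak amplitude
    ((signal[:, 1:] * signal[:, :-1]) < 0).mean(axis=1),  # zero-crossing rate
])

idx = rng.permutation(n_windows)
train, test = idx[:400], idx[400:]
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], time_to_failure[train])
print(model.score(X[test], time_to_failure[test]))  # held-out R^2
```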
335

Continuous reinforcement learning with incremental Gaussian mixture models

Pinto, Rafael Coimbra January 2017 (has links)
This thesis' original contribution is a novel algorithm that integrates a data-efficient function approximator with reinforcement learning in continuous state spaces. The complete research includes the development of a scalable online and incremental algorithm capable of learning from a single pass through the data. This algorithm, called the Fast Incremental Gaussian Mixture Network (FIGMN), was employed as a sample-efficient function approximator for the state space of continuous reinforcement learning tasks, which, combined with linear Q-learning, results in competitive performance. The same function approximator was then employed to model the joint state and Q-value space, all in a single FIGMN, resulting in a concise and data-efficient algorithm, i.e., a reinforcement learning algorithm that learns from very few interactions with the environment; a single episode was enough to learn the investigated tasks in most trials. The results are analysed in order to explain the properties of the obtained algorithm, and it is observed that the FIGMN function approximator offers some important advantages for reinforcement learning over conventional neural networks.
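As a rough sketch of the first contribution, the FIGMN can be thought of as supplying Gaussian features for a linear Q-learner. The toy below substitutes fixed Gaussian centres and invented dynamics for the incremental mixture and the benchmark tasks used in the thesis:

```python
import numpy as np

# Linear Q-learning over fixed Gaussian features -- a simplified stand-in
# for the FIGMN, which grows its Gaussian components incrementally from data.
rng = np.random.default_rng(3)
centers = rng.uniform(-1, 1, (16, 2))  # Gaussian centres over a 2-D state space
n_actions, alpha, gamma, eps = 3, 0.1, 0.99, 0.1
W = np.zeros((n_actions, len(centers)))

def phi(s):
    # Gaussian activations of the state, normalized like mixture posteriors.
    a = np.exp(-8.0 * ((centers - s) ** 2).sum(axis=1))
    return a / (a.sum() + 1e-12)

def step(s, a):
    # Hypothetical toy dynamics: drifting toward the origin is rewarded.
    s2 = np.clip(s + 0.1 * (a - 1) + rng.normal(0, 0.02, 2), -1, 1)
    return s2, -np.linalg.norm(s2)

s = rng.uniform(-1, 1, 2)
for _ in range(5000):
    f = phi(s)
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(W @ f))
    s2, r = step(s, a)
    td = r + gamma * np.max(W @ phi(s2)) - W[a] @ f  # Q-learning TD error
    W[a] += alpha * td * f                           # linear weight update
    s = s2
```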
336

Machine Learning-driven Intrusion Detection Techniques in Critical Infrastructures Monitored by Sensor Networks

Otoum, Safa 23 April 2019 (has links)
In most critical infrastructures, Wireless Sensor Networks (WSNs) are deployed because of their low cost, flexibility and efficiency. Despite these advantages, WSNs are exposed to various security vulnerabilities, including several types of attacks and intruders, due to the open nature of sensor nodes and unreliable wireless links. The implementation of an efficient Intrusion Detection System (IDS) that achieves an acceptable security level is therefore a pressing issue of vital importance. In this thesis, we investigate the problem of security provisioning in WSN-based critical monitoring infrastructures. We propose a trust-based hierarchical model for malicious node detection, especially for black-hole attacks. We also present several Machine Learning (ML)-driven IDS schemes for wirelessly connected sensors that monitor critical infrastructures, with an in-depth analysis of machine learning, deep learning, adaptive machine learning, and reinforcement learning solutions for recognizing intrusive behaviours in the monitored network. We evaluate the proposed schemes in simulations using the KDD'99 real-attack dataset. We report performance metrics for four different IDS schemes, namely the Clustered Hierarchical Hybrid IDS (CHH-IDS), the Adaptively Supervised and Clustered Hybrid IDS (ASCH-IDS), the Restricted Boltzmann Machine-based Clustered IDS (RBC-IDS) and the Q-learning based IDS (QL-IDS), for detecting malicious behaviour in a sensor network. Through simulations, we analyze all the presented schemes in terms of accuracy rates, detection rates, false negative rates, precision-recall ratios, F_1 scores and areas under ROC curves, which are the key performance parameters for any IDS, and we show that QL-IDS achieves ~100% detection and accuracy rates.
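A minimal sketch of the QL-IDS idea, with synthetic labelled traffic standing in for KDD'99 and a deliberately tiny state space; the actual scheme's states, actions and rewards differ:

```python
import numpy as np

# Toy Q-learning intrusion detector: states are discretized traffic features,
# actions are {pass, flag}, and labelled records provide the reward signal.
rng = np.random.default_rng(4)
n_states, alpha, gamma, eps = 10, 0.2, 0.9, 0.1
Q = np.zeros((n_states, 2))  # columns: 0 = pass, 1 = flag

def labelled_record():
    attack = rng.random() < 0.3                  # hypothetical attack rate
    feature = rng.normal(7 if attack else 3, 1)  # attacks look "larger"
    return int(np.clip(feature, 0, n_states - 1)), int(attack)

for _ in range(20000):
    s, label = labelled_record()
    a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
    r = 1.0 if a == label else -1.0              # reward correct decisions
    s2, _ = labelled_record()                    # next record arrives
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])

print(Q.argmax(axis=1))  # learned per-state decision: 0 = pass, 1 = flag
```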
337

The Identification and Establishment of Reinforcement for Collaboration in Elementary Students

Darcy, Laura Elizabeth January 2017 (has links)
In Experiment 1, I conducted a functional analysis of student rate of learning with and without a peer-yoked contingency for 12 students in Kindergarten through 2nd grade, in order to determine whether they had conditioned reinforcement for collaboration. Using an ABAB reversal design, I compared rate of learning, as measured by learn units to criterion, under two conditions: (A) learn units rotated between 2 peers with a peer-yoked contingency game board (collaborative reinforcement), and (B) learn units rotated between 2 peers without a peer-yoked contingency game board (individual reinforcement). In both conditions, 2 students sat side by side and learn units were delivered in rotation. A peer-yoked contingency game board, in which two characters competed to reach the top and access a predetermined backup reinforcer, was present in condition (A) only; the participants were yoked together on a "team," and if they emitted a correct response, their character moved up on the game board. Seven of the twelve participants learned faster in the collaborative reinforcement condition, suggesting that each had conditioned reinforcement for collaboration with a peer. These data demonstrate how yoked contingencies can be used to increase rate of learning in the classroom when reinforcement for collaboration is present. The other participants learned more slowly in the collaborative reinforcement condition, suggesting that reinforcement for collaboration was not present for them. Additionally, participants who demonstrated reinforcement for collaboration emitted higher levels of vocal verbal operants when yoked with a peer than participants who did not. In Experiment 2, the participants who did not demonstrate reinforcement for collaboration received a collaborative intervention, in order to determine whether this potential developmental cusp could be established. In a delayed multiple probe design across dyads, four participants engaged in peer tutoring with a confederate peer, and a yoked contingency game board was used to reinforce their effective collaboration. Following this intervention, all four participants demonstrated a faster rate of learning when yoked with a peer, as well as increased levels of vocal verbal operants with their peers. These findings suggest that an intervention that specifically targets reinforcement for collaboration through a peer-yoked contingency may be effective in inducing reinforcement for collaboration. The educational significance and implications of this potential developmental cusp are discussed.
338

The Effect of the Establishment of Reinforcement Value for Math on Rate of Learning for Pre-Kindergarten Students

Maurilus, Emmy January 2018 (has links)
The objective of Experiment I was to determine whether conditioned reinforcement for engaging in math could be established for pre-kindergarten students using the three conditioning procedures outlined in previous research for conditioning book stimuli. The purpose of Experiment II was to determine whether this change in preference for engaging in math had an effect on 6 pre-kindergarten participants' rate of learning math. In Experiment I, a counterbalanced pre- and post-intervention ABAB/BABA functional analysis and a delayed multiple probe across dyads design were used to measure the indirect and direct reinforcement value of math for each participant. Indirect measures refer to a functional analysis in which each participant's rate of responding to a performance task during a 1-min session with Play-Doh® delivered as a reinforcer was compared to the rate of responding when math was delivered as the reinforcement operation. Direct measures refer to the number of 5-s intervals (out of 60) during which each participant engaged in math when given math worksheets and Play-Doh®. The individualized reinforcement intervention consisted of a sequence of conditioning procedures applied until a defined successful outcome resulted: first learn units were delivered, then stimulus-stimulus pairing, and then observational conditioning-by-denial. Learn unit instruction established conditioned reinforcement for the first dyad, while the stimulus-stimulus pairing procedure was necessary for the remaining dyad. Experiment II tested whether establishing conditioned reinforcement for math would change rate of learning. The dependent variable was each participant's rate of learning, as measured by the number of learn units required to meet mastery criterion for 4 units of the Multiple Exemplar Functional Math (MEF-Math) curriculum, and it was tested using a multiple probe design. The independent variable was the establishment of conditioned reinforcement for math using the individualized reinforcement procedures detailed in Experiment I; the intervention also used a multiple probe design to test the effect of the individualized reinforcement procedures on establishing conditioned reinforcement. Three participants required learn units, 2 participants required the stimulus-stimulus pairing procedure, and 1 participant required observational conditioning-by-denial to establish conditioned reinforcement for math. Results showed an educationally significant acceleration of learning following the establishment of conditioned reinforcement for math across all 6 participants. Results are discussed in terms of the significance of early math instruction.
339

On FPGA implementations for bioinformatics, neural prosthetics and reinforcement learning problems.

Mak, Sui Tung Terrence January 2005 (has links)
Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 132-142). Abstracts in English and Chinese. Contents: an introduction covering bioinformatics, neural prosthetics, learning in uncertainty, and field programmable gate arrays (FPGAs); a hybrid GA-DP approach for searching equivalence sets; an FPGA-based architecture for maximum-likelihood phylogeny evaluation; an FPGA implementation of neuronal ion channel dynamics; continuous-time and discrete-time inference networks for distributed dynamic programming (Markov decision processes, value iteration, and convergence analysis); and a distributed Q-learning network with FPGA implementations, with appendices on simplified floating-point arithmetic, logarithm/exponential/division approximation, and analog VLSI implementation.
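The inference-network chapters build on Markov decision processes, value iteration and Q-learning; a minimal sketch of the Bellman value-iteration update that this style of hardware parallelizes, using a small invented MDP:

```python
import numpy as np

# Value iteration on a small illustrative MDP (hypothetical transition model;
# the thesis maps this kind of iterative update onto FPGA inference networks).
n_states, n_actions, gamma = 5, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] -> next-state distribution
R = rng.uniform(0, 1, size=(n_states, n_actions))                 # immediate rewards

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality update: V(s) = max_a [ R(s,a) + gamma * E[V(s')] ]
    Q = R + gamma * P @ V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # converged to the fixed point
        break
    V = V_new
print(V)
```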
340

Control of wave energy converters using machine learning strategies

Anderlini, Enrico January 2017 (has links)
Wave energy converters (WECs) are devices designed to extract power from ocean waves. Existing wave energy converter technologies are not yet financially viable. Control systems have been identified as one of the areas that can contribute most towards increasing energy absorption and reducing the loads acting on the structure, whilst incurring only minimal extra hardware costs. In this thesis, control schemes are developed for wave energy converters, with the focus on single isolated devices. Numerical models of increasing complexity are developed for the simulation of a point absorber, a type of wave energy converter whose dimensions are small with respect to the dominant wavelength. A review of state-of-the-art control schemes shows that the existing strategies reported in the literature rely on a model of the system dynamics to determine the optimal control action, despite the fact that modelling errors can degrade the performance of the device, particularly in highly energetic waves where non-linear effects become more significant. Furthermore, the controller should be adaptive, so that changes in the system dynamics, e.g. due to marine growth or non-critical subsystem failure, are accounted for. Hence, machine learning approaches are investigated as an alternative, with a focus on neural networks and reinforcement learning for control applications. A time-averaged approach is employed in the development of the control schemes to enable a practical implementation on WECs consistent with current industry practice. Neural networks are applied to the active control of a point absorber, mainly for system identification, where the mean power is related to the current sea state and the parameters of the power take-off unit. The developed control scheme performs similarly to optimal active control in the analysed simulations, which rely on linear hydrodynamics. Reinforcement learning is then applied to the passive and active control of a wave energy converter for the first time. The development of the different control schemes is described in detail, focusing on the challenges encountered in the selection of states, actions and reward function. The performance of reinforcement learning is assessed against state-of-the-art control strategies. Reinforcement learning is shown to learn the optimal behaviour in a reasonable time frame, recognizing each sea state without relying on any model of the system dynamics, and the strategy is able to handle model non-linearities. Furthermore, the control scheme is shown to adapt to changes in the device dynamics, for instance due to marine growth.
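A sketch of the time-averaged reinforcement learning control idea: for each sea state, the controller selects a power take-off damping level and receives the mean absorbed power over the interval as reward. The response surface and numbers below are invented, and the update is a bandit-style simplification of the thesis's schemes:

```python
import numpy as np

# Per-sea-state selection of a power take-off (PTO) damping level, learned
# from time-averaged absorbed power. All dynamics here are hypothetical.
rng = np.random.default_rng(5)
sea_states, dampings = 4, np.array([0.5, 1.0, 2.0, 4.0])
alpha, eps = 0.1, 0.2
Q = np.zeros((sea_states, len(dampings)))
best = np.array([1.0, 2.0, 0.5, 4.0])  # invented optimal damping per sea state

def mean_power(state, b):
    # Toy response surface: absorbed power peaks at the (unknown) optimum.
    return np.exp(-np.log(b / best[state]) ** 2) + rng.normal(0, 0.05)

for _ in range(5000):
    s = rng.integers(sea_states)  # current sea state, identified externally
    a = rng.integers(len(dampings)) if rng.random() < eps else int(np.argmax(Q[s]))
    r = mean_power(s, dampings[a])
    Q[s, a] += alpha * (r - Q[s, a])  # running average of time-averaged reward

print(dampings[Q.argmax(axis=1)])  # learned damping setting per sea state
```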
