Global ETD Search

171	Oppositional Reinforcement Learning with Applications Shokri, Maryam 05 September 2008 Machine intelligence techniques contribute to solving real-world problems. Reinforcement learning (RL) is one of the machine intelligence techniques with several characteristics that make it suitable for applications in which the model of the environment is not available to the agent. In real-world applications, intelligent agents generally face a very large state space, which limits the usability of reinforcement learning. The condition for convergence of reinforcement learning implies that each state-action pair must be visited infinitely often, a condition that is impossible to satisfy in many practical situations. The goal of this work is to propose a class of new techniques to overcome this problem for off-policy, step-by-step (incremental) and model-free reinforcement learning with discrete state and action spaces. The focus of this research is on using the design characteristics of the RL agent to improve its performance in terms of running time while maintaining an acceptable level of accuracy. One way of improving the performance of intelligent agents is to use a model of the environment. In this work, a special type of knowledge about the agent's actions is employed instead, because in many applications the model of the environment may be known only partially or not at all. The concept of opposition is employed in the framework of reinforcement learning to achieve this goal. One of the components of an RL agent is the action. For each action we define its associated opposite action. The actions and opposite actions are implemented in the framework of reinforcement learning to update the value function, resulting in faster convergence. 
At the beginning of this research the concept of opposition is incorporated into the components of reinforcement learning (states, actions, and reinforcement signal), which results in the introduction of the oppositional target domain estimation algorithm, OTE. OTE reduces the search and navigation area and accelerates the search for a target. The OTE algorithm is limited to applications in which the model of the environment is provided to the agent. Hence, further investigation is conducted to extend the concept of opposition to model-free reinforcement learning algorithms. This extension leads to several algorithms that apply the concept of opposition to the Q(lambda) technique. The design of a reinforcement learning agent depends on the application. The emphasis of this research is on the characteristics of the actions. Hence, the primary challenge of this work is the design and incorporation of opposite actions in the framework of RL agents. In this research, three different applications, namely grid navigation, elevator control, and image thresholding, are implemented to address this challenge in different contexts. The design challenges and some solutions to overcome the problems and improve the algorithms are also investigated. The opposition-based Q(lambda) algorithms are tested on the applications mentioned earlier. The general idea behind the opposition-based Q(lambda) algorithms is that in Q-value updating, the agent updates the value of an action in a given state. Hence, if the agent knows the values of both an action and its opposite action for a given state, then instead of one value it can update two Q-values at the same time, without actually taking the opposite action and making an explicit transition to the opposite state. This accelerates the learning process in general and the exploration phase in particular. 
Several algorithms are outlined in this work. OQ(lambda) is introduced to accelerate the Q(lambda) algorithm in discrete state spaces. The NOQ(lambda) method is an extension of OQ(lambda) that operates in a broader range of non-deterministic environments. The update of the opposition trace in OQ(lambda) depends on the next state of the opposite action (which generally is not taken by the agent). This limits the usability of the technique to deterministic environments, because the next state must be known to the agent. NOQ(lambda) is presented to update the opposition trace without knowing the next state for the opposite action. The results show an improvement in running time for the proposed algorithms compared to the standard Q(lambda) technique. Reinforcement learning opposition-based learning OQ(lambda) NOQ(lambda) System Design Engineering
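The two-value update described in entry 171 can be illustrated with a minimal sketch. It uses plain one-step Q-learning rather than the thesis's Q(lambda) algorithms, and several pieces are assumptions made only for illustration: the `OPPOSITE` action map for a grid world, the opposite reward taken as the negated reward, and a deterministic environment in which the next state of the opposite action (`s_opp_next`) is known. The function name `oppositional_q_update` is hypothetical.

```python
from collections import defaultdict

# Assumed grid-world actions and their opposites (illustrative only).
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}
ACTIONS = list(OPPOSITE)


def oppositional_q_update(Q, s, a, r, s_next, s_opp_next, alpha=0.5, gamma=0.9):
    """Update Q for the action taken and, in the same step, for its opposite.

    Assumptions: the opposite reward is -r, and s_opp_next (the state the
    opposite action would have led to) is known without taking that action.
    """
    # Standard one-step Q-learning update for the action actually taken.
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

    # Extra update for the opposite action, which was never executed.
    a_opp = OPPOSITE[a]
    r_opp = -r  # assumed opposite reward
    best_opp_next = max(Q[(s_opp_next, b)] for b in ACTIONS)
    Q[(s, a_opp)] += alpha * (r_opp + gamma * best_opp_next - Q[(s, a_opp)])
    return Q


# One real step in a grid: moving "up" from (1, 1) earns reward 1.0.
Q = defaultdict(float)
oppositional_q_update(Q, s=(1, 1), a="up", r=1.0, s_next=(0, 1), s_opp_next=(2, 1))
print(Q[((1, 1), "up")], Q[((1, 1), "down")])  # prints: 0.5 -0.5
```

A single real step changes two table entries here: the value of moving up rises and the value of moving down falls, although the agent never moved down.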
172	A Strain Energy Function for Large Deformations of Curved Beams Mackenzie, Ian January 2008 This thesis develops strain and kinetic energy functions and a finite beam element useful for analyzing curved beams that undergo large deflections, such as a hockey stick being swung and bent substantially as it hits the ice. The resulting beam model is demonstrated to be rotation invariant and capable of computing the correct strain energy and reaction forces for a specified deformation. A method is also described by which the model could be used to perform static or dynamic simulations of a beam. curved beam bending strain energy splines finite element System Design Engineering
173	Storage System Management Using Reinforcement Learning Techniques and Nonlinear Models Mahootchi, Masoud January 2009 In this thesis, modeling and optimization in the field of storage management under stochastic conditions are investigated using two different methodologies: Simulation Optimization Techniques (SOT), which are usually categorized in the area of Reinforcement Learning (RL), and Nonlinear Modeling Techniques (NMT). For the first set of methods, simulation plays a fundamental role in evaluating the control policy: learning techniques are used to deliver sub-optimal policies at the end of a learning process. These iterative methods use the interaction of agents with the stochastic environment through taking actions and observing different states. To converge to the steady-state condition where policies and value functions do not change significantly with the continuation of the learning process, all or most important states must be visited sufficiently often. This might be prohibitively time-consuming for large-scale problems. To make these techniques more efficient, both in terms of computation time and in producing robust optimal policies, the idea of Opposition-Based Learning (OBL-Type I and Type II) is employed to modify and extend popular RL techniques including Q-Learning, Q(λ), SARSA, and SARSA(λ). Several new algorithms are developed using this idea. It is also shown that function approximation techniques such as neural networks can contribute to the process of learning. State-of-the-art implementations usually consider the maximization of the expected value of accumulated reward. Extending these techniques to consider risk and solving some well-known control problems are important contributions of this thesis. Furthermore, the new nonlinear modeling for reservoir management using indicator functions and randomized policy introduced by Fletcher and Ponnambalam is extended to stochastic releases in multi-reservoir systems. 
In this extension, two different approaches for defining the release policies are proposed. In addition, the main restriction of considering the normal distribution for inflow is relaxed by using a beta-equivalent general distribution. A five-reservoir case study from India is used to demonstrate the benefits of these new developments. Using a warehouse management problem as an example, application of the proposed method to other storage management problems is outlined. reinforcement learning nonlinear models storage management reservoir management System Design Engineering
174	Neuromuscular Clinical Decision Support using Motor Unit Potentials Characterized by 'Pattern Discovery' Pino, Lou Joseph January 2008 Objectives: Based on the analysis of electromyographic (EMG) data, muscles are often characterized as normal or affected by a neuromuscular disease process. A clinical decision support system (CDSS) for the electrophysiological characterization of muscles by analyzing motor unit potentials (MUPs) was developed to assist physicians and researchers with the diagnosis, treatment, and management of neuromuscular disorders, and was analyzed against criteria for use in a clinical setting. Methods: Quantitative MUP data extracted from various muscles of control subjects and patients from a number of clinics were used to compare the sensitivity, specificity, and accuracy of a number of different clinical decision support methods. The CDSS developed in this work, known as AMC-PD, has three components: MUP characterization using Pattern Discovery (PD), muscle characterization by taking the average of MUP characterizations, and calibrated muscle characterizations. Results: The results demonstrated that AMC-PD achieved higher accuracy than conventional means and outlier analysis. Duration, thickness and number of turns were the most discriminative MUP features for characterizing the muscles studied in this work. Conclusions: AMC-PD achieved higher accuracy than conventional means and outlier analysis. Muscle characterization performed using AMC-PD can facilitate the determination of “possible”, “probable”, or “definite” levels of disease, whereas the conventional means and outlier methods can only provide a dichotomous “normal” or “abnormal” decision. Therefore, AMC-PD can be directly used to support clinical decisions related to initial diagnosis as well as treatment and management over time. 
Decisions are based on facts and not impressions giving electromyography a more reliable role in the diagnosis, management, and treatment of neuromuscular disorders. AMC-PD based calibrated muscle characterization can help make electrophysiological examinations more accurate and objective. decision support pattern recognition quantitative electromyography motor unit potentials System Design Engineering
175	Automated Epileptic Seizure Onset Detection Dorai, Arvind 21 April 2009 Epilepsy is a serious neurological disorder characterized by recurrent unprovoked seizures due to abnormal or excessive neuronal activity in the brain. An estimated 50 million people around the world suffer from this condition, and it is classified as the second most serious neurological disease known to humanity, after stroke. With early and accurate detection of seizures, doctors can gain valuable time to administer medications and other anti-seizure countermeasures to help reduce the damaging effects of this crippling disorder. The time-varying dynamics and high inter-individual variability make early prediction of a seizure state a challenging task. Many studies have shown that EEG signals do have valuable information that, if correctly analyzed, could help in the prediction of seizures in epileptic patients before their occurrence. Several mathematical transforms have been analyzed for their correlation with seizure onset prediction, and a series of experiments was conducted to verify their strengths. New algorithms are presented to help clarify, monitor, and cross-validate the classification of EEG signals to predict the ictal (i.e. seizure) states, specifically the preictal, interictal, and postictal states in the brain. These new methods show promising results in detecting the presence of a preictal phase prior to the ictal state. Epilepsy Seizure Ictal EEG Prediction Wavelet Entropy Synchronization Chaos Coherence Signals System Design Engineering
176	Statistical Fusion of Scientific Images Mohebi, Azadeh 30 July 2009 A practical and important class of scientific images is the 2D/3D images obtained from porous materials such as concrete, bone, active carbon, and glass. These materials constitute an important class of heterogeneous media possessing a complicated microstructure that is difficult to describe qualitatively. However, they are not totally random, and there is a mixture of organization and randomness that makes them difficult to characterize and study. In order to study different properties of porous materials, 2D/3D high resolution samples are required. But obtaining high resolution samples usually requires cutting, polishing and exposure to air, all of which affect the properties of the sample. Moreover, 3D samples obtained by Magnetic Resonance Imaging (MRI) have very low resolution and are noisy. Therefore, artificial samples of porous media must be generated through a porous media reconstruction process. Recent contributions to the reconstruction task are based either on a prior model alone, which is learned from statistical features of real high resolution training data and then sampled, or on a prior model combined with the measurements. The main objective of this thesis is to develop a statistical data fusion framework by which different images of porous materials at different resolutions and modalities are combined in order to generate artificial samples of porous media with enhanced resolution. The current super-resolution, multi-resolution and registration methods in image processing fail to provide a general framework for porous media reconstruction, since they are usually based on finding an estimate rather than a typical sample, and on having images of the same scene, which is not the case for porous media images. 
The statistical fusion approach that we propose here is based on a Bayesian framework by which a prior model learned from high resolution samples is combined with a measurement model defined on the low resolution, coarse-scale information, to arrive at a posterior model. We define a measurement model, in both the non-hierarchical and hierarchical image modeling frameworks, which describes how the low resolution information is asserted in the posterior model. Then, we propose a posterior sampling approach by which 2D posterior samples of porous media are generated from the posterior model. A more general framework that we propose asserts constraints other than the measurement in the model and then uses a constrained sampling strategy based on simulated annealing to generate artificial samples. Statistical Fusion Porous Media Image Reconstruction Sampling Resolution Enhancement System Design Engineering
177	Segmentation of RADARSAT-2 Dual-Polarization Sea Ice Imagery Yu, Peter January 2009 The mapping of sea ice is an important task for understanding global climate and for safe shipping. Currently, sea ice maps are created by human analysts with the help of remote sensing imagery, including synthetic aperture radar (SAR) imagery. While the maps are generally correct, they can be somewhat subjective and do not have pixel-level resolution due to the time-consuming nature of manual segmentation. Therefore, automated sea ice mapping algorithms such as the multivariate iterative region growing with semantics (MIRGS) sea ice image segmentation algorithm are needed. MIRGS was designed to work with one-channel single-polarization SAR imagery from the RADARSAT-1 satellite. The launch of RADARSAT-2 has made available two-channel dual-polarization SAR imagery for the purposes of sea ice mapping. Dual-polarization imagery provides more information for distinguishing ice types, and one of the channels is less sensitive to changes in the backscatter caused by the SAR incidence angle parameter. In the past, this change in backscatter due to the incidence angle was a key limitation that prevented automatic segmentation of full SAR scenes. This thesis investigates techniques to make use of the dual-polarization data in MIRGS. An evaluation of MIRGS with RADARSAT-2 data was performed and showed that some detail was lost and that the incidence angle caused errors in segmentation. Several data fusion schemes were investigated to determine whether they can improve performance. Gradient generation methods designed to take advantage of dual-polarization data, feature space fusion using linear and non-linear transforms, as well as image fusion methods based on wavelet combination rules were implemented and tested. Tuning of the MIRGS parameters was performed to find the best set of parameters for segmentation of dual-polarization data. 
Results show that the standard MIRGS algorithm with default parameters provides the highest accuracy, so no changes are necessary for dual-polarization data. A hierarchical segmentation scheme that segments the dual-polarization channels separately was implemented to overcome the incidence angle errors. The technique is effective but requires more user input than the standard MIRGS algorithm. segmentation sea ice data fusion image fusion Markov random field RADARSAT-2 SAR System Design Engineering
178	Transient Dynamics of Continuous Systems with Impact and Friction, with Applications to Musical Instruments Vyasarayani, Chandrika Prakash 18 September 2009 The objective of this work is to develop mathematical simulation models for predicting the transient behaviour of strings and beams subjected to impacts. The developed models are applied to study the dynamics of the piano and the sitar. For simulating rigid point impacts on continuous systems, a new method is proposed based on the unit impulse response. The developed method allows one to relate modal velocities before and after impact, without requiring the integration of the system equations of motion during impact. The proposed method has been used to model the impact of a pinned-pinned beam with a rigid obstacle. Numerical simulations are presented to illustrate the inability of the collocation-based coefficient of restitution method to predict an accurate and energy-consistent response. The results using the unit-impulse-based coefficient of restitution method are also compared to those obtained with a penalty approach, with good agreement. A new moving boundary formulation is presented to simulate wrapping contacts in continuous systems impacting rigid distributed obstacles. The free vibration response of an ideal string impacting a distributed parabolic obstacle located at its boundary is analyzed to understand and simulate a sitar string. The portion of the string in contact with the obstacle is governed by a different partial differential equation (PDE) from the free portion represented by the classical string equation. These two PDEs and corresponding boundary conditions, along with the transversality condition that governs the dynamics of the moving boundary, are obtained using Hamilton's principle. A Galerkin approximation is used to convert them into a system of nonlinear ordinary differential equations, with time-dependent mode-shapes as basis functions. 
The advantages and disadvantages of the proposed method are discussed in comparison to the penalty approach for simulating wrapping contacts. Finally, the model is used to investigate the mechanism behind the generation of the buzzing tone in a sitar. An alternate formulation using the penalty approach is also proposed, and the results are contrasted with those obtained using the moving boundary approach. A model for studying the interaction between a flexible beam and a string at a point including friction has also been developed. This model is used to study the interaction between a piano hammer and the string. A realistic model of the piano hammer-string interaction must treat both the action mechanism and the string. An elastic stiff string model is integrated with a dynamic model of a compliant piano action mechanism with a flexible hammer shank. Simulations have been used to compare the mechanism response for impact on an elastic string and a rigid stop. Hammer head scuffing along the string, as well as length of time in contact, were found to increase where an elastic string was used, while hammer shank vibration amplitude and peak contact force decreased. Introducing hammer-string friction decreases the duration of contact and reduces the extent of scuffing. Finally, significant differences in hammer and string motion were predicted for a highly flexible hammer shank. Initial contact time and location, length of contact period, peak contact force, hammer vibration amplitude, scuffing extent, and string spectral content were all influenced. Continuous systems Impact modelling Musical instruments Piano Sitar System Design Engineering
179	Symbolic Modelling and Simulation of Wheeled Vehicle Systems on Three-Dimensional Roads Bombardier, William January 2009 In recent years, there has been a push by automotive manufacturers to improve the efficiency of the vehicle development process. This can be accomplished by creating a computationally efficient vehicle model that is capable of quickly predicting the vehicle behavior in many different situations. This thesis presents a procedure to automatically generate the simulation code of vehicle systems rolling over three-dimensional (3-D) roads given a description of the model as input. The governing equations describing the vehicle can be formulated using either a numerical or a symbolic formulation approach. A numerical approach re-constructs the numerical matrices that describe the system at each time step, whereas a symbolic approach generates governing equations that describe the system for all time. The latter method offers many advantages: the equations only have to be formulated once and can be simplified using symbolic simplification techniques, thus making the simulations more computationally efficient. The road model is automatically generated in the formulation stage based on the single elevation function (3-D mathematical function) that is used to represent the road. Symbolic algorithms are adopted to construct and optimize the non-linear equations that are required to determine the contact point. A Newton-Raphson iterative scheme is constructed around the optimized non-linear equations, so that they can be solved at each time step. The road is represented in tabular form when it cannot be defined by a single elevation function. A simulation code structure was developed to incorporate the tire on a 3-D road in a symbolic computer implementation of vehicle systems. 
It was created so that the tire forces and moments that appear in the generalized force matrix can be evaluated during simulation and not during formulation. They are evaluated systematically by performing a number of procedure calls. A road model is first used to determine the contact point between the tire and the ground. Its location is used to calculate the tire intermediate variables, such as the camber angle, that are required by a tire model to evaluate the tire forces and moments. The structured simulation code was implemented in the DynaFlexPro software package by creating a linear graph representation of the tire and the road. DynaFlexPro was used to analyze a vehicle system on six different road profiles performing different braking and cornering maneuvers. The analyses were repeated in MSC.ADAMS for validation purposes and good agreement was achieved between the two software packages. The results confirmed that the symbolic computing approach presented in this thesis is more computationally efficient than the purely numerical approach. Thus, the simulation code structure increases the versatility of vehicle models by permitting them to be analyzed on 3-D trajectories while remaining computationally efficient. Symbolic Computation Vehicle Dynamics Road Models Tire Models Graph Theory System Design Engineering
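The contact-point search in entry 179 can be sketched numerically. The example below is a simplified 2-D stand-in, not the thesis's symbolic 3-D formulation: the road profile `road(x)`, the wheel-centre coordinates, and the contact condition (the contact point is taken as the road point closest to the wheel centre, so the vector between them is normal to the road) are all assumptions chosen for illustration. Newton-Raphson is then applied to the resulting scalar equation g(x) = 0.

```python
import math


# Assumed road elevation function and its first two derivatives.
def road(x):
    return 0.1 * math.sin(x)


def road_dx(x):
    return 0.1 * math.cos(x)


def road_dxx(x):
    return -0.1 * math.sin(x)


def contact_point(xc, zc, tol=1e-12, max_iter=50):
    """Find the road point closest to the wheel centre (xc, zc).

    Solves g(x) = (x - xc) + road'(x) * (road(x) - zc) = 0 by Newton-Raphson,
    starting from the point directly below the wheel centre.
    """
    x = xc
    for _ in range(max_iter):
        g = (x - xc) + road_dx(x) * (road(x) - zc)
        dg = 1.0 + road_dxx(x) * (road(x) - zc) + road_dx(x) ** 2
        step = g / dg
        x -= step
        if abs(step) < tol:
            break
    return x, road(x)


# Wheel centre at (1.0, 0.4); in a simulation this would be re-solved at each
# time step, using the previous contact point as the initial guess.
x_contact, z_contact = contact_point(1.0, 0.4)
print(x_contact, z_contact)
```

Because the equation and its derivative are fixed expressions, they only need to be derived once; each time step then costs a handful of Newton iterations.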
180	The Design and Validation of Virtual Trailblazing and Guidance Interfaces for the VTrail System Iaboni, Daniel January 2009 Wayfinding is a complex skill, and the lack of tools supporting the specific sub-types of navigation hinders performance in large-scale virtual environments and consequently can slow the adoption of virtual technology for training. The VTrail System is designed to support virtual training by providing trainers (trailblazers) with the ability to create trails to guide users (trail followers) during training simulations. Without an effective interface to assist with creating trails, the task of trailblazing remains difficult. The objective of this research was to design a default interface for the VTrail System that adheres to the basic human factors engineering guidelines of simplicity, universality, and non-interference with primary task performance. Two studies (trailblazing, trail following), with a total of four experiments, were performed to evaluate and modify the proposed interfaces. The first experiments in each study determined that the proposed default interfaces are simple enough to use so as to not interfere with primary task performance. The second set of experiments found that, aside from the interface components included in the default interface, novice trailblazers and trail followers did not make use of any additional wayfinding aids when users were provided with the ability to create a custom interface. Secondary benefits included the development of a novel approach for measuring spatial knowledge acquisition (called the SKAT), a set of criteria for qualitative analysis of trail quality in the form of the Trail Quality Questionnaire (referred to as TQQ), and improved understanding of the role of individual differences, such as gender and spatial ability, in wayfinding performance. 
The high correlation between spatial ability score and performance on the SKAT suggests that the test provides a valid means of measuring spatial knowledge acquisition in a virtual environment. A measurable difference in the trail quality between males and females indicates that the TQQ can distinguish between trails of variable quality. Finally, there are measurable gender performance differences, despite similar levels of spatial ability between the genders. With the proposed interface designs, the VTrail System is closer to being ready to be incorporated as a support tool into virtual training programs. In addition, the designs for the VTrail System can be adapted for other platforms to support trailblazing in a range of applications, from use in military operations to providing an enhanced tourism experience. This research also serves as a starting point for future research projects on topics ranging from improving the design of the SKAT measure to understanding the effect of expertise on trailblazing performance. Wayfinding Virtual Human Factors Trailblazing Trail Following Spatial Knowledge Acquisition System Design Engineering
