151 |
A study of model-based average reward reinforcement learning. Ok, DoKyeong. 09 May 1996
Reinforcement Learning (RL) is the study of learning agents that improve
their performance from rewards and punishments. Most reinforcement learning
methods optimize the discounted total reward received by an agent, while, in many
domains, the natural criterion is to optimize the average reward per time step. In this
thesis, we introduce a model-based average reward reinforcement learning method
called "H-learning" and show that it performs better than other average reward and
discounted RL methods in the domain of scheduling a simulated Automatic Guided
Vehicle (AGV).
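To make the average-reward criterion concrete, the sketch below is not the thesis's H-learning algorithm (which learns its model online from experience); it runs relative value iteration on a tiny hand-built MDP with a known model, computing the same quantities H-learning estimates: the gain rho (average reward per step) and the relative values h. The two-state MDP, the reference-state convention, and the damping factor tau are all invented for this illustration.

```python
# Tiny hand-built MDP (hypothetical): state -> action -> (next_state, reward)
mdp = {
    0: {"go": (1, 0.0), "stay": (0, 0.5)},
    1: {"go": (0, 2.0), "stay": (1, 0.5)},
}

def relative_value_iteration(mdp, ref=0, iters=200, tau=0.5):
    """Compute the optimal average reward rho and relative values h by
    damped relative value iteration (the damping avoids cycling on
    periodic MDPs like this two-state loop)."""
    h = {s: 0.0 for s in mdp}
    rho = 0.0
    for _ in range(iters):
        backup = {s: max(r + h[s2] for (s2, r) in mdp[s].values())
                  for s in mdp}
        rho = backup[ref] - h[ref]              # gain estimate at the reference state
        for s in mdp:
            h[s] = (1 - tau) * h[s] + tau * (backup[s] - rho)
    return rho, h

rho, h = relative_value_iteration(mdp)
# The optimal policy cycles 0 -> 1 -> 0, earning reward 2 every 2 steps,
# so rho converges to 1.0 (better than 0.5 per step for "stay").
```

A model-based learner in the spirit of H-learning would replace the known transition and reward entries here with counts estimated from experience, updating rho only on greedily chosen steps.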
We also introduce a version of H-learning which automatically explores the
unexplored parts of the state space, while always choosing an apparently best action
with respect to the current value function. We show that this "Auto-exploratory H-Learning"
performs much better than the original H-learning under many previously
studied exploration strategies.
To scale H-learning to large state spaces, we extend it to learn action models
and reward functions in the form of Bayesian networks, and approximate its value
function using local linear regression. We show that both of these extensions are very
effective in significantly reducing the space requirement of H-learning, and in making
it converge much faster in the AGV scheduling task. Further, Auto-exploratory H-learning
synergistically combines with Bayesian network model learning and value
function approximation by local linear regression, yielding a highly effective average
reward RL algorithm.
We believe that the algorithms presented here have the potential to scale to
large applications in the context of average reward optimization. / Graduation date: 1996
|
152 |
CADR. Knight, Thomas F., Jr.; Moon, David A.; Holloway, Jack; Steele, Guy L., Jr. 01 May 1979
The CADR machine, a revised version of the CONS machine, is a general-purpose, 32-bit microprogrammable processor which is the basis of the Lisp-machine system, a new computer system being developed by the Laboratory as a high-performance, economical implementation of Lisp. This paper describes the CADR processor and some of the associated hardware and low-level software.
|
153 |
Learning World Models in Environments with Manifest Causal Structure. Bergman, Ruth. 05 May 1995
This thesis examines the problem of an autonomous agent learning a causal world model of its environment. Previous approaches to learning causal world models have concentrated on environments that are too "easy" (deterministic finite state machines) or too "hard" (containing much hidden state). We describe a new domain for learning: environments with manifest causal structure. In such environments the agent has an abundance of perceptions of its environment; specifically, it perceives almost all the relevant information it needs to understand the environment. Many environments of interest have manifest causal structure, and we show that an agent can learn the manifest aspects of these environments quickly using straightforward learning techniques. We present a new algorithm to learn a rule-based causal world model from observations in the environment. The learning algorithm includes (1) a low-level rule-learning algorithm that converges on a good set of specific rules, (2) a concept-learning algorithm that learns concepts by finding completely correlated perceptions, and (3) an algorithm that learns general rules. In addition, this thesis examines the problem of finding a good expert from a sequence of experts. Each expert has an "error rate"; we wish to find an expert with a low error rate. However, each expert's error rate and the distribution of error rates are unknown. A new expert-finding algorithm is presented, and an upper bound on the expected error rate of the chosen expert is derived.
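The expert-finding problem at the end of the abstract can be pictured with a simple sequential testing policy: sample each expert's predictions a fixed number of times and accept the first whose empirical error rate is small. This is a sketch of the problem setup only, not the thesis's algorithm or its bound; the error rates, trial count, and acceptance threshold are invented for the example.

```python
import random

def find_good_expert(experts, eps=0.1, n_trials=200):
    """Test experts one at a time; return the first whose empirical error
    rate over n_trials falls below eps, or None if none qualifies."""
    for name, make_mistake in experts:
        mistakes = sum(make_mistake() for _ in range(n_trials))
        if mistakes / n_trials < eps:
            return name
    return None

# Simulated experts with hidden error rates (illustrative values):
rng = random.Random(0)
experts = [(p, lambda p=p: rng.random() < p) for p in (0.5, 0.4, 0.02)]
chosen = find_good_expert(experts)
# With these rates the first two experts are rejected and the 0.02 expert
# is accepted with overwhelming probability.
```

The interesting analytical question, which the thesis addresses for its own algorithm, is how the number of trials per expert trades off against the expected error rate of the expert finally chosen.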
|
154 |
The analysis and synthesis of stepped shafts using an interactive approach. Flinner, Victor J. January 1982
Thesis (M.S.)--Ohio State University. / Includes bibliographical references. Available online via OhioLINK's ETD Center
|
155 |
Non-linear Latent Factor Models for Revealing Structure in High-dimensional Data. Memisevic, Roland. 28 July 2008
Real-world data is not random: the variability in the datasets that arise in computer vision, signal processing and other areas is often highly constrained and governed by a number of degrees of freedom that is much smaller than the superficial dimensionality of the data. Unsupervised learning methods can be used to automatically discover the “true” underlying structure in such datasets, and they are therefore a central component in many systems that deal with high-dimensional data.
In this thesis we develop several new approaches to modeling the low-dimensional structure in data. We introduce a new non-parametric framework for latent variable modelling that, in contrast to previous methods, generalizes learned embeddings beyond the training data and its latent representatives. We show that the computational complexity of learning and applying the model is much smaller than that of existing methods, and we illustrate its applicability on several problems.
We also show how we can introduce supervision signals into latent variable models using
conditioning. Supervision signals make it possible to attach “meaning” to the axes of a latent
representation and to untangle the factors that contribute to the variability in the data. We
develop a model that uses conditional latent variables to extract rich distributed representations
of image transformations, and we describe a new model for learning transformation
features in structured supervised learning problems.
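One way to picture "generalizing learned embeddings beyond the training data" is to pair an embedding of the training set with a map that places unseen points into the latent space. The toy below is not the thesis's framework: it fits a one-dimensional PCA embedding and extends it to new points with Nadaraya-Watson kernel regression; the data and the bandwidth sigma are invented for the example.

```python
import numpy as np

def fit_embedding_with_map(X, dim=1, sigma=1.0):
    """Fit a linear (PCA) embedding of X and return the training-set
    latent coordinates plus a function that embeds unseen points by
    kernel regression onto those coordinates."""
    mu = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
    Z = (X - mu) @ Vt[:dim].T                 # latent coords of training points

    def embed(x_new):
        d2 = ((X - x_new) ** 2).sum(axis=1)   # squared distances to training data
        w = np.exp(-d2 / (2 * sigma ** 2))    # Gaussian kernel weights
        return (w @ Z) / w.sum()              # weighted average of latent coords

    return Z, embed

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
Z, embed = fit_embedding_with_map(X)
# A point halfway along the line embeds halfway between its neighbours' codes.
z_new = embed(np.array([1.5, 1.5]))
```

A conditional variant, in the spirit of the second part of the abstract, would additionally feed a supervision signal into the map so that individual latent axes acquire meaning.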
|
156 |
Automatic Segmentation of Lung Carcinoma Using 3D Texture Features in Co-registered 18-FDG PET/CT Images. Markel, Daniel. 14 December 2011
Variability between oncologists in defining the tumor during radiation therapy planning can be as high as 700% by volume. Robust, automated definition of tumor boundaries could significantly improve treatment accuracy and efficiency. However, computed tomography (CT) is not sensitive enough to the differences between tumor and healthy tissue, and positron emission tomography (PET) is hampered by blur and low resolution. The textural characteristics of thoracic tissue were investigated and compared with those of tumors in co-registered PET and CT images of 21 patients, in order to enhance the differences, and the boundary, between cancerous and healthy tissue. A pattern recognition approach was used to learn the textural characteristics of each tissue class from these samples and to classify voxels as either normal or abnormal. The approach was compared with a number of alternative methods and found to have the highest overlap with an oncologist's tumor definition.
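As a toy illustration of texture-based voxel classification (the thesis uses richer 3D texture features and a trained classifier; the neighborhood-variance feature and threshold below are stand-ins), one can score each voxel by the intensity variance in its local 3D neighborhood and flag high-texture voxels:

```python
import numpy as np

def local_variance_3d(vol, r=1):
    """Per-voxel intensity variance over a (2r+1)^3 neighborhood;
    border voxels, where the window would fall outside, are left at 0."""
    out = np.zeros(vol.shape, dtype=float)
    for i in range(r, vol.shape[0] - r):
        for j in range(r, vol.shape[1] - r):
            for k in range(r, vol.shape[2] - r):
                out[i, j, k] = vol[i-r:i+r+1, j-r:j+r+1, k-r:k+r+1].var()
    return out

# A uniform volume has zero texture everywhere; a bright voxel embedded
# in a dark volume raises the variance of every window that contains it.
vol = np.zeros((5, 5, 5))
vol[2, 2, 2] = 1.0
vmap = local_variance_3d(vol)
mask = vmap > 0.01          # "abnormal" voxels under this toy criterion
```

Real texture descriptors for PET/CT (e.g. co-occurrence statistics) capture far more than variance, but the pipeline shape, feature map followed by per-voxel decision, is the same.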
|
159 |
Control of a Synchronous Machine. Olofsson, Jens. January 2010
The VAWT project at Uppsala University has developed a vertical axis wind turbine (VAWT). The VAWT has many benefits compared to the horizontal axis wind turbine (HAWT), which is the most common wind turbine design today. One of its advantages is that the generator can be located at ground level, which reduces the required tower strength. The turbine is not self-starting: it needs to reach a certain speed before the wind can drive it, so a special start procedure is required. During the start, power electronics are used to operate the generator as a motor. Today, Hall latches located in the air gap of the generator provide the signals that govern the power electronics, but there is a demand for a start procedure that does not require them: such a controller would increase the reliability of the starter system and further simplify the turbine design. This thesis therefore presents a programmed microcontroller that controls the start-up without using any sensors at all. A hub motor was obtained for laboratory work, and a driver and an inverter were constructed to drive it from the microcontroller. The finished start-up program can start the hub motor both sensorlessly and using Hall sensors. The microcontroller controls the motor by measuring its phase voltages; this information is used to decide which phases of the motor the current should flow through. The current is limited using pulse width modulation (PWM), which is necessary to protect the power electronics and to limit the torque during the start. Start-ups using Hall sensors and sensorless start-ups accelerated the rotor at the same rate, although the Hall-sensor starts reached a higher top speed. However, the wind turbine does not need a higher speed than the sensorless start achieved, so the sensorless start is considered as good as the start using Hall sensors.
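The decision logic described above, measuring the voltage of the un-energized phase and commutating when its back-EMF crosses a threshold, can be sketched as a six-step commutation table plus a crossing test. This is an illustrative sketch, not the thesis's firmware: the sector table and the half-bus-voltage criterion are common conventions for sensorless brushless drives, and all names and voltages are invented.

```python
# Six-step commutation for a three-phase machine: each sector energizes two
# phases (high side, low side); the third phase floats, and its back-EMF
# crossing half the DC bus voltage triggers the next commutation step.
COMMUTATION = {
    0: ("A", "B"), 1: ("A", "C"), 2: ("B", "C"),
    3: ("B", "A"), 4: ("C", "A"), 5: ("C", "B"),
}

def next_step(sector, floating_v, half_bus_v, rising):
    """Advance one commutation step when the floating phase voltage
    crosses half_bus_v in the expected direction; else hold the sector."""
    crossed = floating_v >= half_bus_v if rising else floating_v <= half_bus_v
    return (sector + 1) % 6 if crossed else sector

# Example with an invented 10 V bus: suppose in sector 0 the floating
# phase C is rising; once it passes 5 V the drive commutates to sector 1.
sector = next_step(0, 6.0, 5.0, rising=True)
```

In the actual drive, a PWM current limiter runs alongside this logic so that the commanded phases never draw more current than the power electronics tolerate.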
|
160 |
DESIGN AND IMPLEMENTATION OF WIRELESSHART TDMA STATE MACHINE. Kannah, Ali; Bahiya, Ghasaq. January 2011
WirelessHART is one of the latest communication standards offering enhanced functionality and robustness, and it is well suited to applications in the control and automation industry. In this work we present an implementation of the TDMA state machine of the WirelessHART communication protocol on the TinyOS operating system, written in the nesC language for TelosB motes. The development followed the software-reuse principle and involved comparing the state-diagram description of the TDMA mechanism in WirelessHART with that of a time synchronized channel hopping (TSCH) implementation that was already available for reuse. The work highlights the differences between the TSCH code and the WirelessHART specification and builds upon the TSCH code to arrive at the WirelessHART TDMA state machine implementation.
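A TDMA slot engine of this kind is naturally written as a table-driven state machine. The sketch below is a heavily simplified illustration: the state and event names are invented for this example and are not taken from the WirelessHART specification or the thesis code (which is written in nesC for TinyOS).

```python
# (state, event) -> next state; events not in the table leave the state unchanged.
TRANSITIONS = {
    ("idle",     "slot_start_tx"):  "cca",        # our slot, frame queued: assess channel
    ("idle",     "slot_start_rx"):  "listen",     # neighbour's slot: listen for a frame
    ("cca",      "channel_clear"):  "transmit",
    ("cca",      "channel_busy"):   "idle",       # back off until a later slot
    ("transmit", "ack_received"):   "idle",
    ("transmit", "ack_timeout"):    "idle",       # retry in a later slot
    ("listen",   "frame_received"): "send_ack",
    ("listen",   "rx_timeout"):     "idle",
    ("send_ack", "ack_sent"):       "idle",
}

def step(state, event):
    """Advance the slot state machine by one event."""
    return TRANSITIONS.get((state, event), state)

# A successful transmit slot walks idle -> cca -> transmit -> idle.
```

Encoding the specification's state diagram as such a table is also what makes the comparison with an existing TSCH implementation tractable: differences show up as added, removed, or redirected transitions.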
|