341

The Effects of Conditioned Reinforcement for Reading on Reading Comprehension for 5th Graders

Cumiskey Moore, Colleen January 2017 (has links)
In three experiments, I tested the effects of conditioned reinforcement for reading (R+Reading) on reading comprehension with 5th graders. In Experiment 1, I conducted a series of statistical analyses with data from 18 participants over one year. I administered four pre/post measurements of reading repertoires: 1) state-wide assessments, 2) district-wide assessments, 3) 20 min observational probes, and 4) preference probes. I used the standardized testing measurements to establish grade-level reading repertoires, while the two additional probes measured the reinforcement value of reading. Observational data were recorded in 10 s whole intervals; participants who were observed to read for 96 of the 120 intervals (80%) were considered to have R+Reading. The results demonstrated that R+Reading is significantly correlated with reading assessment outcomes. In Experiment 2, I implemented a two-year cross-sectional design with 33 participants, expanding the previous research to include probe trials for conditioned seeing (CS) and derivational responding (DR). Results of Experiment 2 indicated that increases in standardized testing scores were significantly correlated with R+Reading, and that CS and DR were prerequisite repertoires for the acquisition of R+Reading. In Experiment 3, I tested the effects of a peer-yoked contingency procedure on the reinforcement value of reading and assessed whether increases in the reinforcement value of reading functioned to increase reading comprehension. Results indicated that increases in the reinforcement value of reading were also related to increases in reading comprehension.
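The 80% whole-interval criterion described above reduces to a small computation. A minimal sketch, with hypothetical data and an invented function name:

```python
# Hypothetical sketch of the observational criterion from the abstract:
# a 20 min probe recorded in 10 s whole intervals (120 intervals total),
# with R+Reading credited at 96/120 intervals (80%). Data are invented.

def has_conditioned_reinforcement(intervals_read, criterion=0.80):
    """intervals_read: 120 booleans, one per 10 s whole interval."""
    total = len(intervals_read)           # 120 intervals in a 20 min probe
    observed = sum(intervals_read)        # intervals in which the child read
    return observed / total >= criterion  # 96/120 = 80% meets the criterion

# Example: a participant observed reading in 100 of 120 intervals
print(has_conditioned_reinforcement([True] * 100 + [False] * 20))  # True
```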
342

Selection of spatial abstraction in Reinforcement Learning by evaluating the learning process

Cleiton Alves da Silva 14 June 2017 (has links)
Agents that use Reinforcement Learning (RL) techniques seek to solve problems involving sequential decisions in stochastic environments without a priori knowledge. The agent's learning process is generally slow, since it proceeds by trial and error and requires repeated interactions with each state of the environment; and because the state of the environment is represented by several factors, the number of states grows exponentially with the number of state variables. One technique for accelerating learning is the generalization of knowledge, which aims to improve the learning process either within the same problem, through abstraction, by exploiting the similarity between related states, or across different problems, by transferring knowledge acquired in a source problem to accelerate learning in a target problem. An abstraction considers only parts of the state and, since a single abstraction may not be sufficient, it is necessary to discover which combination of abstractions achieves good results. This dissertation proposes a method for abstraction selection that evaluates the learning process while learning is under way. The contribution is formalized in the REPO algorithm, used to select and evaluate subsets of abstractions. The algorithm is iterative: in each round it evaluates new subsets of abstractions, assigning a score to each abstraction in the subset, and finally returns the subset containing the highest-scoring abstractions. Experiments with a soccer simulator show that the method is effective and finds a subset with fewer abstractions that still represents the original problem, improving the agent's learning performance.
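The record describes REPO only at a high level. The following Python sketch shows one plausible shape of such an iterative subset-scoring loop, under loudly labeled assumptions: the `evaluate` function, the random subset proposal, and the credit-assignment rule are placeholders, not the REPO algorithm itself.

```python
import random

# Placeholder evaluation: a real run would train the RL agent with this
# abstraction subset and return a learning-performance score.
def evaluate(subset):
    return random.random()

def repo_like_selection(abstractions, rounds=20, subset_size=3):
    scores = {a: 0.0 for a in abstractions}
    for _ in range(rounds):
        subset = random.sample(abstractions, subset_size)  # propose a new subset
        performance = evaluate(subset)
        for a in subset:                                   # credit each member
            scores[a] += performance
    # return the subset made of the highest-scoring abstractions
    return sorted(abstractions, key=scores.get, reverse=True)[:subset_size]

# Hypothetical soccer-domain abstractions, invented for illustration
print(repo_like_selection(["ball_distance", "goal_angle", "x_position",
                           "y_position", "teammate_distance"]))
```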
343

Using stochastic abstract policies in robotic navigation.

Tiago Matos 06 September 2011 (has links)
Most path-planning approaches for mobile robots do not take existing solutions to similar problems into account when learning a policy to solve a new problem; consequently, each navigation problem is solved from scratch, which can be very time-consuming. In this work we couple prior knowledge obtained from a similar solution, represented by an abstract policy, to a reinforcement learning process. In addition, this work presents a framework for simultaneous reinforcement learning called ASAR, in which the abstract policy helps initialize the policy for the concrete problem, and both policies are refined through exploration. To reduce the loss of information in constructing the abstract policy, we propose an algorithm called X-TILDE that builds a stochastic abstract policy. The proposed framework is compared with a standard learning algorithm, and the results show that it is effective in speeding up policy construction for practical problems.
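As a hedged sketch of the general idea (an abstract policy warm-starting a concrete policy that exploration then refines), not of ASAR or X-TILDE themselves, here is a minimal Python illustration; the grid world, the abstraction mapping, and the bonus scale are invented for the example.

```python
from collections import defaultdict

# Minimal sketch: bias initial Q-values toward actions a stochastic abstract
# policy prefers, then refine with ordinary exploration. Everything concrete
# here (rooms, actions, bonus) is an illustrative assumption.
def init_q_from_abstract_policy(states, actions, abstract_policy, abstraction,
                                bonus=1.0):
    Q = defaultdict(float)
    for s in states:
        prefs = abstract_policy.get(abstraction(s), {})  # {action: probability}
        for a in actions:
            Q[(s, a)] = bonus * prefs.get(a, 0.0)
    return Q

# Example: states are (x, y) cells; the abstraction keeps only "which room".
states = [(x, y) for x in range(4) for y in range(4)]
abstract_policy = {"left_room": {"east": 0.8, "north": 0.2},
                   "right_room": {"north": 1.0}}
Q = init_q_from_abstract_policy(states, ["north", "south", "east", "west"],
                                abstract_policy,
                                lambda s: "left_room" if s[0] < 2 else "right_room")
print(Q[((0, 0), "east")])  # 0.8 -- subsequent exploration refines both policies
```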
344

RL-based portfolio management system.

January 2008 (has links)
Tsue, Wing Yeung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 94-100). / Abstracts in English and Chinese.

Contents: Abstract (p.i); Acknowledgement (p.vii)
Chapter 1. Introduction (p.1)
Chapter 2. Reinforcement Learning (RL) (p.7): 2.1 Objective of RL (p.8); 2.2 Algorithms in RL (p.9), covering 2.2.1 Dynamic Programming (p.9), 2.2.2 Monte Carlo Methods (p.11), 2.2.3 Temporal-Difference Learning and Q-Learning (p.12); 2.3 Example: Maze (p.13); 2.4 Artificial Neural Network to Approximate Q-Function (p.14); 2.5 Literature on Trading a Single Asset by RL (p.16); 2.6 Literature on Portfolio Management by RL (p.19); 2.7 Summary (p.20)
Chapter 3. Portfolio Management (PM) (p.21): 3.1 Buy-and-Hold Strategy (p.22); 3.2 Mean-Variance Analysis (p.23); 3.3 Constant Rebalancing Algorithm (p.24); 3.4 Universal Portfolio Algorithm (p.25); 3.5 ANTICOR Algorithm (p.26)
Chapter 4. PM on RL Traders (p.30): 4.1 Implementation of Single-Asset RL Traders (p.32), covering 4.1.1 State Formation (p.32), 4.1.2 Actions and Immediate Reward (p.38), 4.1.3 Update (p.38); 4.2 Experiments (p.41); 4.3 Discussion (p.47)
Chapter 5. RL-Based Portfolio Management (RLPM) (p.49): 5.1 Overview (p.52); 5.2 Two-Asset RL System (p.54), covering 5.2.1 State Formation (p.55), 5.2.2 Action (p.61), 5.2.3 Update Rule (p.64); 5.3 Portfolio Construction (p.67); 5.4 Choice of Window Size w (p.70); 5.5 Empirical Results (p.73), covering 5.5.1 Effect of Window Size w on 1 Layer and 2 Layers of RLPMw (p.76), 5.5.2 Comparing RLPM to Other Strategies (p.80), 5.5.3 Effect of Transaction Cost λ on RLPMw (p.83)
Chapter 6. Conclusion (p.89)
Bibliography (p.94)
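The thesis's background chapter covers temporal-difference and Q-learning, with a maze example (Sections 2.2.3 and 2.3). As a hedged illustration of that background material only, not of the thesis's trading system, here is a minimal tabular Q-learning update; the maze transition, reward, and parameters are made up.

```python
from collections import defaultdict

# One tabular Q-learning update, in the spirit of Sections 2.2.3 and 2.3.
def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    best_next = max(Q[(s_next, b)] for b in actions)          # greedy backup
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # TD update

Q = defaultdict(float)
actions = ["up", "down", "left", "right"]
# One illustrative transition: from cell (0, 0), moving "right" to (0, 1)
# with a step cost of -1
q_learning_step(Q, (0, 0), "right", -1.0, (0, 1), actions)
print(Q[((0, 0), "right")])  # -0.1
```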
345

A framework for measuring organizational information security vulnerability

Zhang, Changli 30 October 2019 (has links)
In spite of ever-growing information security technology, organizations remain vulnerable to security attacks due to mistakes made by their employees. To evaluate organizational security vulnerability and keep organizations alert to their security situation, in this dissertation we developed a framework for measuring the security vulnerability of organizations based on analysis of their employees' online behaviour. In this framework, behavioural data on employees' online privacy are taken as input, and personal vulnerability profiles are generated for each employee and represented as confusion matrices. Then, by incorporating the personal vulnerability data into the local social network of interpersonal security influence in the workplace, the overall security vulnerability of each organization is evaluated and rated as a percentile value representing its position relative to all other organizations. Through evaluation with real-world data and simulation, this framework is verified to be both effective and efficient in estimating the actual security vulnerability status of organizations. In addition, a demo application is developed to illustrate the feasibility of the framework in the practice of improving organizational information security.
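The final percentile-rating step can be illustrated in a few lines of Python; the aggregation from confusion matrices and the influence network down to a single score per organization is omitted, and the scores below are invented:

```python
# Minimal sketch of the percentile rating: an organization's rating is its
# position relative to all other organizations' vulnerability scores.
def percentile_rank(score, all_scores):
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)

org_scores = [0.12, 0.30, 0.45, 0.52, 0.70, 0.88]  # hypothetical scores
print(percentile_rank(0.52, org_scores))           # 50.0
```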
346

Are individual differences in language associated with differences in the corticostriatal system? A behavioral and imaging study

Lee, Joanna Chen 01 May 2012 (has links)
The overall aim of the current research was to investigate the corticostriatal system in developmental language impairment (DLI) at the behavioral and neuroanatomical levels. Two groups of young adults, one with DLI (N = 25) and the other without (N = 23), participated in the behavioral study. A sample of procedural learning and reinforcement learning (RL) tasks was selected. Each task represents a unique aspect of procedural memory, and learning processes during these tasks have been linked, at least partially, to the functionality of the corticostriatal system. Findings showed that individuals with DLI demonstrated relatively poor performance on different aspects of procedural learning and on RL. Correlation results provide further evidence for a close relationship between individual differences in implicit learning and individual differences in language. These results implicate an abnormal corticostriatal system in DLI. In the structural imaging study, two subgroups of participants from the first study, one with DLI (n = 10) and the other without (n = 10), were matched on age, gender, and handedness. Conventional magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) were used to investigate the subcortical components of the corticostriatal system in individuals with DLI. Results showed pathological enlargement in the bilateral putamen, the right globus pallidus, and the bilateral nucleus accumbens of individuals with DLI. In addition, the DLI group showed decreased fractional anisotropy (FA) in the globus pallidus and in the thalamus, indicating abnormal white matter integrity in these two subcortical regions. These imaging results underpin the behavioral results, showing corticostriatal abnormalities in DLI at both macrostructural and microstructural levels. In addition to subcortical regions, the four cerebral lobes were included in an exploratory analysis. Findings showed that individuals with DLI had global diffusion abnormalities in cerebral white matter in the absence of volumetric alterations, and these abnormalities were closely associated with impaired language performance. The results support a role for white matter integrity in language function. In conclusion, individuals with DLI have an abnormal corticostriatal system, which may compromise a wide variety of cognitive learning, including procedural learning, RL, and certain aspects of language learning.
347

Integrated Pricing and Seat Allowance for Airline Network Revenue Management

Mohan, Baskar 11 July 2005 (has links)
The airline industry is facing unprecedented challenges in generating sufficient revenues to stay in business. Airlines must capture the greatest revenue yield from every flight by leaving no seats unsold and not overfilling the cabin with discount fares. To succeed, airlines must be able to accurately forecast each of their market segments, manage product and price availability to maximize revenue, and react quickly to competitive changes in the marketplace. Seat inventory control and ticket pricing thus form the two major tools of revenue management. The focus of this thesis is to consolidate the ideas of seat inventory control and pricing in order to maximize the revenues generated by an airline network. A continuous-time yield management model is considered for a network with multiple legs, multiple fare classes, and dynamic price changes for all fare classes. Each fare class has a set of fares from which the optimal fare is chosen based on the Minimum Acceptable Fare (MAF), which plays the critical role in the decision process. A machine learning based algorithm, together with EMSRa-based and EMSRb-based algorithms, is developed for obtaining dynamic policies for combined pricing and allocation. The algorithms are implemented for a sample network with eight cities, eleven legs, thirty origin-destinations (ODs), three fare classes, three fare levels in each class, and ninety itineraries.
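The abstract names EMSRa- and EMSRb-based policies. As background only, here is a minimal sketch of the classical EMSRb protection-level computation (fare classes ordered high to low, independent normal demand per class); the fares and demand parameters are illustrative, and this is not the thesis's combined pricing-and-allocation algorithm:

```python
from math import sqrt
from statistics import NormalDist

# Classical EMSRb heuristic: protect seats for the aggregate of classes 1..j
# against class j+1, using the demand-weighted average fare of classes 1..j.
def emsrb_protection_levels(fares, means, stds):
    levels = []
    for j in range(1, len(fares)):
        m = sum(means[:j])                         # aggregated demand mean
        v = sum(s ** 2 for s in stds[:j])          # aggregated demand variance
        fbar = sum(f * mu for f, mu in zip(fares[:j], means[:j])) / m
        z = NormalDist().inv_cdf(1.0 - fares[j] / fbar)
        levels.append(m + z * sqrt(v))             # protection level for 1..j
    return levels

# Hypothetical three-class leg: fares high to low, normal demand per class
print(emsrb_protection_levels([400.0, 250.0, 120.0], [20, 35, 60], [8, 12, 18]))
```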
348

A Learning Approach To Obtain Efficient Testing Strategies In Medical Diagnosis

Fakih, Saif 15 March 2004 (has links)
Determining the most efficient use of diagnostic tests is one of the complex issues facing medical practitioners. It is generally accepted that excessive use of tests is common practice in medical diagnosis. Many tests are performed even though the incremental knowledge gained does not affect the course of diagnosis. With the soaring cost of healthcare in the US, there is a critical need to cut the costs of diagnostic tests while achieving a higher level of diagnostic accuracy. Various decision-making tools assisting physicians in diagnosis management have been presented in the literature. One such method, the analytic hierarchy process, utilizes a multilevel structure of decision criteria for sequential pairwise comparison of available test choices. Many of the decision-analytic methods are based on Bayes' theorem and decision trees. These methods use threshold treatment probabilities and performance characteristics of the tests, such as true-positive and false-positive rates, to choose among the available alternatives. Sequential testing approaches tend to elongate the diagnosis process, whereas parallel testing generally involves a higher number of tests. This research is focused on developing a machine learning based methodology for finding an efficient testing strategy for medical diagnosis. The method, based on the patient parameters (both observed and tested), recommends test(s) with the objective of optimizing a measure of performance for the diagnosis process. The performance measure is a combined cost of the testing, the risk and discomfort associated with the tests, and the time taken to reach diagnosis; it also considers the diagnostic ability of the tests. The methodology is developed by combining tools from the fields of data mining (rough set theory, in particular), utility theory, Markov decision processes (MDP), and reinforcement learning (RL). Rough set theory is used to extract diagnostic information in the form of rules from medical databases. Utility theory is used to bring three non-homogeneous measures (cost of testing, risk and discomfort, and diagnostic ability) into one cost-based measure of performance. The MDP framework, along with an RL algorithm, facilitates obtaining efficient testing strategies. The methodology is implemented on a sample problem of diagnosing a Solitary Pulmonary Nodule (SPN). The results obtained are compared with those from four other approaches. It is shown that the RL based methodology holds significant promise for improving the performance of the diagnostic process.
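A hedged sketch of the MDP/RL framing the abstract describes: states as sets of known findings, tests as actions, and a reward folding cost, risk/discomfort, and time into one utility-based measure. All test data, weights, and the simplified one-step update below are illustrative assumptions, not the thesis's implementation:

```python
import random
from collections import defaultdict

# Hypothetical test catalog; attributes mirror the abstract's three measures.
TESTS = {"x_ray":   {"cost": 200,  "risk": 0.1, "time": 1},
         "ct_scan": {"cost": 1200, "risk": 0.3, "time": 2}}

def combined_cost(test, w_cost=0.001, w_risk=2.0, w_time=0.5):
    """Fold monetary cost, risk/discomfort, and time into one scalar."""
    t = TESTS[test]
    return w_cost * t["cost"] + w_risk * t["risk"] + w_time * t["time"]

def choose_test(Q, state, eps=0.1):
    """Epsilon-greedy test selection over learned action values."""
    if random.random() < eps:
        return random.choice(list(TESTS))
    return max(TESTS, key=lambda a: Q[(state, a)])

Q = defaultdict(float)
state = frozenset({"nodule_detected"})       # findings known so far
test = choose_test(Q, state)
reward = -combined_cost(test)                # pay the combined cost of testing
Q[(state, test)] += 0.1 * (reward - Q[(state, test)])  # simplified one-step update
```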
349

A Model for Strategic Bidding in Combined Transmission and Wholesale Energy Markets

Gupte, Sanket 01 July 2004 (has links)
Motivated by deregulation in major service sectors like airlines, banking, and telecommunications, the electric industry is undergoing a major transformation. However, due to design inefficiencies, restructuring of the power sector has so far not been a major success. A lack of comprehensive quantitative models has left market designers unable to evaluate market performance and develop successful market designs. A comprehensive model should include market features like the two-settlement system, transmission congestion, financial transmission rights (FTRs), demand elasticity, demand-side bidding, and other market rules. The contribution of this thesis includes the development of a modeling framework that incorporates the above-mentioned market features, as well as a computationally effective solution methodology. Market designers would use this methodology in developing alternative conceptual market design frameworks and in assessing the impact of various market rules on market performance. The noncooperative bidding behavior of the generators in both the FTR and energy markets is modeled as nonzero-sum stochastic games. Since the bidding strategies in the FTR and energy games depend on each other and jointly impact market performance, a two-tier learning approach is developed, as sketched below. Players (e.g., generators) first bid in the FTR market. FTR bids are then taken into account in the process of selecting bids in the energy market. The FTR bids and the energy bids together decide the market equilibrium and the resulting performance. This performance measure is then used to evaluate the success of the FTR bidding strategy. Several example power networks are studied to demonstrate the modeling and learning based solution approach.
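One plausible shape of that two-tier learning loop, sketched in Python under stated assumptions: the market-clearing function is a placeholder, and the simple value-tracking updates stand in for whatever learning scheme the thesis actually uses.

```python
import random
from collections import defaultdict

# Placeholder market clearing: a real model would solve the FTR auction and
# the energy-market dispatch to get each generator's payoff.
def clear_markets(ftr_bids, energy_bids):
    return {g: random.random() for g in ftr_bids}

def two_tier_learning(generators, ftr_actions, energy_actions,
                      rounds=500, eps=0.1, alpha=0.1):
    v_ftr = defaultdict(float)  # value estimate for (generator, FTR bid)
    v_en = defaultdict(float)   # value estimate for (generator, FTR bid, energy bid)
    for _ in range(rounds):
        # Tier 1: choose FTR bids (epsilon-greedy)
        ftr = {g: random.choice(ftr_actions) if random.random() < eps
                  else max(ftr_actions, key=lambda a: v_ftr[(g, a)])
               for g in generators}
        # Tier 2: choose energy bids conditioned on the chosen FTR bids
        en = {g: random.choice(energy_actions) if random.random() < eps
                 else max(energy_actions, key=lambda a: v_en[(g, ftr[g], a)])
              for g in generators}
        payoff = clear_markets(ftr, en)
        for g in generators:    # both tiers are credited with the joint outcome
            v_ftr[(g, ftr[g])] += alpha * (payoff[g] - v_ftr[(g, ftr[g])])
            v_en[(g, ftr[g], en[g])] += alpha * (payoff[g] - v_en[(g, ftr[g], en[g])])
    return v_ftr, v_en

v_ftr, _ = two_tier_learning(["gen_a", "gen_b"], ["low", "high"],
                             ["low", "mid", "high"])
```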
350

Learning Average Reward Irreducible Stochastic Games: Analysis and Applications

Li, Jun 13 November 2003 (has links)
A large class of sequential decision making problems under uncertainty with multiple competing decision makers/agents can be modeled as stochastic games. Stochastic games having Markov properties are called Markov games or competitive Markov decision processes. This dissertation presents an approach to solving noncooperative stochastic games, in which each decision maker makes her/his own decision independently and each has an individual payoff function. In stochastic games, the environment is nonstationary and each agent's payoff is affected by the joint decisions of all agents, which results in conflicts of interest among the decision makers. In this research, the theory of Markov decision processes (MDPs) is combined with game theory to analyze the structure of Nash equilibria for stochastic games. In particular, the Laurent series expansion technique is used to extend the results of discounted reward stochastic games to average reward stochastic games. As a result, auxiliary matrix games are developed that have equilibrium points and values equivalent to those of a class of stochastic games that are irreducible and have an average reward performance metric. R-learning is a well-known machine learning algorithm that deals with average reward MDPs. The R-learning algorithm is extended to develop a Nash-R reinforcement learning algorithm for obtaining the equivalent auxiliary matrices. A convergence analysis of the Nash-R algorithm is developed from the study of the asymptotic behavior of its two time scale stochastic approximation scheme and the stability of the associated ordinary differential equations (ODEs). The Nash-R learning algorithm is tested and then benchmarked against MDP based learning methods using a well-known grid game. Subsequently, a real-life application of stochastic games in the deregulated power market is explored. According to the current literature, Cournot, Bertrand, and Supply Function Equilibrium (SFE) are the three primary equilibrium models used to evaluate power market designs. SFE is more realistic for pool-type power markets. However, for a complicated power system, the convexity assumption for optimization problems is violated in most cases, which makes the problems more difficult to solve. The SFE concept is adopted in this research, and the generators' behaviors are modeled as a stochastic game instead of a one-shot game. The power market is considered to have features such as multi-settlement (bilateral, day-ahead, and spot markets, and transmission congestion contracts) and demand elasticity. Such a market, consisting of multiple competing suppliers (generators), is modeled as a competitive Markov decision process and studied using the Nash-R algorithm.
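For context, the single-agent R-learning update that Nash-R extends can be written in a few lines. This is a minimal sketch of plain R-learning for average-reward problems; the Nash-R step would replace the max backups with Nash-equilibrium values of the auxiliary matrix games, and that equilibrium computation is omitted here.

```python
from collections import defaultdict

# One R-learning update (Schwartz-style) for average-reward problems:
# Q tracks relative action values; rho tracks the average reward estimate.
def r_learning_step(Q, rho, s, a, r, s_next, actions, alpha=0.1, beta=0.01):
    best_next = max(Q[(s_next, b)] for b in actions)
    best_here = max(Q[(s, b)] for b in actions)
    was_greedy = Q[(s, a)] == best_here      # rho updates only on greedy actions
    Q[(s, a)] += alpha * (r - rho + best_next - Q[(s, a)])
    if was_greedy:
        rho += beta * (r + best_next - best_here - rho)
    return rho

Q, rho = defaultdict(float), 0.0
# One illustrative self-transition with reward 1.0
rho = r_learning_step(Q, rho, "s0", "stay", 1.0, "s0", ["stay", "move"])
print(round(rho, 3), round(Q[("s0", "stay")], 3))  # 0.01 0.1
```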
