  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
451

Prosocial Influences on Nicotine Reinforcement, Reward, and Neural Signaling in Rodent Models

January 2015 (has links)
abstract: Social influences are important determinants of drug initiation in humans, particularly during adolescence and early adulthood. My dissertation tested three hypotheses: 1) conditioned and unconditioned nicotine and social rewards elicit unique patterns of neural signaling in the corticolimbic neurocircuitry when presented in combination versus individually; 2) play behavior is not necessary for expression of social reward; and 3) social context enhances nicotine self-administration. To test the first hypothesis, Fos protein was measured in response to social and nicotine reward stimuli given alone or in combination and in response to environmental cues associated with the rewards in a conditioned place preference (CPP) test. Social-conditioned environmental stimuli attenuated Fos expression in the nucleus accumbens core. A social partner elevated Fos expression in the caudate-putamen, medial and central amygdala, and both nucleus accumbens subregions. Nicotine decreased Fos expression in the cingulate cortex, caudate-putamen, and the nucleus accumbens core. Both stimuli combined elevated Fos expression in the basolateral amygdala and ventral tegmental area, suggesting possible overlap in processing both rewards in these regions. I tested the second hypothesis with an apparatus containing compartments separated by a wire mesh barrier that allowed limited physical contact with a rat or object. While 2 pairings with a partner rat (full physical contact) produced robust CPP, additional pairings were needed for CPP with a partner behind a barrier or physical contact with an object (i.e., tennis ball). The results demonstrate that physical contact with a partner rat is not necessary to establish social-reward CPP. I tested the third hypothesis with duplex operant conditioning chambers separated either by a solid or a wire mesh barrier to allow for social interaction during self-administration sessions. 
Nicotine (0.015 and 0.03 mg/kg, IV) and saline self-administration were assessed in male and female young-adult rats either in the social context or in isolation. Initially, a social context facilitated nicotine intake at the low dose in male rats, but it suppressed intake in later sessions, more strongly in female rats, suggesting that social factors exert strong sex-dependent influences on self-administration. These novel findings highlight the importance of social influences on several nicotine-related behavioral paradigms and associated neurocircuitry. / Dissertation/Thesis / Doctoral Dissertation Psychology 2015
452

Elicitation and planning in Markov decision processes with unknown rewards / Elicitation et planification dans les processus décisionnel de MARKOV avec récompenses inconnues

Alizadeh, Pegah 09 December 2016 (has links)
Markov decision processes (MDPs) are models for solving sequential decision problems in which a user interacts with the environment and adapts her policy by taking numerical reward signals into account. Solving an MDP amounts to describing the user’s behavior in the environment with a policy function that specifies which action to choose in each situation. In many real-world decision problems, users have different preferences, so the gains of actions on states differ and must be re-decoded for each user. In this dissertation, we are interested in solving MDPs for users with different preferences. We use a model named Vector-valued MDP (VMDP) with vector rewards. We propose a propagation-search algorithm that assigns a vector-valued value function to each policy and characterizes each user by a preference vector over the set of value functions, where the preference vector satisfies the user’s priorities. Since the user’s preference vector is not known, we present several methods for solving VMDPs while approximating the user’s preference vector. We introduce two algorithms that reduce the number of queries needed to find the optimal policy of a user: 1) a propagation-search algorithm, in which we propagate a set of possibly optimal policies for the given MDP without knowing the user’s preferences; 2) an interactive value iteration (IVI) algorithm on VMDPs, named Advantage-based Value Iteration (ABVI), which uses clustering and regrouping of advantages. We also demonstrate how the ABVI algorithm works properly for two different types of users: confident and uncertain. Finally, we work on a minimax regret approximation method for finding the optimal policy given limited information about the user’s preferences. All possible objectives in the system are simply bounded between upper and lower bounds, while the system is not aware of the user’s preferences among them. We propose a heuristic minimax regret approximation method for solving MDPs with unknown rewards that is faster and less complex than the existing methods in the literature.
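The minimax regret criterion mentioned in this abstract can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes the vector-valued value of each policy is already known, that a user's scalar value is the dot product of that vector with a preference (weight) vector, and that the unknown preferences are represented by a small finite set of candidates. All names (`scalar_value`, `max_regret`, `minimax_regret_policy`) and the numbers are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Vector = Tuple[float, ...]


def scalar_value(vec_value: Vector, weights: Vector) -> float:
    """Scalarize a vector-valued policy value with a preference vector."""
    return sum(v * w for v, w in zip(vec_value, weights))


def max_regret(policy: str,
               policy_values: Dict[str, Vector],
               candidate_weights: Sequence[Vector]) -> float:
    """Worst-case loss of `policy` versus the best policy, over all candidates."""
    worst = 0.0
    for w in candidate_weights:
        best = max(scalar_value(v, w) for v in policy_values.values())
        regret = best - scalar_value(policy_values[policy], w)
        worst = max(worst, regret)
    return worst


def minimax_regret_policy(policy_values: Dict[str, Vector],
                          candidate_weights: Sequence[Vector]) -> str:
    """Pick the policy whose worst-case regret is smallest."""
    return min(policy_values,
               key=lambda p: max_regret(p, policy_values, candidate_weights))


# Illustrative (made-up) vector values for three policies over two objectives.
policy_values = {"A": (10.0, 0.0), "B": (0.0, 10.0), "C": (6.0, 6.0)}
# Assumed candidate preference vectors standing in for the unknown user.
candidate_weights = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

chosen = minimax_regret_policy(policy_values, candidate_weights)
```

With these numbers the balanced policy "C" is chosen: it is never the best for an extreme user, but its worst-case loss is smaller than that of the specialized policies.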
453

Aprendizado por reforço multiagente : uma avaliação de diferentes mecanismos de recompensa para o problema de aprendizado de rotas / Multiagent reinforcement learning : an evaluation of different reward mechanisms for the route learning problem

Grunitzki, Ricardo January 2014 (has links)
This dissertation presents a study on the effects of different reward functions applied to multiagent reinforcement learning for the vehicle routing problem in traffic networks. Two reward functions that differ in the alignment of the numerical signal sent from the environment to the agent are addressed. The first function, called the individual function, is aligned with the individual utility of the agent (vehicle or driver) and seeks to minimize its travel time. The second function, called difference rewards, is aligned with the global utility of the system and aims to minimize the average travel time on the network (the average travel time of all drivers). Both approaches are applied to two vehicle routing scenarios that differ in the number of learning drivers, network topology and, consequently, level of complexity. These approaches are compared with three traffic assignment techniques from the literature. Results show that reinforcement learning-based methods outperform traffic assignment methods. Furthermore, aligning the reward function with the global utility provides a significant improvement in results compared with the individual function. However, for the scenario with the larger number of agents learning simultaneously, both approaches yield equivalent solutions.
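The contrast between the two reward functions can be sketched in a few lines. This is a toy illustration, not the dissertation's implementation: it assumes the common formulation of difference rewards as the system utility with the agent present minus the system utility with the agent removed, takes the system utility to be the negative total travel time, and uses a made-up two-route network with linear congestion. The names `ROUTES`, `travel_time`, `individual_reward` and `difference_reward` are hypothetical.

```python
from collections import Counter
from typing import List

# Hypothetical network: route -> (free-flow time, extra minutes per vehicle).
ROUTES = {"short": (5.0, 2.0), "long": (10.0, 0.5)}


def travel_time(route: str, flow: int) -> float:
    """Travel time on a route given the number of vehicles using it."""
    free_flow, slope = ROUTES[route]
    return free_flow + slope * flow


def total_travel_time(choices: List[str]) -> float:
    """Sum of travel times over all drivers (assumed system cost)."""
    flows = Counter(choices)
    return sum(travel_time(route, n) * n for route, n in flows.items())


def individual_reward(choices: List[str], i: int) -> float:
    """Reward aligned with the agent's own utility: minus its travel time."""
    flows = Counter(choices)
    return -travel_time(choices[i], flows[choices[i]])


def difference_reward(choices: List[str], i: int) -> float:
    """System utility with agent i minus system utility without agent i."""
    without_i = choices[:i] + choices[i + 1:]
    return -total_travel_time(choices) + total_travel_time(without_i)


choices = ["short", "short", "long"]
own = individual_reward(choices, 0)       # -9.0: driver 0's own travel time
shared = difference_reward(choices, 0)    # -11.0: also counts delay imposed on others
```

In this toy case the difference reward is lower than the individual reward because it also charges driver 0 for the two extra minutes its presence adds to the other driver on the same route.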
454

Perturbing Neural Feedback Loops to Understand the Relationships of Their Parts

January 2014 (has links)
abstract: The basal ganglia are four sub-cortical nuclei associated with motor control and reward learning. They are part of numerous larger, mostly segregated loops in which the basal ganglia receive inputs from specific regions of cortex. Converging on these inputs are dopaminergic neurons that alter their firing based on received and/or predicted rewarding outcomes of a behavior. The basal ganglia's output feeds through the thalamus back to the areas of the cortex where the loop originated. Understanding the dynamic interactions between the various parts of these loops is critical to understanding the basal ganglia's role in motor control and reward-based learning. This work developed several experimental techniques that can be applied to further study basal ganglia function. The first technique used micro-volume injections of low-concentration muscimol to decrease the firing rates of recorded neurons in a limited area of cortex in rats. Afterwards, an artificial cerebrospinal fluid flush was injected to rapidly eliminate the muscimol's effects. This technique was able to contain the effects of muscimol to a volume of approximately 1 mm radius and limited the duration of the drug effect to less than one hour. This technique could be used to temporarily perturb a small portion of the loops involving the basal ganglia and then observe how these effects propagate in other connected regions. The second technique applied self-organizing maps (SOM) to find temporal patterns in neural firing rate that are independent of behavior. The distribution of detected pattern frequencies on these maps can then be used to determine whether neural activity changes over time. The final technique focused on the role of the basal ganglia in reward learning. A new conditioning technique was created to increase the occurrence of selected patterns of neural activity without utilizing any external reward or behavior. A pattern of neural activity in the cortex of rats was selected using an SOM. The pattern was then reinforced by being paired with electrical stimulation of the medial forebrain bundle, triggering dopamine release in the basal ganglia. Ultimately, this technique proved unsuccessful, possibly due to poor selection of the patterns being reinforced. / Dissertation/Thesis / Ph.D. Bioengineering 2014
455

Efeitos da exposição ao etanol em camundongos adolescentes e adultos: comportamentos relacionados à recompensa, sensibilização comportamental e o papel dos sistemas dopaminérgico e glutamatérgico. / Effects of ethanol exposure in adolescent and adult mice: behaviors related to reward, behavioral sensitization and the role of the dopaminergic and glutamatergic systems.

Priscila Fernandes Carrara do Nascimento 16 September 2011 (has links)
The use of drugs of abuse begins mainly in adolescence, a period in which the central nervous system is undergoing neurochemical and neuroanatomical maturational changes. Thus, the aim of this project was to evaluate the behavioral and neurochemical effects of ethanol exposure in adolescent and adult mice. The present project investigated: the development of behavioral sensitization to ethanol in adolescent and adult mice; possible alterations in dopaminergic and glutamatergic signaling in the nucleus accumbens and striatum of mice sensitized to ethanol during adolescence; participation of the cAMP/PKA pathway in behavioral sensitization to ethanol in adolescent and adult mice; reinforcing effects of ethanol in adolescent and adult mice in a self-administration model; reinforcing effects of ethanol in the conditioned place preference apparatus, which allows quantification of the time the animal spends in the environment associated with the drug's effects; and the effect of chronic intoxication followed by withdrawal on the consumption pattern.
457

Aferências hipotalâmicas para a área tegmental ventral, núcleo tegmental rostromedial e núcleo dorsal da rafe. / Hypothalamic afferents to the ventral tegmental area, rostromedial tegmental nucleus and dorsal raphe nucleus.

Leandro Bueno Lima 23 June 2015 (has links)
The hypothalamus modulates behaviors related to motivation, reward and punishment via projections to the ventral tegmental area (VTA), dorsal raphe nucleus (DR), and rostromedial tegmental nucleus (RMTg). In this study, we used retrograde tracing methods to investigate hypothalamic inputs to the VTA, DR, and RMTg, and whether individual hypothalamic neurons project to more than one of these structures. We also determined a possible GABAergic or glutamatergic phenotype of hypothalamic afferents by combining retrograde tracing with in situ hybridization methods. We found that the VTA, DR, and RMTg receive a very similar set of hypothalamic afferents originating from glutamatergic and GABAergic hypothalamic projection neurons, the majority of which (> 90%) innervate only one of these structures. Our findings indicate that hypothalamic inputs are important sources of homeostatic signals for the VTA, DR, and RMTg. These inputs exhibit a high degree of heterogeneity, which allows the three structures to be activated or inhibited either independently or jointly.
458

Chasing the Greater Good : A Qualitative study in how Sustainability is steered and implemented through Control Systems

Jansson, Andreas, Lind, Ebba January 2018 (has links)
Today's society is a vulnerable society. Climate change and social injustices are genuine threats to the lives of future generations, and humans are the cause of these problems. Sustainable development needs to be achieved if we want future generations to have a good and dignified existence. Companies have a significant impact on society, and therefore play an essential role in how successful we are in achieving sustainable development. Past research has found that control systems, especially reward systems, can be important for implementing sustainability in companies. Therefore, the purpose of this study is to gain a deeper understanding of how organizations implement and steer sustainability by including it in their control systems and, in particular, how reward systems are used in the context of sustainability. By studying past research and theories regarding sustainable development, management control, performance measurements and reward systems, we have gained knowledge and an understanding of how these areas are connected. To achieve the purpose of this study, we conducted a qualitative study comprising eleven semi-structured interviews with managers and sustainability experts from six different companies. Based on the empirical findings of the study, we were able to draw several conclusions. First, sustainability is integrated into already existing control systems, which often interact and are related to each other, and we found that there is a balance between social and environmental dimensions. Companies choose to focus on sustainability issues where they have the most negative impact, and they include sustainability mostly in their cybernetic and administrative controls, for instance through performance measurements (KPIs) and policy documents. Also, several companies incorporate sustainability into their reward systems, such as rewards for safety parameters and for reducing CO₂ emissions. The kinds of rewards given for sustainability performance differ: some companies give monetary rewards while others give non-monetary ones. In the findings, non-monetary rewards are presented as the most valuable for sustainability. We can also conclude that there are challenges with rewarding sustainability, but many respondents have overcome them and use reward systems as a way of implementing sustainability. Those who do not reward sustainability today can imagine doing so in the near future.
459

Étude des modifications à long terme induites par la prise chronique de cocaïne : approches anatomique, métabolique et comportementale / Long-term alterations induced by chronic cocaine intake revealed by anatomical, metabolic and behavioral approaches

Nicolas, Céline 18 December 2014 (has links)
Drug addiction is a chronic psychiatric disorder that represents a major public health problem. Although important advances have led to a better understanding of the cerebral modifications induced by drugs of abuse, therapies remain limited. Therefore, investigating the cerebral processes that underlie the long-term risk of relapse appears central to developing new therapeutic strategies. Part of this thesis aims to determine the long-term cerebral modifications induced during withdrawal following chronic voluntary cocaine intake. Our first study revealed a reduction in cerebral vascular density during early withdrawal, localized exclusively in the cingulate cortex. In our second study, we found that cocaine intake leads to modifications of cerebral metabolism that evolve during withdrawal. After one month of withdrawal, a period when the incubation of craving phenomenon is observed, we found a decrease in cortical and striatal metabolism and a hyperactivation of the amygdala, which demonstrates a persistent dysregulation of brain function. Finally, in our third study, we characterized the mechanism underlying the anti-craving effect of the enriched environment (EE). We hypothesized that the EE acts as an alternative reward to decrease cocaine-seeking behavior, so we tested the effects of exposure to sucrose and to physical exercise on drug-seeking behavior. We showed that access to an alternative reward during withdrawal does not reduce cocaine seeking, which suggests that the EE does not act exclusively as an alternative reward.
460

Reforming a publicly owned monopoly : costs and incentives in railway maintenance

Odolinski, Kristofer January 2015 (has links)
The railway system is often considered to be an industry where a monopoly occurs “naturally”, which can explain the public ownership and the use of regulations. However, railways in Europe have been subject to reforms during the last three decades. The use of tendering has increased, which is a way of introducing competition for the market in absence of competition within the market. Still, contracting out services previously produced in-house places a heavy burden on the client, where contract design and its incentive structures can be decisive for the outcome of the reform. This dissertation provides empirical evidence on costs and incentives in a publicly owned monopoly that is subject to reforms, namely the provision of railway maintenance in Sweden. Essay 1 estimates the effect of exposing rail infrastructure maintenance to competitive tendering. The results show that this reform reduced maintenance costs in Sweden by around 11 per cent over the period 1999-2011, without any associated fall in the available measures of quality. Essay 2 estimates the relative cost efficiency between and within maintenance regions in Sweden. The results indicate considerable efficiency gaps together with economies of scale not being fully exploited. Essay 3 analyses the effect of incentive structures in railway maintenance contracts. An increase in the power of the incentive scheme reduces the number of infrastructure failures according to the results. In addition, the estimated effect of the performance incentive schemes suggests that more effort towards preventing train delays is made at the expense of preventing other failures. Essay 4 comprises an estimation of marginal costs of rail maintenance. The static model produces slightly lower marginal costs compared to previous estimates on Swedish data. The results from the dynamic model show that an increase in maintenance costs in year t - 1 predicts an increase in maintenance costs in year t. 
Indeed, there is an intertemporal effect that depends on the performed maintenance activities (governed by the contract design).
