Global ETD Search

1	Modèles bayésiens d'inférence séquentielle chez l'humain / Bayesian models of human online inference Prat-Carrabin, Arthur 28 November 2017
In past decades, the Bayesian paradigm has gained traction as an elegant and mathematically principled account of human behavior in inference tasks. Yet it does not account for the sub-optimality, variability, and systematic biases found in human behavior. Moreover, the brain must sequentially update its beliefs as new information arrives, in natural environments that usually change over time and have a temporal structure. We investigate human online inference with an experimental task. Our data show that humans make use of subtle aspects of the temporal statistics of signals in online inference, and that the magnitude of the variability in responses itself depends on the inference process. We examine how a broad family of 27 suboptimal models, each capturing a cognitive limitation that causes deviations from optimality, can account for human behavior. The variability in responses is reproduced by models that approximate the posterior through random sampling during inference, and by models that select responses by sampling the posterior instead of maximizing it. Model fitting supports the former scenario more strongly, suggesting that the brain approximates the Bayesian posterior using a small number of random samples. In the last part of our work, we turn to "sequential effects", biases in which human subjects form erroneous expectations about a random signal. We assume that subjects infer the statistics of the signal, but that this inference is subject to a cognitive cost, leading to non-trivial behaviors.
Taken together, our results demonstrate, in the ecological case of online inference, how deviations from the optimal Bayesian model, based on cognitive limitations, can account for the sub-optimality, variability, and systematic biases observed in human behavior.
Inference, Online inference, Bayesian inference, Decision-making, Sequential effects, Cognitive neuroscience, 616.8
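The two sources of response variability contrasted in this abstract can be illustrated with a toy example. The sketch below is not the thesis's task or any of its models; it assumes a static Bernoulli signal, a flat prior on a grid, and hypothetical function names. It compares a deterministic response that maximizes the posterior with a noisy response built from a small number of posterior samples.

```python
import random
import statistics

def grid_posterior(observations, grid_size=101):
    """Posterior over a Bernoulli rate p on a uniform grid (flat prior assumed)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    weights = [1.0 / grid_size] * grid_size
    for x in observations:
        # Online Bayesian update: multiply by the likelihood of the new observation.
        weights = [w * (p if x else 1.0 - p) for w, p in zip(weights, grid)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to avoid underflow
    return grid, weights

def map_response(grid, weights):
    """Deterministic response: the grid value that maximizes the posterior."""
    best = max(range(len(grid)), key=weights.__getitem__)
    return grid[best]

def sampled_response(grid, weights, n_samples, rng):
    """Noisy response: the mean of a few random samples from the posterior."""
    samples = rng.choices(grid, weights=weights, k=n_samples)
    return sum(samples) / n_samples

def response_spread(grid, weights, n_samples, rng, repeats=200):
    """Standard deviation of the sampled response over repeated trials."""
    responses = [sampled_response(grid, weights, n_samples, rng) for _ in range(repeats)]
    return statistics.pstdev(responses)

observations = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]  # made-up data: 7 ones, 3 zeros
grid, weights = grid_posterior(observations)
rng = random.Random(0)
print("MAP response:", map_response(grid, weights))
print("spread with 1 sample:  ", response_spread(grid, weights, 1, rng))
print("spread with 50 samples:", response_spread(grid, weights, 50, rng))
```

In this toy setting, the response built from few samples varies from trial to trial while the maximizing response does not, and the spread shrinks as more samples are used.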
2	ONLINE STATISTICAL INFERENCE FOR LOW-RANK REINFORCEMENT LEARNING Qiyu Han (18284758) 01 April 2024
We propose a fully online procedure for statistical inference with adaptively collected data. The low-rank structure of the model parameter and the adaptive nature of the data collection process make this task challenging: standard low-rank estimators are biased and cannot be obtained sequentially, while existing inference approaches for sequential decision-making algorithms fail to account for the low-rank structure and are also biased. To address these challenges, we first develop an online low-rank estimation procedure that applies Stochastic Gradient Descent to noisy observations. Next, to enable statistical inference with the online low-rank estimator, we introduce a novel online debiasing technique that addresses both sources of bias simultaneously. This method yields an unbiased estimator suitable for parameter inference. Finally, we develop an inferential framework that provides an online estimator for inference on the optimal policy value. On the theoretical side, we establish the asymptotic normality of the proposed online debiased estimators and prove the validity of the constructed confidence intervals for both inference tasks. Our inference results build on a newly developed low-rank stochastic gradient descent estimator and its non-asymptotic convergence result, which is also of independent interest.
Context learning, Reinforcement learning, Statistical data science, Online Decision-Making, Low-rank matrix estimation, Stochastic Gradient Descent (SGD), Reinforcement Learning, Online inference
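As a rough illustration of online low-rank estimation by stochastic gradient descent from noisy observations, the sketch below streams noisy single-entry observations of a rank-1 matrix and updates two factor vectors one observation at a time. It is a generic factored SGD under assumed settings (rank 1, uniformly sampled entries, constant step size, hypothetical function names), not the estimator developed in the thesis, and it omits the debiasing and confidence-interval steps entirely.

```python
import random

def online_rank1_sgd(true_a, true_b, steps=20000, lr=0.05, noise_sd=0.1, seed=0):
    """Estimate the factors of the rank-1 matrix a b^T from a stream of noisy entries."""
    rng = random.Random(seed)
    n, m = len(true_a), len(true_b)
    u = [0.5] * n  # assumed initial guess for the row factor
    v = [0.5] * m  # assumed initial guess for the column factor
    for _ in range(steps):
        # One observation arrives: a uniformly chosen entry plus Gaussian noise.
        i, j = rng.randrange(n), rng.randrange(m)
        y = true_a[i] * true_b[j] + rng.gauss(0.0, noise_sd)
        # Gradient step on the squared error 0.5 * (u_i * v_j - y)^2.
        residual = u[i] * v[j] - y
        u_i_old = u[i]
        u[i] -= lr * residual * v[j]
        v[j] -= lr * residual * u_i_old
    return u, v

def mean_abs_error(u, v, true_a, true_b):
    """Average absolute entrywise error between u v^T and a b^T."""
    errors = [abs(ui * vj - ai * bj)
              for ui, ai in zip(u, true_a)
              for vj, bj in zip(v, true_b)]
    return sum(errors) / len(errors)

a = [1.0, 1.1, 1.2, 1.3, 1.4]  # made-up ground-truth factors
b = [1.4, 1.3, 1.2, 1.1, 1.0]
u, v = online_rank1_sgd(a, b)
print("error at initialization:", mean_abs_error([0.5] * 5, [0.5] * 5, a, b))
print("error after streaming:  ", mean_abs_error(u, v, a, b))
```

Only the product of the two factors is identifiable, so the error is measured on the reconstructed matrix rather than on the factors themselves.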
