1 |
Bayesian Model Selection for Spatial Data and Cost-constrained Applications
Porter, Erica May, 03 July 2023
Bayesian model selection is a useful tool for identifying an appropriate model class, dependence structure, and valuable predictors for a wide variety of applications. In this work we consider objective Bayesian model selection where no subjective information is available to inform priors on model parameters a priori, specifically in the case of hierarchical models for spatial data, which can have complex dependence structures. We develop an approach using trained priors via fractional Bayes factors where standard Bayesian model selection methods fail to produce valid probabilities under improper reference priors. This enables researchers to concurrently determine whether spatial dependence between observations is apparent and identify important predictors for modeling the response. In addition to model selection with objective priors on model parameters, we also consider the case where the priors on the model space are used to penalize individual predictors a priori based on their costs. We propose a flexible approach that introduces a tuning parameter to cost-penalizing model priors that allows researchers to control the level of cost penalization to meet budget constraints and accommodate increasing sample sizes. / Doctor of Philosophy / Spatial data, such as data collected over a geographic region, is relevant in many fields. Spatial data can require complex models to study, but use of these models can impose unnecessary computations and increased difficulty for interpretation when spatial dependence is weak or not present. We develop a method to simultaneously determine whether a spatial model is necessary to understand the data and choose important variables associated with the outcome of interest. Within a class of simpler, linear models, we propose a technique to identify important variables associated with an outcome when there exists a budget or general desire to minimize the cost of collecting the variables.
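For readers unfamiliar with the term, the fractional Bayes factor mentioned in this abstract can be written in its standard textbook form as below. This is a reference sketch only: the symbols (data $y$, training fraction $b$, priors $\pi_i$, likelihoods $L_i$) are assumed notation, not taken from the thesis, and the thesis may use a different parameterization.

```latex
% Fractional Bayes factor comparing model 1 to model 2, with 0 < b < 1.
B^{F}_{12}(b) = \frac{q_1(b, y)}{q_2(b, y)},
\qquad
q_i(b, y) =
  \frac{\int \pi_i(\theta_i)\, L_i(\theta_i \mid y)\, d\theta_i}
       {\int \pi_i(\theta_i)\, L_i(\theta_i \mid y)^{b}\, d\theta_i}
```

Because an improper prior $\pi_i$ appears in both the numerator and the denominator of $q_i$, its arbitrary normalizing constant cancels, which is why this construction can yield usable model probabilities where ordinary Bayes factors are undefined.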
|
2 |
Empirical Bayes Model Averaging in the Presence of Model Misfit
Wang, Junyan, January 2016
No description available.
|
3 |
Empirical Analysis of User Passwords across Online Services
Wang, Chun, 05 June 2018
Leaked passwords from data breaches can pose a serious threat if users reuse or slightly modify the passwords for other services. With more and more online services getting breached today, there is still a lack of large-scale quantitative understanding of the risks of password reuse and modification. In this project, we perform the first large-scale empirical analysis of password reuse and modification patterns using a ground-truth dataset of 28.8 million users and their 61.5 million passwords in 107 services over 8 years. We find that password reuse and modification is a very common behavior (observed on 52% of the users). More surprisingly, sensitive online services such as shopping websites and email services received the most reused and modified passwords. We also observe that users would still reuse the already-leaked passwords for other online services for years after the initial data breach. Finally, to quantify the security risks, we develop a new training-based guessing algorithm. Extensive evaluations show that more than 16 million password pairs (30% of the modified passwords and all the reused passwords) can be cracked within just 10 guesses. We argue that more proactive mechanisms are needed to protect user accounts after major data breaches. / Master of Science / Since most of the internet services use text-based passwords for user authentication, the leaked passwords from data breaches pose a serious threat, especially if users reuse or slightly modify the passwords for other services. The attacker can leverage a known password from one site to guess the same user’s passwords at other sites more easily. In this project, we perform the first large-scale study of password usage based on the largest ever leaked password dataset. The dataset consists of 28.8 million users and their 61.5 million passwords from 107 internet services over 8 years. We find that password reuse and modification is a very common behavior (observed on 52% of the users). 
More surprisingly, we find that sensitive online services such as shopping websites and email services received the most reused and modified passwords. In addition, users would still reuse the already-leaked passwords for other online services for years after the initial data breach. Finally, we develop a cross-site password-guessing algorithm to guess the modified passwords based on one of the user’s leaked passwords. Our password guessing experiments show that 30% of the modified passwords can be cracked within only 10 guesses. Therefore, we argue that more proactive mechanisms are needed to protect user accounts after major data breaches.
|
4 |
Fishing Economic Growth Determinants Using Bayesian Elastic Nets
Hofmarcher, Paul; Crespo Cuaresma, Jesus; Grün, Bettina; Hornik, Kurt, 09 1900
We propose a method to deal simultaneously with model uncertainty and correlated regressors in linear regression models by combining elastic net specifications with a spike and slab prior. The estimation method nests ridge regression and the LASSO estimator and thus allows for a more flexible modelling framework than existing model averaging procedures. In particular, the proposed technique has clear advantages when dealing with datasets of (potentially highly) correlated regressors, a pervasive characteristic of the model averaging datasets used hitherto in the econometric literature. We apply our method to the dataset of economic growth determinants by Sala-i-Martin et al. (Sala-i-Martin, X., Doppelhofer, G., and Miller, R. I. (2004). Determinants of Long-Term Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach. American Economic Review, 94: 813-835) and show that our procedure has superior out-of-sample predictive abilities as compared to the standard Bayesian model averaging methods currently used in the literature. (authors' abstract) / Series: Research Report Series / Department of Statistics and Mathematics
|
5 |
Chinese Stock Markets: Underperformance and its Determinants
Kováč, Roman, January 2015
The performance of stock markets is determined by three classes of variables: macroeconomic indicators, industry and firm heterogeneity, and third-country effects. When assessing the performance of a stock market index, the impact of industry and firm heterogeneity is marginal, as it is already embedded in the index through its constituent companies. This paper therefore focuses on the other two. The Chinese stock market was selected as an application because many researchers consider its performance inferior to that of other domestic indicators (mainly GDP growth). Using an econometric framework for panel data and a Bayesian extension, the paper estimates multiple models of Chinese stock market performance and examines its individual determinants. It then predicts the theoretical prices of the two main Chinese stock indices on two time samples up to 2013, and demonstrates the underperformance of the Chinese stock market by comparing the modeled prices with the actual prices realized on the market. JEL Classification: C23, C51, C53, G15, G17. Keywords: underperformance, panel data, fixed effects model, Bayesian Model Averaging. Author's e-mail: roman_kovac@ymail.com. Supervisor's e-mail: karel.bata@seznam.cz
|
6 |
Determinants of Economic Growth: A Bayesian Model Averaging
Kudashvili, Nikoloz, January 2013
Master thesis. The paper estimates the determinants of economic growth across 72 countries using Bayesian Model Averaging. Unlike other studies, we include the debt-to-GDP ratio as an explanatory variable among 29 growth determinants. For given values of the other variables, the debt-to-GDP ratio is positively related to the growth rate up to a threshold level. The coefficient on the ratio has a posterior inclusion probability of nearly 0.8, suggesting that the debt-to-GDP ratio is an important long-term growth determinant. We find that the initial level of GDP, life expectancy and equipment investment, together with the debt-to-GDP ratio, have a strong effect on the GDP per capita growth rate.
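The posterior inclusion probability quoted in this abstract is, in Bayesian Model Averaging generally, the total posterior probability of all models that contain a given predictor. The sketch below illustrates that calculation only. The four models, the predictor names, the log marginal likelihood values and the uniform prior over models are all hypothetical assumptions for illustration; none of them come from the thesis.

```python
import math

# Hypothetical log marginal likelihoods for four candidate models,
# keyed by the set of predictors each model includes (illustrative numbers only).
log_evidence = {
    frozenset(): -105.0,
    frozenset({"debt_to_gdp"}): -101.0,
    frozenset({"initial_gdp"}): -102.0,
    frozenset({"debt_to_gdp", "initial_gdp"}): -100.0,
}


def posterior_model_probs(log_ev):
    """Posterior model probabilities under an assumed uniform prior over models."""
    m = max(log_ev.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - m) for k, v in log_ev.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def inclusion_probs(model_probs):
    """PIP of a predictor = summed posterior mass of the models that contain it."""
    predictors = set().union(*model_probs)
    return {
        p: sum(pr for model, pr in model_probs.items() if p in model)
        for p in predictors
    }


probs = posterior_model_probs(log_evidence)
pips = inclusion_probs(probs)
for name in sorted(pips):
    print(f"{name}: {pips[name]:.3f}")
```

With 29 candidate determinants, as in the abstract, full enumeration of all models is infeasible, so applied work typically samples the model space instead; the summation that defines the inclusion probability stays the same.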
|
7 |
What Makes a Good Feature?
Richards, W.; Jepson, A., 01 April 1992
Using a Bayesian framework, we place bounds on just what features are worth computing if inferences about world properties are to be made from image data. Others have previously proposed that useful features reflect "non-accidental" or "suspicious" configurations (such as parallel or collinear lines). We make these notions more precise and show them to be context sensitive.
|
8 |
The economic determinants of entrepreneurial activity: evidence from a Bayesian approach: a thesis presented in partial fulfilment of the requirements for the degree of Master of Business Studies in Financial Economics at Massey University
Winata, Sherly, January 2008
In this paper we investigate the economic, political, institutional, and societal factors that encourage entrepreneurial activity. We do so by applying Bayesian Model Averaging, which controls for model uncertainty, to a panel data set for 33 countries. Our results indicate that the general state of macroeconomic activity, the availability of financing, the level of human capital, fiscal policies implemented and the type of economic system are the main determinants of the level of entrepreneurship. We also document a non-linear, U-shaped relation between distortionary taxation and entrepreneurial activity. Keywords: Entrepreneurship, Entrepreneurial Activity, Total Early-Stage Activity (TEA), Global Entrepreneurial Monitor (GEM), Bayesian Model Averaging (BMA), Panel Estimation. JEL Classification: B30, B53, C11, C23, J20, M13, O10, O40
|
9 |
[en] DYNAMIC BAYESIAN MODEL FOR A TRUNCATED NORMAL / [pt] MODELO DINÂMICO BAYESIANO PARA A DENSIDADE NORMAL TRUNCADA
MONICA BARROS, 08 May 2006
This thesis describes a Dynamic Bayesian Model for a Truncated Normal distribution. The classical, static problem of estimating the parameters of the original Normal distribution was treated by A.C. Cohen in the 1950s and 1960s. R.C. Souza (1978) described in his doctoral thesis a Dynamic Bayesian Model for this distribution, in which Information Theory concepts were used. The present thesis extends the dynamic Bayesian formulation of West, Harrison and Migon by considering a distribution that is not a member of the exponential family. Moreover, we extend the results derived by Souza by dropping the assumption of a steady-state model, that is, we no longer assume that the series is stationary. Some real and simulated series are analyzed and, in particular, we compare our results with those obtained by Souza.
|
10 |
[en] LINEAR GROWTH BAYESIAN MODEL USING DISCOUNT FACTORS / [pt] MODELO BAYESIANO DE CRESCIMENTO LINEAR COM DESCONTOS
CRISTIANO AUGUSTO COELHO FERNANDES, 17 November 2006
The aim of this thesis is to describe and discuss in detail the Multiprocess Linear Growth Bayesian Model for seasonal and/or nonseasonal series, using discount factors. The original formulation of this model was put forward recently by Ameen and Harrison. In the first part of the thesis (chapters 2 and 3) we present some general concepts related to time series and the main time series models in the literature. The second part (chapters 4, 5 and 6) covers general concepts of Bayesian statistics, the Dynamic Linear Model (DLM) in its original formulation, and the Bayesian formulation of the proposed model. A flow chart and some suggested operational parameter settings, intended to support a future computational implementation, are also presented.
|