1 |
Modeling the NCAA Tournament Through Bayesian Logistic Regression
Nelson, Bryan 18 July 2012
Many rating systems exist that order the Division I teams in Men's College Basketball that compete in the NCAA Tournament, such as seeding teams on an S-curve and the Pomeroy and Sagarin ratings. These systems reduce the process of choosing winners to a comparison of two numbers. Rather than creating a rating system, we analyze each matchup by using the differences between the teams' individual regular season statistics as the independent variables. We use an MCMC approach and logistic regression, along with several model selection techniques, to arrive at models for predicting the winner of each game. When given the 63 actual games in the 2012 tournament, eight of our models performed as well as Pomeroy's rating system and four did as well as Sagarin's rating system. Not allowing the models to fix their mistakes resulted in only one model outperforming both Pomeroy's and Sagarin's systems. / McAnulty College and Graduate School of Liberal Arts / Computational Mathematics / MS / Thesis
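The abstract above does not give the model details, so the following is only a minimal sketch of the general technique it names: a logistic regression on a statistic difference, fitted with a random-walk Metropolis sampler (one simple MCMC method). The single predictor, the Gaussian prior, the no-intercept form, and the synthetic data are all assumptions made for illustration, not the author's setup.

```python
import math
import random


def log_posterior(beta, diffs, wins, prior_sd=10.0):
    """Log posterior (up to a constant) of a no-intercept logistic model."""
    lp = -0.5 * (beta / prior_sd) ** 2  # assumed Gaussian prior on beta
    for d, w in zip(diffs, wins):
        z = beta * d
        # Bernoulli log-likelihood w*z - log(1 + e^z), in a stable form
        lp += w * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return lp


def metropolis(diffs, wins, n_iter=5000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for the single coefficient beta."""
    rng = random.Random(seed)
    beta = 0.0
    current = log_posterior(beta, diffs, wins)
    samples = []
    for _ in range(n_iter):
        proposal = beta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, diffs, wins)
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        if math.log(1.0 - rng.random()) < candidate - current:
            beta, current = proposal, candidate
        samples.append(beta)
    return samples[n_iter // 2:]  # discard the first half as burn-in


def win_probability(diff, draws):
    """Posterior mean probability that the first team wins."""
    return sum(1.0 / (1.0 + math.exp(-b * diff)) for b in draws) / len(draws)


# Synthetic matchups: diff = (team A statistic) - (team B statistic)
data_rng = random.Random(1)
true_beta = 0.8  # hypothetical value used only to generate data
diffs = [data_rng.gauss(0.0, 2.0) for _ in range(200)]
wins = [1 if data_rng.random() < 1.0 / (1.0 + math.exp(-true_beta * d)) else 0
        for d in diffs]

draws = metropolis(diffs, wins)
posterior_mean = sum(draws) / len(draws)
```

Omitting the intercept makes the model symmetric: swapping the two teams flips the sign of the difference and turns a predicted probability p into 1 - p, so two evenly matched teams get exactly 0.5.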
|
2 |
Identification de la zone regardée sur un écran d'ordinateur à partir du flou / Identifying the region looked at on a computer screen from blur
Néron, Eric January 2017
When it comes to understanding a person's behavior, gaze is an important source of information. Analyzing the behavior of consumers or criminals, or certain cognitive states, involves interpreting gaze within a scene over time. There is a real need to identify the region a user is looking at on a screen or any other medium. To do this, human vision combines several images to understand the three-dimensional relationship between the objects and the scene. The 3D perception of a real scene therefore relies on several images. But what happens when there is only a single image?
|
3 |
A Bayesian Approach to Missile Reliability
Redd, Taylor Hardison 01 June 2011
Each year, the United States government spends billions of dollars on missiles and munitions. It is therefore vital to have a dependable method to estimate the reliability of these missiles. Such a method must take into account the age of the missile, the reliability of its different components, and the impact of different launch phases on missile reliability. It is also important to estimate missile performance under a variety of test conditions, or modalities. Bayesian logistic regression is used to make these estimates accurately. This project presents both previously proposed methods and ways to combine these methods to accurately estimate the reliability of the Cruise Missile.
|
4 |
The influence of probability of detection when modeling species occurrence using GIS and survey data
Williams, Alison Kay 12 April 2004
I compared the performance of habitat models created from data of differing reliability. Because the reliability is dependent on the probability of detecting the species, I experimented to estimate detectability for a salamander species. Based on these estimates, I investigated the sensitivity of habitat models to varying detectability.
Models were created using a database of amphibian and reptile observations at Fort A.P. Hill, Virginia, USA. Performance was compared among modeling methods, taxa, life histories, and sample sizes. Model performance was poor for all methods and species, except for the carpenter frog (Rana virgatipes). Discriminant function analysis and ecological niche factor analysis (ENFA) predicted presence better than logistic regression and Bayesian logistic regression models. Database collections of observations have limited value as input for modeling because of the lack of absence data. Without knowledge of detectability, it is unknown whether non-detection represents absence.
To estimate detectability, I experimented with red-backed salamanders (Plethodon cinereus) using daytime, cover-object searches and nighttime, visual surveys. Salamanders were maintained in enclosures (n = 124) assigned to four treatments: daytime/low density, daytime/high density, nighttime/low density, and nighttime/high density. Multiple observations of each enclosure were made. Detectability was higher using daytime, cover-object searches (64%) than nighttime, visual surveys (20%). Detection was also higher in high-density (49%) versus low-density enclosures (35%).
Because of variation in detectability, I tested model sensitivity to the probability of detection. A simulated distribution was created using functions relating habitat suitability to environmental variables from a landscape. Surveys were replicated by randomly selecting locations (n = 50, 100, 200, or 500) and determining whether the species was observed, based on the probability of detection (p = 40%, 60%, 80%, or 100%). Bayesian logistic regression and ENFA models were created for each sample. When detection was 80-100%, Bayesian predictions were more correlated with the known suitability and identified presence more accurately than ENFA.
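The survey replication described above can be sketched as follows. This is an illustrative sketch only: the uniform random "suitability" landscape, the function name `simulate_survey`, and the single sample size are assumptions, and the thesis's actual suitability functions are not reproduced here.

```python
import random


def simulate_survey(suitability, n_sites, detection_prob, rng):
    """Sample n_sites locations; return (site, truly_present, observed)."""
    records = []
    for site in rng.sample(range(len(suitability)), n_sites):
        present = rng.random() < suitability[site]
        # A species can only be observed where it is actually present
        observed = present and rng.random() < detection_prob
        records.append((site, present, observed))
    return records


rng = random.Random(42)
# Hypothetical landscape: probability of presence at each of 5000 cells
suitability = [rng.random() for _ in range(5000)]

results = {}
for p in (0.4, 0.6, 0.8, 1.0):
    records = simulate_survey(suitability, 500, p, rng)
    n_present = sum(1 for r in records if r[1])
    n_observed = sum(1 for r in records if r[2])
    results[p] = {
        "present": n_present,
        "observed": n_observed,
        # Sites recorded as absences although the species was there
        "false_absences": n_present - n_observed,
    }
```

The false absences are what a presence/absence model would treat as true absences, which is why such models are sensitive to low detection probabilities.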
Probability of detection was variable among sampling methods and effort. Models created from presence/absence data were sensitive to the probability of detection in the input data. This stresses the importance of quantifying detectability and using presence-only modeling methods when detectability is low. If planning for sampling as an input for suitability modeling, it is important to choose sampling methods to ensure that detection is 80% or higher. / Ph. D.
|
5 |
Bayesian Logistic Regression Model for Siting Biomass-using Facilities
Huang, Xia 01 December 2010
Key sources of oil for western markets are located in complex geopolitical environments that increase economic and social risk. The combination of economic, environmental, social and national security concerns for petroleum-based economies has created a renewed emphasis on alternative sources of energy, including biomass. The stability of sustainable biomass markets hinges on improved methods to predict and visualize business risk and cost to the supply chain.
This thesis develops Bayesian logistic regression models, and compares them with classical maximum likelihood models, to quantify significant factors that influence the siting of biomass-using facilities and to predict potential locations in the 13-state Southeastern United States for three types of biomass-using facilities. Group I combines all biomass-using mills, biorefineries using agricultural residues and wood-using bioenergy/biofuels plants. Group II includes pulp and paper mills, and biorefineries that use agricultural and wood residues. Group III includes food processing mills and biorefineries that use agricultural and wood residues. The resolution of this research is the 5-digit ZIP Code Tabulation Area (ZCTA), and there are 9,416 ZCTAs in the 13-state Southeastern study region.
For both the classical and Bayesian approaches, the data were split into a training set and a separate validation (hold-out) set using a pseudo-random number-generating function in SAS® Enterprise Miner. Four predefined priors are constructed. Bayesian estimation with a Gaussian prior gives the highest correct classification rate for Group I (86.40%); Bayesian estimation with the non-informative uniform prior gives the highest rate for Group II (95.97%); and Bayesian estimation with a Gaussian prior gives the highest rate for Group III (92.67%). Given the comparatively low sensitivity for Groups II and III, a hybrid model that integrates classification trees and local Bayesian logistic regression was developed as part of this research to further improve the predictive power. The hybrid model increases the sensitivity for Group II from 58.54% to 64.40%, and for Group III it significantly improves both specificity (from 98.69% to 99.42%) and sensitivity (from 39.35% to 46.45%). Twenty-five optimal locations for the biomass-using facility groupings at the 5-digit ZCTA resolution, based upon the best fitted Bayesian logistic regression model and the hybrid model, are predicted and plotted for the 13-state Southeastern study region.
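The three measures reported above follow the standard confusion-matrix definitions, which the sketch below computes. The counts are hypothetical and chosen only to show how a high correct classification rate can coexist with low sensitivity when sites with a facility are rare; they are not taken from the thesis.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard measures from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        # Share of actual positives (sites with a facility) found
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives correctly rejected
        "specificity": tn / (tn + fp),
        # Share of all sites classified correctly
        "correct_rate": (tp + tn) / total,
    }


# Hypothetical validation counts: 100 positive sites, 900 negative sites
metrics = classification_metrics(tp=40, fn=60, tn=880, fp=20)
```

With these counts the correct classification rate is 92% even though only 40% of the positive sites are detected, which is the kind of gap a sensitivity-focused hybrid model aims to narrow.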
|
6 |
Inkrementell responsanalys: Vilka kunder bör väljas vid riktad marknadsföring? / Incremental response analysis: Which customers should be selected in direct marketing?
Karlsson, Jonas, Karlsson, Roger January 2013
If customers respond differently to a campaign, it is worthwhile to find the customers who respond most positively and direct the campaign towards them. This can be done with so-called incremental response analysis, in which respondents from a campaign are compared with respondents from a control group. Customers with the highest increase in response from the campaign are selected, which may increase the company's return. Incremental response analysis is applied to historical data from the mobile operator Tre. The thesis investigates which method best explains the incremental response, that is, which method finds the Tre customers with the highest incremental response, and which characteristics are important. The analysis is based on various classification methods: logistic regression, Lasso regression, and decision trees. RMSE, the root mean square error of the deviation between observed and predicted incremental response, is used to measure the prediction error of the incremental response. The classification methods are evaluated with the Hosmer-Lemeshow test and AUC (area under the curve). Bayesian logistic regression is also used to examine the uncertainty in the parameter estimates. In terms of predicted incremental response, Lasso regression performs best compared with the decision tree, ordinary logistic regression, and Bayesian logistic regression. According to the Lasso regression, the variables that significantly affect the incremental response are age and how long the customer has had their subscription.
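The two quantities defined above, the incremental response and its RMSE, can be sketched as follows. The segment names, response vectors, and predicted values are hypothetical, and grouping customers into segments is an assumption made for illustration rather than the thesis's procedure.

```python
import math


def response_rate(responses):
    """Share of customers who responded (responses are 0/1)."""
    return sum(responses) / len(responses)


def incremental_response(treated, control):
    """Response rate in the campaign group minus that in the control group."""
    return response_rate(treated) - response_rate(control)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical segments: (campaign responses, control responses)
segments = {
    "young": ([1, 1, 0, 1, 0, 1, 0, 1, 1, 0], [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]),
    "older": ([0, 1, 0, 0, 1, 0, 0, 1, 0, 0], [0, 1, 0, 1, 0, 0, 1, 0, 0, 0]),
}
observed = {name: incremental_response(t, c) for name, (t, c) in segments.items()}

# Hypothetical model predictions of the incremental response per segment
predicted = {"young": 0.25, "older": 0.05}

names = sorted(segments)
error = rmse([observed[n] for n in names], [predicted[n] for n in names])
```

A campaign would then target the segments with the largest incremental response, here "young", rather than those with the largest raw response rate.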
|
7 |
Modélisation incrémentale par méthode bayésienne / Uplift modelling with a Bayesian method
Rosamont Prombo, Kevin 03 1900
Uplift modelling (incremental modelling) is a statistical method initially developed in marketing. It involves two groups, a control group and a treatment group, which are compared on a binary response variable (the response is either "yes" or "no"). The goal is to detect the effect of the treatment on the individuals under study. Because these individuals are not all customers, we call them "prospects". The effect can be negative, null, or positive, depending on the characteristics of the individuals in each group.
The purpose of this master's thesis is to compare the Bayesian and frequentist points of view on uplift modelling. The uplift models used are those of Lo (2002) and Lai (2004), both originally formulated from the frequentist point of view. The Bayesian approach is therefore implemented and compared with the frequentist one. Simulations are carried out on data generated from logistic regressions. The regression parameters are then estimated with Monte Carlo simulations in the Bayesian approach and compared with the estimates from the frequentist approach. Parameter estimation directly influences the model's ability to predict the treatment effect on individuals.
Three priors are considered for the Bayesian estimation of the parameters, chosen to be non-informative: the transformed beta, the Cauchy, and the normal distributions.
Over the course of the study, we observe that the Bayesian methods have a real positive impact on targeting individuals in small samples.
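For readers unfamiliar with uplift models, one common single-model formulation is a logistic regression with a treatment indicator and treatment-by-covariate interactions, which is how Lo's (2002) approach is usually described. The abstract does not spell out either model, so the notation below ($x$ for covariates, $t \in \{0, 1\}$ for treatment, coefficients $\alpha, \beta, \gamma, \delta$) is an assumption made for illustration.

```latex
\begin{align*}
\operatorname{logit} P(Y = 1 \mid x, t) &= \alpha + \beta^\top x + \gamma t + \delta^\top x \, t, \\
\operatorname{uplift}(x) &= P(Y = 1 \mid x, t = 1) - P(Y = 1 \mid x, t = 0).
\end{align*}
```

Under this formulation, a Bayesian analysis places priors on $(\alpha, \beta, \gamma, \delta)$ and propagates the posterior draws through the second line to obtain a posterior distribution for each prospect's uplift.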
|