141

Bayesian kernel density estimation

Rademeyer, Estian January 2017 (has links)
This dissertation investigates the performance of two-class classification on credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and naive Bayes (NB) classifiers, as well as the non-parametric Parzen classifiers, are extended, using Bayes' rule, to include either a class imbalance or a Bernoulli prior. This is done with the aim of addressing the low default probability problem. Furthermore, the performance of Parzen classification with Silverman and Minimum Leave-one-out Entropy (MLE) Gaussian kernel bandwidth estimation is also investigated. It is shown that the non-parametric Parzen classifiers yield superior classification power. However, it is desirable for these non-parametric classifiers to possess predictive power such as that exhibited by the odds ratio in logistic regression (LR). The dissertation therefore dedicates a section to, amongst other things, studying the paper entitled "Model-Free Objective Bayesian Prediction" (Bernardo 1999). Since this approach to Bayesian kernel density estimation is only developed for the univariate and the uncorrelated multivariate case, the section develops a theoretical multivariate approach to Bayesian kernel density estimation. This approach is theoretically capable of handling both correlated and uncorrelated features in data. This is done through the assumption of a multivariate Gaussian kernel function and the use of an inverse Wishart prior. / Dissertation (MSc)--University of Pretoria, 2017. / The financial assistance of the National Research Foundation (NRF) towards this research is hereby acknowledged. Opinions expressed and conclusions arrived at are those of the authors and are not necessarily to be attributed to the NRF. / Statistics / MSc / Unrestricted
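As a concrete illustration of the Parzen classification studied above, the following is a minimal Python sketch of a two-class Gaussian-kernel Parzen classifier using Silverman's rule-of-thumb bandwidth, with a class prior combined through Bayes' rule. The data, the 5% default prior, and all names are hypothetical; the dissertation's Bernoulli and class-imbalance priors and the MLE bandwidth estimator are not reproduced here.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth for a 1-D Gaussian kernel."""
    return 1.06 * np.std(x, ddof=1) * len(x) ** (-1 / 5)

def parzen_log_density(x_train, x_new, h):
    """Log of the Parzen-window density estimate with a Gaussian kernel."""
    z = (x_new[:, None] - x_train[None, :]) / h
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return np.log(kernel.mean(axis=1) / h)

def parzen_classify(x_new, x_default, x_nondefault, prior_default):
    """Two-class Parzen classifier: class densities combined via Bayes' rule."""
    log_post_d = (parzen_log_density(x_default, x_new, silverman_bandwidth(x_default))
                  + np.log(prior_default))
    log_post_n = (parzen_log_density(x_nondefault, x_new, silverman_bandwidth(x_nondefault))
                  + np.log(1 - prior_default))
    return (log_post_d > log_post_n).astype(int)  # 1 = predicted default

rng = np.random.default_rng(0)
defaults = rng.normal(600, 40, size=50)      # hypothetical credit scores, rare class
nondefaults = rng.normal(700, 50, size=950)
print(parzen_classify(np.array([580.0, 720.0]), defaults, nondefaults, prior_default=0.05))
```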
142

Essays on crime and education

Bruhn, Jesse 10 February 2020 (has links)
This dissertation consists of three chapters exploring education and crime in the modern economy. The first two chapters focus on inter-district school choice and teacher labor markets in Massachusetts. The third chapter examines the demolition of public housing in Chicago and its interaction with the geospatial distribution of gang territory. In the first chapter, I study the sorting of students to school districts using new lottery data from an inter-district school choice program. I find that moving to a more preferred school district generates benefits to student test scores, coursework quality, high-school graduation, and college attendance. Motivated by these findings, I develop a rich model of treatment effect heterogeneity and estimate it using a new empirical-Bayes-type procedure that leverages non-experimental data to increase precision in quasi-experimental designs. I use the heterogeneous effects to show that nearly all the test score gains from the choice program emerge from Roy selection. In the second chapter (joint with Scott Imberman and Marcus Winters), we describe the relationship between school quality, teacher value-added, and teacher attrition across the public and charter sectors. We begin by documenting important differences in the sources of variation that explain attrition across sectors. Next we demonstrate that while charters are in fact more likely to remove their worst teachers, they are also more likely to lose their best. We conclude by exploring the type and quality of destination schools among teachers who move. In the third chapter, I study the effect of demolishing 22,000 units of public housing on crime in Chicago. Point estimates that incorporate both the direct and spillover effects indicate that, in the short run, the average demolition increased city-wide crime by 0.5% per month relative to baseline, with no evidence of offsetting long-run reductions. I also provide evidence that spillovers are mediated by demolition-induced migration across gang territorial boundaries. I reconcile my findings with contradictory results from the existing literature by proposing and applying a test for control group contamination. I find that existing results are likely biased by previously unaccounted-for spillovers.
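To make the empirical-Bayes idea in the first chapter concrete, here is a small Python sketch of normal-normal empirical Bayes shrinkage, pulling noisy unit-level treatment-effect estimates toward a precision-weighted grand mean. It is a generic sketch using a method-of-moments hyperparameter estimate, not the chapter's actual procedure, and all numbers are simulated.

```python
import numpy as np

def eb_shrink(estimates, std_errors):
    """Normal-normal empirical Bayes: shrink noisy unit-level estimates
    toward the precision-weighted grand mean."""
    w = 1 / std_errors ** 2
    mu = np.average(estimates, weights=w)  # grand mean
    # Method-of-moments estimate of the variance of the true effects:
    tau2 = max(np.average((estimates - mu) ** 2, weights=w)
               - np.mean(std_errors ** 2), 0.0)
    shrink = tau2 / (tau2 + std_errors ** 2)  # reliability weights
    return mu + shrink * (estimates - mu)

rng = np.random.default_rng(1)
true_effects = rng.normal(0.10, 0.05, size=20)  # hypothetical district effects
ses = rng.uniform(0.03, 0.12, size=20)          # noise varies with lottery size
noisy = true_effects + rng.normal(0, ses)
print(np.round(eb_shrink(noisy, ses)[:5], 3))
```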
143

A Comprehensive Safety Analysis of Diverging Diamond Interchanges

Lloyd, Holly 01 May 2016 (has links)
As the population grows and travel demands increase, alternative interchange designs are becoming increasingly popular. The diverging diamond interchange is one alternative design that has been implemented in the United States. This design can accommodate higher and unbalanced flows as well as improve safety at the interchange. As the diverging diamond interchange is increasingly considered as a possible solution for problematic interchange locations, it is imperative to investigate the safety effects of this configuration. This report describes the selection of a comparison group of urban diamond interchanges, crash data collection, the calibration of the functions used to predict crash rates in the before and after periods, and the Empirical Bayes before-and-after analysis technique used to determine the safety effectiveness of the diverging diamond interchanges in Utah. A discussion of pedestrian and cyclist safety is also included. The analysis results demonstrated statistically significant decreases in crashes at most of the locations studied. This analysis can be used by UDOT and other transportation agencies as they consider implementing diverging diamond interchanges in the future.
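The Empirical Bayes before-and-after technique mentioned above is commonly implemented along the following lines (a Hauer-style sketch in Python). The overdispersion parameter, predicted values, and crash counts below are hypothetical placeholders, not the calibrated Utah values from the report.

```python
def eb_before_after(obs_before, pred_before, pred_after, obs_after, k):
    """Empirical Bayes before-after evaluation (Hauer-style sketch).
    k is the overdispersion parameter of the negative binomial SPF."""
    w = 1.0 / (1.0 + k * pred_before)  # weight on the SPF prediction
    exp_before = w * pred_before + (1 - w) * obs_before
    # Scale expected crashes to the after period (traffic/duration changes):
    ratio = pred_after / pred_before
    exp_after = exp_before * ratio
    var_exp_after = exp_after * ratio * (1 - w)
    # Bias-corrected safety-effectiveness index:
    theta = (obs_after / exp_after) / (1 + var_exp_after / exp_after ** 2)
    return theta, (theta - 1) * 100  # index and % change in crashes

# Hypothetical counts for one converted interchange:
theta, pct = eb_before_after(obs_before=42, pred_before=35.0,
                             pred_after=38.0, obs_after=21, k=0.24)
print(f"safety index = {theta:.2f} ({pct:+.1f}% crashes)")
```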
144

An Exposition on Bayesian Inference

Laffoon, John 01 May 1967 (has links)
The Bayesian approach to probability and statistics is described, a brief history of Bayesianism is related, differences between the Bayesian and Frequentist schools of statistics are defined, potential applications are investigated, and a literature survey is presented in the form of a machine-sort card file. Bayesian thought is increasing in favor among statisticians because of its ability to attack problems that are unassailable from the Frequentist approach. It should become more popular among practitioners because of the flexibility it allows experimenters and the ease with which prior knowledge can be combined with experimental data. (82 pages)
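The "ease with which prior knowledge can be combined with experimental data" is well illustrated by the conjugate Beta-Binomial update, sketched below in Python with hypothetical numbers.

```python
from scipy import stats

# Prior knowledge: a success rate believed to be near 10% -> Beta(2, 18).
prior_a, prior_b = 2, 18
# Experimental data: 3 successes in 40 trials.
successes, trials = 3, 40
# Conjugate update: simply add counts to the prior parameters.
post_a, post_b = prior_a + successes, prior_b + trials - successes

posterior = stats.beta(post_a, post_b)
print(f"posterior mean = {posterior.mean():.3f}")
print(f"95% credible interval = {posterior.interval(0.95)}")
```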
145

Variational Bayesian Image Restoration with Transformation Parameter Estimation / 変換パラメータ推定による変分ベイズ画像復元

Sonogashira, Motoharu 26 March 2018 (has links)
Kyoto University / 0048 / New-system doctoral course / Doctor of Informatics / 甲第21208号 / 情博第661号 / 新制||情||114 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief Examiner) Professor Michihiko Minoh, Professor Tatsuya Kawahara, Professor Yuichi Nakamura / Eligible under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
146

Modeling the Performance of a Baseball Player's Offensive Production

Smith, Michael Ross 09 March 2006 (has links) (PDF)
This project addresses the problem of comparing the offensive abilities of players from different eras in Major League Baseball (MLB). We study players from the perspective of an overall offensive summary statistic that is highly linked with scoring runs, known as the Berry Value. We build an additive model to estimate the innate ability of the player, the effect of the relative level of competition in each season, and the effect of age on performance using piecewise age curves. Using Hierarchical Bayes methodology with Gibbs sampling, we model each of these effects for each individual. The results of the Hierarchical Bayes model permit us to link players from different eras and to rank players across the modern era of baseball (1900-2004) on the basis of their innate overall offensive ability. The top of the rankings, of which the top three were Babe Ruth, Lou Gehrig, and Stan Musial, includes many Hall of Famers and some of the most productive offensive players in the history of the game. We also determine that trends in overall offensive ability in Major League Baseball exist, driven by rule and cultural changes. Based on the model, MLB is currently at a high level of run production compared to the levels of run production over the last century.
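As a sketch of the Hierarchical-Bayes-with-Gibbs-sampling machinery described above, the following Python code fits a simple normal hierarchical model (player ability draws shrunk toward a league mean) with conjugate Gibbs updates. It is a deliberately reduced stand-in: the actual model's piecewise age curves and season effects are omitted, and all data are simulated.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: per-season offensive values for I players (balanced panel).
I, n = 8, 30
true_ability = rng.normal(0.0, 1.0, I)
y = true_ability[:, None] + rng.normal(0.0, 0.5, (I, n))

sigma2 = 0.25        # observation variance, fixed for simplicity
mu, tau2 = 0.0, 1.0  # initial hyperparameter values
a0, b0 = 2.0, 1.0    # inverse-gamma prior on tau2
draws = []

for it in range(2000):
    # 1. Player abilities: conjugate normal update given mu and tau2.
    prec = n / sigma2 + 1 / tau2
    mean = (n * y.mean(axis=1) / sigma2 + mu / tau2) / prec
    theta = rng.normal(mean, np.sqrt(1 / prec))
    # 2. Population mean: flat prior -> normal update.
    mu = rng.normal(theta.mean(), np.sqrt(tau2 / I))
    # 3. Between-player variance: conjugate inverse-gamma update.
    tau2 = 1 / rng.gamma(a0 + I / 2, 1 / (b0 + 0.5 * np.sum((theta - mu) ** 2)))
    if it >= 500:  # discard burn-in
        draws.append(theta)

post_mean = np.mean(draws, axis=0)
print("posterior ability ranking:", np.argsort(-post_mean))
```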
147

Predicting Maximal Oxygen Consumption (VO2max) Levels in Adolescents

Shepherd, Brent A. 09 March 2012 (has links) (PDF)
Maximal oxygen consumption (VO2max) is considered by many to be the best overall measure of an individual's cardiovascular health. Collecting the measurement, however, requires subjecting an individual to prolonged periods of intense exercise until their maximal level is reached: the point at which the body uses no additional oxygen from the air despite increased exercise intensity. Collecting accurate VO2max data also requires expensive equipment and causes considerable subject discomfort. Because of these inherent difficulties, the measurement is often avoided despite its usefulness. In this research, we propose a set of Bayesian hierarchical models to predict VO2max levels in adolescents, ages 12 through 17, using less extreme measurements. Two models are developed separately: one that uses submaximal exercise data and one that uses physical fitness questionnaire data. The best submaximal model was found to include age, gender, BMI, heart rate, rate of perceived exertion, treadmill miles per hour, and an interaction between age and heart rate. The second model, designed for those with physical limitations, uses age, gender, BMI, and two separate questionnaire results measuring physical activity levels and functional ability levels, as well as an interaction between the physical activity level score and gender. Both models use separate model variances for males and females.
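A reduced sketch of the kind of regression underlying the submaximal model is given below: a conjugate Bayesian linear regression in Python with the age-by-heart-rate interaction term. It is a simplified fixed-effects analogue with hypothetical coefficients and simulated data, omitting the thesis's hierarchical structure, the RPE term, and the gender-specific variances.

```python
import numpy as np

def bayes_linreg_posterior(X, y, noise_var=1.0, prior_var=100.0):
    """Posterior mean/covariance for y = X beta + e under a N(0, prior_var*I)
    prior on beta and known noise variance (conjugate normal model)."""
    p = X.shape[1]
    prec = X.T @ X / noise_var + np.eye(p) / prior_var
    cov = np.linalg.inv(prec)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

rng = np.random.default_rng(7)
n = 200
age = rng.uniform(12, 17, n)
hr = rng.normal(170, 12, n)  # submaximal heart rate
bmi = rng.normal(21, 3, n)
male = rng.integers(0, 2, n).astype(float)
mph = rng.uniform(4, 8, n)
# Design matrix including the age x heart-rate interaction:
X = np.column_stack([np.ones(n), age, male, bmi, hr, mph, age * hr])
beta_true = np.array([80, -1.0, 4.5, -0.8, -0.15, 2.0, 0.005])  # hypothetical
y = X @ beta_true + rng.normal(0, 3, n)

mean, cov = bayes_linreg_posterior(X, y, noise_var=9.0)
print("posterior mean of age x HR coefficient:", round(mean[-1], 4))
```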
148

COMBATING DISINFORMATION : Detecting fake news with linguistic models and classification algorithms / BEKÄMPNING AV DISINFORMATION : Upptäcka falska nyheter med språkliga modeller och klassificeringsalgoritmer

Svärd, Mikael, Rumman, Philip January 2017 (has links)
The purpose of this study is to examine the possibility of accurately distinguishing fabricated news from authentic news stories using Naive Bayes classification algorithms, through a comparative study of two different machine learning classification algorithms. The work also contains an overview of how linguistic text analytics can be utilized for detection purposes, and an attempt was made to extract interesting information using word frequencies. It also discusses how different actors and parties in business and government are affected by, and how they handle, deception caused by fake news articles. The study further tries to ascertain what collective steps could be taken towards introducing a functioning solution to combat fake news. The results were inconclusive, and the simple Naive Bayes algorithms used did not yield fully satisfactory results. Word frequencies alone did not provide enough information for detection. They were, however, found to be potentially useful as part of a larger set of algorithms and strategies for handling misinformation.
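A minimal version of the word-frequency Naive Bayes setup examined in the study might look like the following Python sketch (scikit-learn, toy corpus); the real experiments would use labelled news datasets rather than these invented examples.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus; a real study would use a labelled fake-news dataset.
texts = [
    "scientists confirm vaccine passed phase three trials",
    "government report details quarterly employment figures",
    "shocking miracle cure doctors do not want you to know",
    "celebrity secretly replaced by clone claims insider",
]
labels = [0, 0, 1, 1]  # 0 = authentic, 1 = fabricated

# Word frequencies feed a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["miracle cure confirmed by insider"]))
```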
149

Product categorisation using machine learning / Produktkategorisering med hjälp av maskininlärning

Vasic, Stefan, Lindgren, Nicklas January 2017 (has links)
Machine learning is a method in data science for analysing large data sets and extracting hidden patterns and common characteristics in the data. Corporations often have access to databases containing great amounts of data that could contain valuable information. Navetti AB wants to investigate the possibility of automating its product categorisation by evaluating different types of machine learning algorithms, which could increase both time and cost efficiency. This work resulted in three different prototypes, each using a different machine learning algorithm, with the ability to categorise products automatically. The prototypes were tested and evaluated based on their ability to categorise products and their performance in terms of speed. Different techniques used for preprocessing data are also evaluated and tested. An analysis of the tests shows that, when providing a suitable algorithm with enough data, it is possible to automate the manual categorisation.
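A sketch of the kind of prototype compared in this work: a scikit-learn pipeline that evaluates two text-classification algorithms on toy product descriptions. The data, category labels, and choice of algorithms here are illustrative assumptions, not Navetti AB's actual data or the thesis's exact prototypes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy product descriptions; real data would come from the product database.
descriptions = [
    "stainless steel hex bolt m8", "galvanised wood screw 40mm",
    "ball bearing 608zz sealed", "roller bearing taper 30mm",
    "hydraulic hose coupling 1/2 inch", "brass pipe fitting elbow",
] * 5  # repeated so each class has enough samples for cross-validation
categories = ["fasteners", "fasteners", "bearings", "bearings",
              "fittings", "fittings"] * 5

# Compare two candidate algorithms on the same preprocessing pipeline.
for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
    pipeline = make_pipeline(TfidfVectorizer(), clf)
    scores = cross_val_score(pipeline, descriptions, categories, cv=5)
    print(type(clf).__name__, scores.mean().round(2))
```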
150

Nonparametric And Empirical Bayes Estimation Methods

Benhaddou, Rida 01 January 2013 (has links)
In the present dissertation, we investigate two different nonparametric models: the empirical Bayes model and the functional deconvolution model. In the case of nonparametric empirical Bayes estimation, we carried out a complete minimax study. In particular, we derive minimax lower bounds for the risk of the nonparametric empirical Bayes estimator for a general conditional distribution. This result has never been obtained previously. In order to attain optimal convergence rates, we use a wavelet series based empirical Bayes estimator constructed in Pensky and Alotaibi (2005). We propose an adaptive version of this estimator using Lepski's method and show that the estimator attains optimal convergence rates. The theory is supplemented by numerous examples. Our study of the functional deconvolution model expands results of Pensky and Sapatinas (2009, 2010, 2011) to the case of estimating an (r + 1)-dimensional function or of dependent errors. In both cases, we derive minimax lower bounds for the integrated square risk over a wide set of Besov balls and construct adaptive wavelet estimators that attain those optimal convergence rates. In particular, in the case of estimating a periodic (r + 1)-dimensional function, we show that by choosing Besov balls of mixed smoothness, we can avoid the "curse of dimensionality" and, hence, obtain higher than usual convergence rates when r is large. The study of deconvolution of a multivariate function is motivated by seismic inversion, which can be reduced to the solution of noisy two-dimensional convolution equations that allow one to draw inference on underground layer structures along the chosen profiles. The common practice in seismology is to recover layer structures separately for each profile and then to combine the derived estimates into a two-dimensional function. By studying the two-dimensional version of the model, we demonstrate that this strategy usually leads to estimators which are less accurate than the ones obtained as two-dimensional functional deconvolutions. Finally, we consider a multichannel deconvolution model with long-range dependent Gaussian errors. We do not limit our consideration to a specific type of long-range dependence; rather, we assume that the eigenvalues of the covariance matrix of the errors are bounded above and below. We show that convergence rates of the estimators depend on a balance between the smoothness parameters of the response function, the smoothness of the blurring function, the long memory parameters of the errors, and how the total number of observations is distributed among the channels.
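As a rough illustration of deconvolution combined with wavelet shrinkage (not the dissertation's adaptive minimax estimators), the following Python sketch applies a regularised inverse filter followed by soft wavelet thresholding to a simulated blurred 1-D signal; all parameters and data are hypothetical.

```python
import numpy as np
import pywt

def deconvolve_wavelet(y, blur_kernel, reg=1e-2, wavelet="db4", level=4):
    """Regularised Fourier deconvolution followed by wavelet shrinkage."""
    H = np.fft.fft(blur_kernel, n=len(y))
    # Tikhonov-regularised inverse filter avoids dividing by near-zero values:
    raw = np.real(np.fft.ifft(np.fft.fft(y) * np.conj(H) / (np.abs(H) ** 2 + reg)))
    coeffs = pywt.wavedec(raw, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # noise scale from finest level
    thresh = sigma * np.sqrt(2 * np.log(len(y)))    # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)

# Hypothetical 1-D "seismic trace": a boxcar signal, blurred and noised.
rng = np.random.default_rng(3)
n = 512
f = np.zeros(n); f[200:260] = 1.0
g = np.exp(-0.5 * ((np.arange(n) - 10) / 3.0) ** 2); g /= g.sum()
y = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g))) + rng.normal(0, 0.02, n)
f_hat = deconvolve_wavelet(y, g)
print("reconstruction RMSE:", round(float(np.sqrt(np.mean((f_hat[:n] - f) ** 2))), 4))
```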
