1 |
Bayesian regression and discrimination with many variables
Chang, Kai-Ming (January 2002)
No description available.
|
2 |
Generalizing the multivariate normality assumption in the simulation of dependencies in transportation systems
Ng, Man Wo (22 November 2010)
By far the most popular method to account for dependencies in the transportation network analysis literature is the use of the multivariate normal (MVN) distribution. While in certain cases there is some theoretical underpinning for the MVN assumption, in others there is none. This can lead to misleading results: results depend not only on whether dependence is modeled, but also on how it is modeled. Assuming the MVN distribution limits one to a specific set of dependency structures, which can substantially limit the validity of results. In this report an existing, more flexible, correlation-based approach (where just the marginal distributions and their correlations are specified) is proposed, and it is demonstrated that, in simulation studies, such an approach is a generalization of the MVN assumption. The need for such a generalization is particularly critical in the transportation network modeling literature, where there is often no or insufficient data to estimate probability distributions, so that sensitivity analyses assuming different dependence structures could be extremely valuable. However, the proposed method has its own drawbacks. For example, it, too, cannot exhaust all possible dependence forms, and it relies on some lesser-known properties of the correlation coefficient.
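The idea of specifying only marginal distributions and a correlation can be illustrated with a short sketch. The abstract does not say which construction the report uses, so the code below is an assumption: it uses a normal-to-anything style transform (correlated standard normals mapped through the normal CDF and then through inverse marginal CDFs). The marginals (exponential and uniform), the input correlation `rho`, and the function names are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal, used for the CDF transform


def sample_dependent_pair(rho, rng):
    """Draw one (x1, x2) pair with chosen marginals and normal-based dependence.

    Assumed marginals (illustrative only): x1 ~ Exponential(rate=0.5),
    x2 ~ Uniform(10, 20). rho is the correlation of the underlying normals.
    """
    # Step 1: two standard normals with correlation rho.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Step 2: map each normal to a uniform on (0, 1) via the normal CDF.
    u1 = min(STD_NORMAL.cdf(z1), 1.0 - 1e-12)  # guard against log(0) below
    u2 = STD_NORMAL.cdf(z2)
    # Step 3: apply the inverse CDF of each target marginal.
    x1 = -math.log(1.0 - u1) / 0.5  # Exponential with rate 0.5 (mean 2)
    x2 = 10.0 + 10.0 * u2           # Uniform on (10, 20) (mean 15)
    return x1, x2


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)
pairs = [sample_dependent_pair(0.8, rng) for _ in range(20000)]
xs, ys = zip(*pairs)
r = pearson(xs, ys)
print(f"mean x1 = {fmean(xs):.3f}, mean x2 = {fmean(ys):.3f}, corr = {r:.3f}")
```

In a construction like this, the Pearson correlation of the transformed variables generally differs from the `rho` given to the underlying normals, so the input correlation usually has to be adjusted to hit a target value. That is one example of the kind of correlation-coefficient subtlety the abstract alludes to.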
|
3 |
Bayesian Logistic Regression Model with Integrated Multivariate Normal Approximation for Big Data
Fu, Shuting (28 April 2016)
The analysis of big data is of great interest today, and this comes with challenges of improving precision and efficiency in estimation and prediction. We study binary data with covariates from numerous small areas, where direct estimation is not reliable, and there is a need to borrow strength from the ensemble. This is generally done using Bayesian logistic regression, but because there are numerous small areas, the exact computation for the logistic regression model becomes challenging. Therefore, we develop an integrated multivariate normal approximation (IMNA) method for binary data with covariates within the Bayesian paradigm, and this procedure is assisted by the empirical logistic transform. Our main goal is to provide the theory of IMNA and to show that it is many times faster than the exact logistic regression method with almost the same accuracy. We apply the IMNA method to the health status binary data (excellent health or otherwise) from the Nepal Living Standards Survey with more than 60,000 households (small areas). We estimate the proportion of Nepalese in excellent health condition for each household. For these data IMNA gives estimates of the household proportions as precise as those from the logistic regression model and it is more than fifty times faster (20 seconds versus 1,066 seconds), and clearly this gain is transferable to bigger data problems.
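The empirical logistic transform mentioned above can be sketched in a few lines. This is only an illustration of the standard transform for a binomial count, not the IMNA procedure itself, whose details the abstract does not give. The function name `empirical_logit` is hypothetical, and the variance expression is a commonly cited approximation rather than something taken from the thesis.

```python
import math


def empirical_logit(y, n):
    """Empirical logistic transform of y successes out of n trials.

    Returns (z, v): the transformed value and an approximate variance.
    Adding 0.5 to both cells keeps z finite when y == 0 or y == n.
    """
    if not 0 <= y <= n or n <= 0:
        raise ValueError("need 0 <= y <= n and n > 0")
    z = math.log((y + 0.5) / (n - y + 0.5))
    # Commonly cited variance approximation for the empirical logit.
    v = (n + 1) * (n + 2) / (n * (y + 1) * (n - y + 1))
    return z, v


# Example: a small area with 5 of 10 respondents in excellent health,
# and an extreme area with 0 of 10.
z_mid, v_mid = empirical_logit(5, 10)
z_zero, v_zero = empirical_logit(0, 10)
print(z_mid, v_mid, z_zero, v_zero)
```

The appeal of such a transform is that the transformed values are approximately normal, so a normal model can stand in for the binomial logistic likelihood, which is presumably what makes a fast approximation feasible across many small areas.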
|
4 |
Modelo de calibração ultraestrutural / Ultrastructural calibration model
Talarico, Alina Marcondes (23 January 2014)
Proficiency Testing (PT) programs are used by society to assess the competence and reliability of laboratories in performing specific measurements. Several PT groups have been established by INMETRO, among them the engine testing group. Each group consists of several laboratories that measure the same artifact, and their measurements are compared using statistical methods. The engine group chose a 1.0 gasoline engine, kindly provided by GM Powertrain, as the artifact. The power of the artifact was measured at 10 rotation points by 6 laboratories. Motivated by this data set, we extend the comparative calibration model of Barnett (1969) to assess the compatibility of the laboratories under the Student-t distribution, and we present the results of applications and simulations based on this data set.
|
6 |
Explicit Estimators for a Banded Covariance Matrix in a Multivariate Normal Distribution
Karlsson, Emil (January 2014)
The problem of estimating the mean and covariances of a multivariate normally distributed random vector has been studied in many forms. This thesis focuses on the estimators proposed in [15] for a banded covariance structure with m-dependence. It presents the previous results for the estimator and rewrites the estimator for the case m = 1, making it easier to analyze. This leads to an adjustment, from which an unbiased estimator is proposed. A new and simpler proof of consistency is then presented. The theory is later generalized to a general linear model, where corresponding theorems and propositions establish unbiasedness and consistency. In the last chapter, simulations with the previous and the new estimator verify that the theoretical results do have a practical impact.
|
7 |
A novel approach to modeling and predicting crash frequency at rural intersections by crash type and injury severity level
Deng, Jun, active 2013 (24 March 2014)
Safety at intersections is of significant interest to transportation professionals due to the large number of possible conflicts that occur at those locations. In particular, rural intersections have been recognized as one of the most hazardous locations on roads. However, most models of crash frequency at rural intersections, and road segments in general, do not differentiate between crash type (such as angle, rear-end or sideswipe) and injury severity (such as fatal injury, non-fatal injury, possible injury or property damage only). Thus, there is a need to identify the differential impacts of intersection-specific and other variables on crash types and severity levels. This thesis builds upon the work of Bhat et al. (2013b) to formulate and apply a novel approach for the joint modeling of crash frequency and combinations of crash type and injury severity. The proposed framework explicitly links a count data model (for crash frequency) with a discrete choice model (for combinations of crash type and injury severity). It uses a multinomial probit kernel for the discrete choice model and introduces unobserved heterogeneity in both the crash frequency model and the discrete choice model, while also accommodating excess zeros. The results show that the type of traffic control and the number of entering roads are the most important determinants of crash counts and crash type/injury severity. They also underscore the value of the proposed model both for data fit and for accurately estimating variable effects.
|
8 |
Geometry of high dimensional Gaussian data
Mossberg, Olof Samuel (January 2024)
Collected data may simultaneously be of low sample size and high dimension. Such data exhibit certain geometric regularities: a single observation behaves like a rotation on a sphere, and a pair of observations is orthogonal. This thesis investigates these geometric properties in some detail. Background is provided and various approaches to the result are discussed. An approach based on the mean value theorem is eventually chosen, as it is the only candidate investigated that gives explicit convergence bounds. The bounds are tested using Monte Carlo simulation and found to be adequate.
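The two regularities described above can be checked numerically with a small Monte Carlo sketch. This is an illustrative experiment only, assuming independent standard normal coordinates; it does not reproduce the thesis's convergence bounds, and the function name `norm_and_angle` and the chosen dimensions are hypothetical.

```python
import math
import random


def norm_and_angle(d, rng):
    """Draw two independent standard normal vectors in dimension d.

    Returns (scaled_norm, angle_deg): the norm of the first vector divided
    by sqrt(d), and the angle between the two vectors in degrees.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_xy = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    cos_xy = max(-1.0, min(1.0, cos_xy))  # guard against rounding outside [-1, 1]
    return norm_x / math.sqrt(d), math.degrees(math.acos(cos_xy))


rng = random.Random(0)
results = {d: norm_and_angle(d, rng) for d in (10, 1000, 100000)}
for d, (scaled_norm, angle) in results.items():
    print(f"d = {d:>6}: norm/sqrt(d) = {scaled_norm:.4f}, angle = {angle:.2f} degrees")
```

As the dimension grows, the scaled norm should settle near 1 (observations concentrate near a sphere of radius sqrt(d)) and the angle should settle near 90 degrees (pairs of observations become nearly orthogonal).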
|
9 |
Introduction to Probability Theory
Chen, Yong-Yuan (25 May 2010)
In this paper, we first present the basic principles of set theory and combinatorial analysis, which are the most useful tools in computing probabilities. Then, we show some important properties derived from the axioms of probability. Conditional probabilities come into play not only when some partial information is available, but also as a tool to compute probabilities more easily, even when partial information is unavailable. Then, the concept of a random variable and some of its related properties are introduced. For univariate random variables, we introduce the basic properties of some common discrete and continuous distributions. The important properties of jointly distributed random variables are also considered. Some inequalities, the law of large numbers and the central limit theorem are discussed. Finally, we introduce an additional topic, the Poisson process.
|