131 |
Análise de correlação ecológica: uma abordagem inteiramente bayesiana para a mortalidade infantil no Rio Grande do Sul / Ecological Correlation Analysis: A Fully Bayesian Approach to Infant Mortality in Rio Grande do Sul
Kato, Sergio Kakuta January 2007
The infant mortality rate is one of the indicators most widely used to measure a population's quality of life. One of the socioeconomic indicators of Rio Grande do Sul is the Socioeconomic Development Index (Índice de Desenvolvimento Sócio-econômico, IDESE) of the Fundação de Economia e Estatística (FEE), which has the infant mortality rate as one of its components. Studies generally relate the infant mortality rate to risk factors associated with the areas under study in a descriptive way, that is, only visually through maps. This work presents an application of one of the methods of Spatial Epidemiology, Ecological Correlation Studies, through hierarchical models and fully Bayesian methods, using covariates. The main problems with crude mortality rates or SMRs (Standardised Mortality Ratios), such as spatial autocorrelation and the instability of estimators for small areas, are discussed. To overcome these difficulties, relative risk estimates obtained by spatial regression analysis with fully Bayesian modeling are presented as an alternative, since this approach not only incorporates a spatially structured component into the model but also allows covariates to be included. The article analyzes the risk of infant mortality in the 496 municipalities of Rio Grande do Sul using data accumulated from 2001 to 2004. Several models with different specifications of the spatial component and covariates drawn from IDESE-FEE/2003 were compared. Models that use the spatial structure in addition to covariates performed better when compared by the DIC (Deviance Information Criterion). Comparing the SMRs with the relative risks obtained by fully Bayesian modeling revealed a substantial gain in interpretation and in the detection of patterns of variation in infant mortality risk across the municipalities of Rio Grande do Sul.
/ The infant mortality rate is one of the indicators used to measure a population's quality of life. The State of Rio Grande do Sul has a social and economic indicator called Índice de Desenvolvimento Sócio-econômico (IDESE), maintained by the Economic and Statistics Foundation (FEE), which also uses the infant mortality rate. Most studies relate the infant mortality rate to risk factors only visually, aided by maps. This study presents the methodology and an application of one of the methods of Spatial Epidemiology, Ecological Correlation, using hierarchical Bayesian procedures. The main problems found in ecological correlation studies, such as spatial autocorrelation and the instability of estimators for small areas, are discussed. To overcome these difficulties, the relative risk estimate obtained by spatial regression analysis with a fully Bayesian estimation method is presented. The infant mortality rate is analysed in all 496 municipalities of the State of Rio Grande do Sul for the years 2001 to 2004. Several models with different specifications of spatial components and different variables from the IDESE-FEE/2003 were compared. The model with spatial structure and the Education variable performed better than the other models. With this methodology it was possible to obtain a more interpretable pattern of infant mortality risk in the State of Rio Grande do Sul.
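The instability of crude SMRs in small areas, which the abstract cites as a motivation for the Bayesian model, can be illustrated with a minimal sketch. The counts are invented for illustration and are not data from the thesis, and the function name `smr` is hypothetical.

```python
def smr(observed, expected):
    """Standardised mortality ratio: observed deaths divided by expected deaths."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


# Invented counts (not from the thesis): a small and a large municipality.
small_area = {"observed": 1, "expected": 0.5}
large_area = {"observed": 100, "expected": 100.0}

# One extra death moves the small-area SMR far more than the large-area SMR.
for name, area in (("small", small_area), ("large", large_area)):
    base = smr(area["observed"], area["expected"])
    bumped = smr(area["observed"] + 1, area["expected"])
    print(f"{name}: SMR {base:.2f} -> {bumped:.2f} after one extra death")
```

Because a single event can double the ratio when the expected count is small, smoothing estimates toward neighbouring areas, as the hierarchical model does, is one way to stabilise them.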
|
132 |
Modelo de gerenciamento de ativos na indústria sucroalcooleira / Asset Management Model for the Sugar and Ethanol Industry
Rebecca Negreiros Clemente, Thárcylla 31 January 2011
Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco / Brazil is a tropical country whose climate and location favor the cultivation of sugarcane. These characteristics help make the country the world's largest sugarcane producer. The crop ranks third in planted area in the country and generates thousands of direct and indirect jobs. Most of those responsible for producing sugarcane are also responsible for processing and transforming it. In this context, producers and market agents clearly need to plan production and decide how to invest in portfolios of by-products derived from the crop. Such a portfolio may include sugar (crystal, refined, and other types) and ethanol (anhydrous and hydrous), which serve as assets in the investment process. In investment settings two questions commonly arise: how much of the available capital to invest, and in what. The uncertainty inherent in this process motivates the development of decision models that help managers allocate their available resources among investment assets, aiming to maximize return while keeping the associated risk to a minimum. To address this situation, the present work presents a model that applies concepts from Decision Analysis and Bayesian Risk Analysis to support asset management in the financial market of the Brazilian sugar and ethanol industry.
|
133 |
Predictive models for chronic renal disease using decision trees, naïve bayes and case-based methods
Khan, Saqib Hussain January 2010
Data mining can be used in the healthcare industry to "mine" clinical data and discover hidden information for intelligent and effective decision making. Hidden patterns and relationships often go undiscovered, and advanced data mining techniques can help remedy this. This thesis deals mainly with Intelligent Prediction of Chronic Renal Disease (IPCRD). The data cover blood tests, urine tests, and external symptoms used to predict chronic renal disease. Data from the database are first imported into Weka (3.6), and the Chi-Square method is used for feature selection. After the data are normalized, three classifiers are applied and the efficiency of their output is evaluated: Decision Tree, Naïve Bayes, and the K-Nearest Neighbour (KNN) algorithm. Results show that each technique has its own strength in meeting the objectives of the defined mining goals. The efficiency of Decision Tree and KNN was almost the same, but Naïve Bayes showed a comparative edge over the others. Sensitivity and specificity are also used as statistical measures of the performance of a binary classification. Sensitivity (also called recall in some fields) measures the proportion of actual positives that are correctly identified, while specificity measures the proportion of actual negatives that are correctly identified. The CRISP-DM methodology is applied to build the mining models. It consists of six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
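The two measures defined above can be computed from a classifier's predictions as in the following minimal sketch. The labels are invented for illustration and do not come from the thesis data; the function name `sensitivity_specificity` is hypothetical, and 1 is assumed to mark a positive (diseased) case.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    return tp / (tp + fn), tn / (tn + fp)


# Invented labels (not from the thesis): 4 actual positives, 4 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```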
|
134 |
Modelamiento y Estudio de la Red de Interacciones Proteicas del Complejo NRC/MASC / Modeling and Study of the Protein Interaction Network of the NRC/MASC Complex
Campos Valenzuela, Jaime Alberto January 2010
The aim of this thesis is to investigate the synaptic system and to raise new hypotheses about the relationship between the organization of the postsynaptic density and the triggering of cognitive diseases such as schizophrenia, Alzheimer's disease, and mental retardation. The motivation is to begin developing new therapies that target the mechanism of these diseases and not only their consequences. In particular, this work explores new methodologies for inferring protein-protein interactions and applies those putative relationships to the study of the glutamate receptor structure NRC/MASC (NMDA receptor complex/MAGUK associated signalling complex), since over the last decade the fundamental role of the neurotransmitter glutamate in cognitive processes, and therefore the importance of its reception, has been established.
To meet these objectives, a new protocol combining two novel methodologies was proposed. First, a Naïve Bayes classifier was applied to infer human protein-protein interactions, yielding a broader interaction network with a confidence parameter for each of its elements. Second, this inferred network, together with other networks available in the literature, was used to carry out a systemic study of the NRC/MASC unit and of how it is affected in subjects with cognitive diseases. For this purpose a clustering algorithm was used to define the functional modules of the complex.
The first result was a human protein-protein interaction network comprising a number of proteins comparable to those reported previously. The information available in these networks was integrated into a single model. The nodes belonging to the NRC/MASC receptor complex were selected and grouped into 12 highly specialized modules by the clustering algorithm. Analysis of the characteristics of each module identified a new organization not reported in the literature: a large receptor module forms the input layer for the glutamate signal, followed by a modulation layer, and finally a layer of effector modules. In addition, a hybrid layer was designated, containing clusters with a dual function as both modulators and effectors. These results define a new functional model of the receptor, featuring a large number of signaling pathways and increased complexity in the relationships between modules. Furthermore, the clusters found to be highly correlated with cognitive diseases were the receptor module and the modulator cluster composed of 3 G proteins.
Finally, this thesis has proposed a functional model for the NRC/MASC receptor unit whose composition and organizational characteristics differ from those reported previously. These characteristics make the model a novel tool for studying the complex mechanisms behind diseases such as schizophrenia and mental retardation.
|
135 |
A comparison of hypothesis testing procedures for two population proportions
Hort, Molly January 1900
Master of Science / Department of Statistics / John E. Boyer Jr / It has been shown that the most straightforward approach to testing for the difference of two independent population proportions, called the Wald procedure, tends to declare differences too often. Because of this poor performance, various researchers have proposed simple adjustments to the Wald approach that tend to provide significance levels closer to the nominal. Additionally, several tests that take advantage of different methodologies have been proposed.
This paper extends the work of Tebbs and Roth (2008), who wrote an R program to compare confidence interval coverage for a variety of these procedures when used to estimate a contrast in two or more binomial parameters. Their program has been adapted to generate exact significance levels and power for the two-parameter hypothesis testing situation.
Several combinations of binomial parameters and sample sizes are considered. Recommendations for a choice of procedure are made for practical situations.
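As an illustration of what an "exact significance level" means here, the following Python sketch computes the unpooled Wald statistic and then sums the binomial probabilities of every outcome that the test rejects when both populations share the same proportion. It is not the adapted R program from the paper; the function names are hypothetical, and treating a zero standard error with unequal sample proportions as a rejection is an assumed convention.

```python
import math


def wald_z(x1, n1, x2, n2):
    """Unpooled Wald statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if se == 0.0:
        # Assumed convention: no evidence if the estimates agree, else reject.
        return 0.0 if p1 == p2 else math.copysign(math.inf, p1 - p2)
    return (p1 - p2) / se


def binom_pmf(x, n, p):
    """Probability of x successes in n Bernoulli(p) trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def exact_size(n1, n2, p, z_crit=1.959963984540054):
    """Exact two-sided rejection probability when both proportions equal p."""
    total = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if abs(wald_z(x1, n1, x2, n2)) > z_crit:
                total += binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
    return total


print(f"exact size at n1=n2=10, p=0.5: {exact_size(10, 10, 0.5):.4f}")
```

Comparing the printed value with the nominal 0.05 for different sample sizes and values of `p` shows how far a procedure's true level departs from its nominal level.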
|
136 |
Improving the Computational Efficiency in Bayesian Fitting of Cormack-Jolly-Seber Models with Individual, Continuous, Time-Varying Covariates
Burchett, Woodrow 01 January 2017
The extension of the CJS model to include individual, continuous, time-varying covariates relies on the estimation of covariate values on occasions on which individuals were not captured. Fitting this model in a Bayesian framework typically involves the implementation of a Markov chain Monte Carlo (MCMC) algorithm, such as a Gibbs sampler, to sample from the posterior distribution. For large data sets with many missing covariate values that must be estimated, this creates a computational issue, as each iteration of the MCMC algorithm requires sampling from the full conditional distributions of each missing covariate value. This dissertation examines two solutions to address this problem. First, I explore variational Bayesian algorithms, which derive inference from an approximation to the posterior distribution that can be fit quickly in many complex problems. Second, I consider an alternative approximation to the posterior distribution derived by truncating the individual capture histories in order to reduce the number of missing covariates that must be updated during the MCMC sampling algorithm. In both cases, the increased computational efficiency comes at the cost of producing approximate inferences. The variational Bayesian algorithms generally do not estimate the posterior variance very accurately and do not directly address the issues with estimating many missing covariate values. Meanwhile, the truncated CJS model provides a more significant improvement in computational efficiency while inflating the posterior variance as a result of discarding some of the data. Both approaches are evaluated via simulation studies and a large mark-recapture data set consisting of cliff swallow weights and capture histories.
|
137 |
The Impact of Red Light Cameras on Injury Crashes within Miami-Dade County, Florida
Llau, Anthoni 27 April 2015
Previous red light camera (RLC) studies have shown reductions in violations and in overall and right-angle collisions; however, RLCs may also increase rear-end crashes (Retting & Kyrychenko, 2002; Retting & Ferguson, 2003). Despite their apparent effectiveness, many RLC studies have produced imprecise findings because of inappropriate study designs and/or statistical techniques for controlling biases (Retting & Kyrychenko, 2002); a more comprehensive approach is therefore needed to assess accurately whether RLCs reduce motor vehicle injury collisions. The objective of this proposal is to assess whether RLCs improve safety at signalized intersections within Miami-Dade County, Florida. Twenty signalized intersections with RLCs that began enforcement on January 1st, 2011 were matched to two comparison sites located at least two miles from camera sites to minimize spillover effects. An Empirical Bayes analysis was used to account for regression to the mean. Incidences of all injury, red light running-related injury, right-angle/turning, and rear-end collisions were examined. An index of effectiveness and its 95% CI were calculated.
During the first year of camera enforcement, RLC sites experienced a marginal decrease in right-angle/turn collisions, a significant increase in rear-end collisions, and significant decreases in all-injury and red light running-related injury collisions. An increase in right-angle/turning and rear-end collisions at the RLC sites was observed after two years despite camera enforcement. A significant reduction in red light running-related injury crashes, however, was still observed after two years. A non-significant decline in all injury collisions was also noted.
The findings indicate that RLCs reduced red light running-related injury collisions at camera sites, but at the cost of a large increase in rear-end collisions. The evidence on whether RLCs affected right-angle/turning and all injury collisions was inconclusive. Statutory changes in crash reporting during the second year of camera enforcement affected the incidence of right-angle and rear-end collisions; nevertheless, a novelty effect could not be ruled out. A limitation of this study was the small number of injury crashes at each site. In conclusion, future research should consider events such as low frequencies of severe injury/fatal collisions and changes in crash reporting requirements when conducting RLC analyses.
|
138 |
Stochastic simulation of soil particle-size curves in heterogeneous aquifer systems through a Bayes space approach
Menafoglio, A., Guadagnini, A., Secchi, P. 08 1900
We address the problem of stochastic simulation of soil particle-size curves (PSCs) in heterogeneous aquifer systems. Unlike traditional approaches that focus solely on a few selected features of PSCs (e.g., selected quantiles), our approach considers the entire particle-size curves and can optionally include conditioning on available data. We rely on our prior work to model PSCs as cumulative distribution functions and interpret their density functions as functional compositions. We thus approximate the latter through an expansion over an appropriate basis of functions. This enables us (a) to deal effectively with the data dimensionality and constraints and (b) to develop a simulation method for PSCs based upon a suitable and well-defined projection procedure. The new theoretical framework allows representing and reproducing the complete information content embedded in PSC data. As a first field application, we demonstrate the quality of unconditional and conditional simulations obtained with our methodology by considering a set of particle-size curves collected within a shallow alluvial aquifer in the Neckar river valley, Germany.
|
139 |
Bayesian Estimation of Small Proportions Using Binomial Group Test
Luo, Shihua 09 November 2012
Group testing has long been considered a safe and sensible alternative to one-at-a-time testing in applications where the prevalence rate p is small. In this thesis, we applied a Bayesian approach to estimate p using a Beta-type prior distribution. First, we derived two Bayes estimators of p from a prior on p under two different loss functions. Second, we presented two more Bayes estimators of p from a prior on π under the same two loss functions. We also presented credible and HPD intervals for p. In addition, we carried out extensive numerical studies. All results showed that the Bayes estimator was preferred over the usual maximum likelihood estimator (MLE) for small p. We also presented the optimal β for different p, m, and k.
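A minimal sketch of one such estimator follows, under assumptions that go beyond the abstract: pools of equal size are tested without error, π is taken to be the probability that a pool tests positive, so that π = 1 − (1 − p)^(pool size), the prior on π is Beta(a, b), and the loss is squared error, which makes the Bayes estimator the posterior mean. The function and parameter names are hypothetical and are not meant to match the thesis's p, m, k, or β notation.

```python
import math


def mle_prevalence(positives, n_pools, pool_size):
    """Maximum likelihood estimate of p from the share of positive pools."""
    return 1.0 - (1.0 - positives / n_pools) ** (1.0 / pool_size)


def bayes_prevalence(positives, n_pools, pool_size, a=1.0, b=1.0):
    """Posterior mean of p under an assumed Beta(a, b) prior on pi.

    The posterior of pi is Beta(A, B) with A = a + positives and
    B = b + n_pools - positives, and for c = 1 / pool_size
    E[(1 - pi)^c] = Gamma(B + c) Gamma(A + B) / (Gamma(B) Gamma(A + B + c)).
    """
    big_a = a + positives
    big_b = b + n_pools - positives
    c = 1.0 / pool_size
    log_ratio = (
        math.lgamma(big_b + c)
        + math.lgamma(big_a + big_b)
        - math.lgamma(big_b)
        - math.lgamma(big_a + big_b + c)
    )
    return 1.0 - math.exp(log_ratio)


# Invented counts (not from the thesis): 20 pools of 5 samples, none positive.
print(mle_prevalence(0, 20, 5), bayes_prevalence(0, 20, 5))
```

When no pool tests positive the MLE is exactly zero, while the posterior mean stays strictly positive, which is one reason a Bayes estimator can be attractive when p is small.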
|
140 |
Techniky umělé inteligence pro filtraci nevyžádané pošty / Artificial Intelligence Approaches for Filtering of Spams
Matula, Tomáš January 2014
This thesis focuses on e-mail classification and describes the basic approaches to spam filtering. Bayesian spam classifiers and artificial immune systems are analyzed and applied. Existing applications and evaluation metrics are also described. The aim of the thesis is to design and implement an algorithm for spam filtering. Finally, the results are compared with selected known methods.
|