31 |
A Study of Designs in Clinical Trials and Schedules in Operating Rooms
Hung, Wan-Ping, 20 January 2011
The design of clinical trials is an important problem in medical statistics. Its main purpose is to determine the methodology and the sample size a study needs in order to examine the safety and efficacy of drugs, and it is part of the Food and Drug Administration approval process. In this thesis, we first study the comparison of drug efficacy in clinical trials. We focus on the two-sample comparison of proportions and investigate testing strategies based on a two-stage design. The properties and advantages of the proposed testing procedures are demonstrated by numerical results, with the classical method compared under the same sample size. A real example discussed in Cardenal et al. (1999) shows how the methods may be used in practice, and figures illustrate how the power functions of these methods change. In addition, the proposed procedure is compared with the Pocock (1997) and O'Brien and Fleming (1979) tests based on the standardized statistics.
In the second part of this work, we consider the operating room scheduling problem, which is also important in medical studies. The national health insurance system has been in operation in Taiwan for more than ten years. The Bureau of National Health Insurance continues to improve the system and tries to establish a reasonable fee ratio for people in different income ranges. In response to these adjustments, hospitals must pay more attention to controlling running costs. Surgery center operations are one of a hospital's major sources of revenue, so effective operating room management is necessary to maintain financial balance.
For this topic, the study focuses on fitting models to operating times and on operating room scheduling. Log-normal and mixture log-normal distributions are found to be statistically acceptable descriptions of these operating times. The procedure is illustrated through an analysis of thirteen operations performed in the gynecology department of a major teaching hospital in southern Taiwan. The fitted distributions are selected using certain information criteria and a bootstrapped log-likelihood ratio test. The best-fitting distributions are then used to evaluate the performance of some combinations of operations on a daily schedule that occurred in the real data. Moreover, we classify the operations into three categories and divide each operation into three stages. Based on this classification, a strategy for efficient scheduling is proposed, and the benefits of rescheduling under the proposed strategy are compared with the original schedule observed.
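The choice between a single log-normal and a mixture log-normal fit by information criteria can be sketched as follows. This is a minimal illustration, not the thesis's procedure: the two-component mixture, the EM initialisation, and the function names (`lognormal_loglik`, `mixture_lognormal_loglik`, `information_criteria`) are assumptions made for the example, and the bootstrapped likelihood ratio test mentioned in the abstract is not shown.

```python
import math


def lognormal_loglik(times):
    """Maximised log-likelihood of a single log-normal (2 free parameters)."""
    z = [math.log(t) for t in times]
    n = len(z)
    mu = sum(z) / n
    var = sum((v - mu) ** 2 for v in z) / n  # MLE of the log-scale variance
    # Normal log-density of log(t), minus log(t) for the change of variable.
    return sum(-0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var) - v
               for v in z)


def _normal_pdf(v, mu, sd):
    return math.exp(-0.5 * ((v - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def mixture_lognormal_loglik(times, n_iter=200):
    """Log-likelihood of a two-component log-normal mixture fitted by EM
    (5 free parameters). Hypothetical helper for illustration only."""
    z = sorted(math.log(t) for t in times)
    n, half = len(z), len(z) // 2
    mean = sum(z) / n
    spread = math.sqrt(sum((v - mean) ** 2 for v in z) / n)
    # Crude start: lower and upper halves of the sorted log-times.
    w = 0.5
    mu = [sum(z[:half]) / half, sum(z[half:]) / (n - half)]
    sd = [spread, spread]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each observation.
        r = []
        for v in z:
            a = w * _normal_pdf(v, mu[0], sd[0])
            b = (1 - w) * _normal_pdf(v, mu[1], sd[1])
            r.append(a / (a + b))
        # M-step: weighted proportions, means and standard deviations.
        n0 = sum(r)
        n1 = n - n0
        w = n0 / n
        mu = [sum(ri * v for ri, v in zip(r, z)) / n0,
              sum((1 - ri) * v for ri, v in zip(r, z)) / n1]
        sd = [max(1e-3, math.sqrt(sum(ri * (v - mu[0]) ** 2 for ri, v in zip(r, z)) / n0)),
              max(1e-3, math.sqrt(sum((1 - ri) * (v - mu[1]) ** 2 for ri, v in zip(r, z)) / n1))]
    return sum(math.log(w * _normal_pdf(v, mu[0], sd[0])
                        + (1 - w) * _normal_pdf(v, mu[1], sd[1])) - v
               for v in z)


def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC); smaller values indicate the preferred model."""
    return 2 * n_params - 2 * loglik, n_params * math.log(n_obs) - 2 * loglik
```

Calling `information_criteria(lognormal_loglik(times), 2, len(times))` and `information_criteria(mixture_lognormal_loglik(times), 5, len(times))` on the same list of operating times gives comparable AIC and BIC values for the two candidate distributions.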
|
32 |
Capturing patterns of spatial and temporal autocorrelation in ordered response data : a case study of land use and air quality changes in Austin, Texas
Wang, Xiaokun, 1979-, 05 May 2015
Many databases involve ordered discrete responses in a temporal and spatial context, including, for example, land development intensity levels, vehicle ownership, and pavement conditions. An appreciation of such behaviors requires rigorous statistical methods that recognize spatial effects and dynamic processes. This dissertation develops a dynamic spatial ordered probit (DSOP) model to capture patterns of spatial and temporal autocorrelation in ordered categorical response data. The model is estimated in a Bayesian framework using Gibbs sampling and data augmentation, in order to generate all autocorrelated latent variables. The specifications, methodologies, and applications undertaken here advance the field of spatial econometrics while enhancing our understanding of land use and air quality changes. The proposed DSOP model incorporates spatial effects in an ordered probit model by allowing for inter-regional spatial interactions and heteroskedasticity, along with random effects across regions (where "region" describes any cluster of observational units). The model assumes an autoregressive, AR(1), process across latent response values, thereby recognizing time-series dynamics in panel data sets. The model code and estimation approach are first tested on simulated data sets, in order to reproduce known parameter values and provide insights into estimation performance. Root mean squared errors (RMSE) are used to evaluate the accuracy of estimates, and the deviance information criterion (DIC) is used for model comparisons. The DSOP model is found to yield much more accurate estimates than standard, non-spatial techniques. As for model selection, even after the penalty for using more parameters, the DSOP model is clearly preferred to standard OP, dynamic OP and spatial OP models. The model and methods are then used to analyze both land use and air quality (ozone) dynamics in Austin, Texas.
In analyzing Austin's land use intensity patterns over a 4-point panel, the observational units are 300 m × 300 m grid cells derived from satellite images (at 30 m resolution). The sample contains 2,771 such grid cells, spread among 57 clusters (zip code regions), covering about 10% of the overall study area. In this analysis, temporal and spatial autocorrelation effects are found to be significantly positive. In addition, increases in travel times to the region's central business district (CBD) are estimated to substantially reduce land development intensity. The observational units for the ozone variation analysis are 4 km × 4 km grid cells, and all 132 observations falling in the study area are used. While variations in ozone concentration levels are found to exhibit strong patterns of temporal autocorrelation, they appear strikingly random in a spatial context (after controlling for local land cover, transportation, and temperature conditions). While transportation and land cover conditions appear to influence ozone levels, their effects are not as instantaneous, nor as practically significant, as the impact of temperature. The proposed and tested DSOP model is felt to be a significant contribution to the field of spatial econometrics, where binary applications (for discrete response data) have been seen as the cutting edge. The Bayesian framework and Gibbs sampling techniques used here permit such complexity, in a world of two-dimensional autocorrelation.
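The deviance information criterion used above for model comparison can be computed directly from posterior draws. The sketch below is only an illustration under loud assumptions: it replaces the DSOP model with a toy normal-mean model with known standard deviation, and the helper names (`deviance`, `conjugate_posterior_draws`, `dic`) are hypothetical. It shows the generic recipe: DIC is the posterior mean deviance plus the effective number of parameters p_D, where p_D is the mean deviance minus the deviance at the posterior mean.

```python
import math
import random


def deviance(theta, y, sigma):
    """D(theta) = -2 * log-likelihood for y_i ~ Normal(theta, sigma^2)."""
    sse = sum((v - theta) ** 2 for v in y)
    return len(y) * math.log(2 * math.pi * sigma ** 2) + sse / sigma ** 2


def conjugate_posterior_draws(y, sigma, prior_sd, n_draws, seed=0):
    """Exact posterior draws of theta under a Normal(0, prior_sd^2) prior.
    Stand-in for the output of a Gibbs sampler (assumption for this toy model)."""
    rng = random.Random(seed)
    post_var = 1.0 / (len(y) / sigma ** 2 + 1.0 / prior_sd ** 2)
    post_mean = post_var * sum(y) / sigma ** 2
    return [rng.gauss(post_mean, math.sqrt(post_var)) for _ in range(n_draws)]


def dic(draws, y, sigma):
    """Return (DIC, p_D) computed from posterior draws of theta."""
    mean_deviance = sum(deviance(t, y, sigma) for t in draws) / len(draws)
    deviance_at_mean = deviance(sum(draws) / len(draws), y, sigma)
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d, p_d
```

With a vague prior and a single free parameter, p_D should come out close to 1, which gives a quick sanity check before the same recipe is applied to draws from a richer model. Between competing models fitted to the same data, the one with the smaller DIC is preferred.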
|
33 |
Bayesian model estimation and comparison for longitudinal categorical data
Tran, Thu Trung, January 2008
In this thesis, we address issues of model estimation for longitudinal categorical data and of model selection for these data with missing covariates. Longitudinal survey data capture the responses of each subject repeatedly through time, allowing for the separation of variation in the measured variable of interest across time for one subject from the variation in that variable among all subjects. Questions concerning persistence, patterns of structure, interaction of events and stability of multivariate relationships can be answered through longitudinal data analysis. Longitudinal data require special statistical methods because they must take into account the correlation between observations recorded on one subject. A further complication in analysing longitudinal data is accounting for the non-response or drop-out process. Potentially, the missing values are correlated with variables under study and hence cannot be totally excluded. Firstly, we investigate a Bayesian hierarchical model for the analysis of categorical longitudinal data from the Longitudinal Survey of Immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response; subsequent terms are then introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.
Secondly, we examine the Bayesian model selection techniques of the Bayes factor and Deviance Information Criterion for our regression models with missing covariates. Computing Bayes factors involves computing the often complex marginal likelihood p(y|model), and various authors have presented methods to estimate this quantity. Here, we take the approach of path sampling via power posteriors (Friel and Pettitt, 2006). The appeal of this method is that for hierarchical regression models with missing covariates, a common occurrence in longitudinal data analysis, it is straightforward to calculate and interpret, since integration over all parameters, including the imputed missing covariates and the random effects, is carried out automatically with minimal added complexities of modelling or computation. We apply this technique to compare models for the employment status of immigrants to Australia. Finally, we also develop a model choice criterion based on the Deviance Information Criterion (DIC), similar to Celeux et al. (2006), but which is suitable for use with generalized linear models (GLMs) when covariates are missing at random. We define three different DICs: the marginal, where the missing data are averaged out of the likelihood; the complete, where the joint likelihood for response and covariates is considered; and the naive, where the likelihood is found assuming the missing values are parameters. These three versions have different computational complexities. We investigate through simulation the performance of these three different DICs for GLMs consisting of normally, binomially and multinomially distributed data with missing covariates having a normal distribution. We find that the marginal DIC and the estimate of the effective number of parameters, pD, have desirable properties, appropriately indicating the true model for the response under differing amounts of missingness of the covariates.
We find that the complete DIC is inappropriate generally in this context as it is extremely sensitive to the degree of missingness of the covariate model. Our new methodology is illustrated by analysing the results of a community survey.
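Path sampling via power posteriors rests on the identity log p(y) = ∫₀¹ E_{θ|y,t}[log p(y|θ)] dt, where the expectation at temperature t is taken under the power posterior p(θ|y,t) ∝ p(y|θ)^t p(θ). The sketch below illustrates this on a toy conjugate normal model, chosen so that each power posterior can be sampled exactly and the answer checked against the closed-form marginal likelihood. The model, the temperature grid t_i = (i/N)^5, the trapezoidal rule, and all function names are assumptions made for the illustration; a hierarchical model with missing covariates would need an MCMC run at each temperature.

```python
import math
import random


def log_lik(theta, n, sum_y, sum_y2, sigma):
    """Log-likelihood of y_i ~ Normal(theta, sigma^2), via sufficient statistics."""
    sse = sum_y2 - 2 * theta * sum_y + n * theta ** 2
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)


def log_marginal_path_sampling(y, sigma, m0, s0, n_temps=100, n_draws=2000, seed=0):
    """Estimate log p(y) under the prior theta ~ Normal(m0, s0^2)."""
    rng = random.Random(seed)
    n, sum_y, sum_y2 = len(y), sum(y), sum(v * v for v in y)
    temps = [(i / n_temps) ** 5 for i in range(n_temps + 1)]  # dense near t = 0
    expected = []
    for t in temps:
        # Power posterior p(y|theta)^t p(theta) is normal in this conjugate case.
        prec = 1 / s0 ** 2 + t * n / sigma ** 2
        mean = (m0 / s0 ** 2 + t * sum_y / sigma ** 2) / prec
        sd = math.sqrt(1 / prec)
        total = sum(log_lik(rng.gauss(mean, sd), n, sum_y, sum_y2, sigma)
                    for _ in range(n_draws))
        expected.append(total / n_draws)
    # Trapezoidal rule over the temperature grid.
    return sum((temps[i + 1] - temps[i]) * (expected[i + 1] + expected[i]) / 2
               for i in range(n_temps))


def log_marginal_exact(y, sigma, m0, s0):
    """Closed-form log p(y) for the same model, used only as a check."""
    n, sum_y, sum_y2 = len(y), sum(y), sum(v * v for v in y)
    post_var = 1 / (n / sigma ** 2 + 1 / s0 ** 2)
    post_mean = post_var * (sum_y / sigma ** 2 + m0 / s0 ** 2)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            + 0.5 * math.log(post_var / s0 ** 2)
            - sum_y2 / (2 * sigma ** 2) - m0 ** 2 / (2 * s0 ** 2)
            + post_mean ** 2 / (2 * post_var))
```

The difference between two such estimates, one per candidate model, is an estimate of the log Bayes factor.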
|
34 |
Seleção de modelos multiníveis para dados de avaliação educacional / Selection of multilevel models for educational evaluation data
Fabiano Rodrigues Coelho, 11 August 2017
When a dataset has a hierarchical structure, one possible approach is multilevel regression modelling, which is justified when a significant portion of the variability in the data can be explained by macro-level processes. In this work, we develop the selection of multilevel regression models applied to educational data. The analysis is divided into two parts: variable selection and model selection. The latter is subdivided into two cases: classical and Bayesian modelling. Using criteria such as the Lasso, AIC, BIC, and WAIC, among others, we seek to identify the factors that influence the mathematics performance of ninth-grade elementary school students in the State of São Paulo. We also investigate how each of the variable selection and model selection criteria behaves. We conclude that, under the frequentist approach, BIC is the most efficient model selection criterion, whereas under the Bayesian approach, WAIC gave better results. Using the Lasso for variable selection under the classical approach reduced the number of predictors in the model by 34%. Finally, we find that the mathematics performance of ninth-grade elementary school students in the State of São Paulo is influenced by the following covariates: mother's educational level, frequency of book reading, time spent on recreation on school days, liking mathematics, the school's overall performance in mathematics, the student's performance in Portuguese, the school's administrative dependence, gender, father's educational level, grade repetitions, and age-grade distortion.
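For the Bayesian side of the comparison, WAIC can be computed from a matrix of pointwise log-likelihoods evaluated at posterior draws. The following is a minimal sketch of the standard formula (log pointwise predictive density minus the sum of posterior variances of the pointwise log-likelihood, reported on the deviance scale); the function name `waic` and the input layout are assumptions for this example, not taken from the thesis.

```python
import math


def waic(log_lik):
    """Compute WAIC from pointwise log-likelihoods.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and
    observation i; at least two draws are required.
    Returns (WAIC on the deviance scale, effective number of parameters).
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Numerically stable log of the mean of exp(column).
        top = max(column)
        lppd += top + math.log(sum(math.exp(c - top) for c in column) / n_draws)
        # Sample variance of the pointwise log-likelihood across draws.
        mean = sum(column) / n_draws
        p_waic += sum((c - mean) ** 2 for c in column) / (n_draws - 1)
    return -2 * (lppd - p_waic), p_waic
```

As with AIC and BIC, the model with the smaller WAIC is preferred when candidate models are fitted to the same observations.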
|
35 |
Některé postupy pro detekce změn ve statistických modelech / Some procedures for detection of changes in statistical models
Marešová, Linda, January 2017
No description available.
|
36 |
Studies on development of analytical methods to quantify protein aggregates and prediction of soluble/insoluble aggregate-formation / タンパク質の重合体に関する分析法開発及び可溶性/不溶性重合体形成予測に関する研究
Fukuda, Jun, 23 March 2015
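Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Agricultural Science / 甲第19025号 / 農博第2103号 / 新制||農||1030 (University Library) / 学位論文||H27||N4907 (Faculty of Agriculture Library) / 31976 / Division of Applied Life Sciences, Graduate School of Agriculture, Kyoto University / (Chief examiner) Prof. 加納 健司, Prof. 植田 和光, Prof. 植田 充美 / Qualifies under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Agricultural Science / Kyoto University / DFAM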
|
37 |
Study of Generalized Lomax Distribution and Change Point Problem
Alghamdi, Amani Saeed, 23 July 2018
No description available.
|
38 |
Model selection for discrete Markov random fields on graphs / Seleção de modelos para campos aleatórios Markovianos discretos sobre grafos
Frondana, Iara Moreira, 28 June 2016
In this thesis we propose a penalized maximum conditional likelihood criterion to estimate the graph of a general discrete Markov random field. We prove the almost sure convergence of the graph estimator in the case of a finite or countably infinite set of variables. Our method requires minimal assumptions on the probability distribution and, contrary to other approaches in the literature, does not need the usual positivity condition. We present several examples with a finite set of vertices and study the performance of the estimator on simulated data from these examples. We also introduce an empirical procedure based on k-fold cross-validation to select the best value of the constant in the estimator's definition and show the application of this method to two real datasets.
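One way to picture a penalized maximum conditional likelihood criterion is as a search over candidate neighbourhoods of a vertex. The sketch below is an assumption-laden illustration rather than the estimator defined in the thesis: the penalty form c · |A|^|W| · log n (with |A| the number of states and W the candidate neighbourhood), the exhaustive search over subsets, and the function names are all choices made for this example. The constant `c` plays the role of the constant that the abstract says is tuned by cross-validation.

```python
import math
from collections import Counter
from itertools import combinations


def penalized_cond_loglik(data, v, nbhd, c, n_states=2):
    """Maximised conditional log-likelihood of X_v given X_nbhd, minus a penalty.

    data is a list of tuples of discrete states, one tuple per sample.
    The penalty c * n_states**len(nbhd) * log(n) is an assumed form.
    """
    joint = Counter((row[v], tuple(row[u] for u in nbhd)) for row in data)
    context = Counter(tuple(row[u] for u in nbhd) for row in data)
    # Plug-in conditional probabilities N(a_v, a_W) / N(a_W); zero counts
    # never enter the sum, so no positivity condition is required.
    loglik = sum(count * math.log(count / context[ctx])
                 for (_, ctx), count in joint.items())
    return loglik - c * n_states ** len(nbhd) * math.log(len(data))


def estimate_neighbourhood(data, v, c=1.0):
    """Return the candidate neighbourhood of vertex v with the best penalized score."""
    others = [u for u in range(len(data[0])) if u != v]
    candidates = [w for k in range(len(others) + 1)
                  for w in combinations(others, k)]
    return max(candidates, key=lambda w: penalized_cond_loglik(data, v, w, c))
```

Estimating the neighbourhood of every vertex and joining each vertex to its estimated neighbours gives an estimate of the graph. The exhaustive search is only feasible for a small number of vertices.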
|
39 |
Algoritmos genéticos em inferência de redes gênicas / Genetic algorithms in gene network inference
Jiménez, Ray Dueñas, January 2014
Advisor: Prof. Dr. David Correa Martins Júnior / Dissertation (master's) - Universidade Federal do ABC, Programa de Pós-Graduação em Ciência da Computação (Graduate Program in Computer Science), 2014.
|
40 |
Bayesian Methods in Gaussian Graphical Models
Mitsakakis, Nikolaos, 31 August 2010
This thesis contributes to the field of Gaussian Graphical Models by exploring, numerically or theoretically, various topics in Bayesian methods for these models. It provides a number of results whose further exploration is promising and points to several directions for future research.
Gaussian Graphical Models are statistical methods for the investigation and representation of interdependencies between components of continuous random vectors. This thesis aims to investigate some issues related to the application of Bayesian methods for Gaussian Graphical Models. We adopt the popular $G$-Wishart conjugate prior $W_G(\delta,D)$ for the precision matrix. We propose an efficient sampling method for the $G$-Wishart distribution based on the Metropolis-Hastings algorithm and show its validity through a number of numerical experiments. We show that this method can be easily used to estimate the Deviance Information Criterion, providing a computationally inexpensive approach for model selection.
In addition, we look at the marginal likelihood of a graphical model given a set of data. This is proportional to the ratio of the posterior over the prior normalizing constant. We explore methods for the estimation of this ratio, focusing primarily on applying the Monte Carlo simulation method of path sampling. We also explore numerically the effect of the completion of the incomplete matrix $D^{\mathcal{V}}$, hyperparameter of the $G$-Wishart distribution, for the estimation of the normalizing constant.
We also derive a series of exact and approximate expressions for the Bayes Factor between two graphs that differ by one edge. A new theoretical result regarding the limit of the normalizing constant multiplied by the hyperparameter $\delta$ is given and its implications to the validity of an improper prior and of the subsequent Bayes Factor are discussed.
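The statement that the marginal likelihood is proportional to a ratio of normalizing constants can be written out explicitly. The display below is a sketch under stated assumptions that do not appear in the abstract: $n$ independent zero-mean observations $x_1,\dots,x_n \in \mathbb{R}^p$, the $G$-Wishart density taken proportional to $|K|^{(\delta-2)/2}\exp\{-\tfrac12 \mathrm{tr}(KD)\}$ on the cone $P_G$ of positive definite matrices with zeros matching the missing edges of $G$, and the symbols $I_G$ and $U$ introduced here for illustration.

```latex
I_G(\delta, D) = \int_{P_G} |K|^{(\delta-2)/2}
    \exp\left\{-\tfrac{1}{2}\,\mathrm{tr}(K D)\right\} dK,
\qquad
U = \sum_{i=1}^{n} x_i x_i^{\top},

p(x_1,\dots,x_n \mid G) = (2\pi)^{-np/2}\,
    \frac{I_G(\delta + n,\; D + U)}{I_G(\delta,\; D)} .
```

Under these assumptions, the Bayes factor between two graphs is the ratio of two such expressions, so the factor $(2\pi)^{-np/2}$ cancels and only the four normalizing constants remain to be computed or estimated.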
|