41

Distinguishing between Chronic and Transient Poverty in Mozambique

Groover, Kimberly Darnton 01 July 2011
The main purpose of the study is to identify household characteristics that can 1) distinguish between the chronic poor and the transient poor and 2) be feasibly implemented as targeting criteria in poverty interventions. Data for this study were drawn from Mozambique's 2008/09 Household Budget Survey and consisted of 10,832 observations. This study fills a gap in the literature by structurally determining the impact of common shocks (drought, floods and cyclones, agricultural pests, illness, death, and theft) on 1) food expenditures at the household level and 2) poverty rates at the national level. The results indicate that shocks are one of the key determinants of household food expenditures. The expected impact of shocks in aggregate increases the national poverty rate by 9%. However, the impact of specific shocks on household food expenditures varies across regions and households. Further, the variables strongly correlated with chronic poverty differ from those strongly correlated with transient poverty. These results suggest the need both to identify and enroll shock-exposed households in short-term social protection programs more rapidly and to continue improving methods for targeting the chronic poor in long-term programs. / Master of Science
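As a rough illustration of the kind of shock-impact estimation this abstract describes, the sketch below regresses log household food expenditure on shock indicators and compares poverty rates with and without the estimated shock effects. All data, variable names, and coefficients are synthetic stand-ins, not values from the Household Budget Survey.

```python
# Illustrative sketch (not the thesis's actual specification): regress log
# household food expenditure on shock indicators, then measure how removing
# the estimated shock effects shifts the share of households below a line.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic household data: two shock indicators and a base expenditure level.
drought = rng.binomial(1, 0.15, n)
illness = rng.binomial(1, 0.10, n)
log_exp = 3.0 - 0.20 * drought - 0.12 * illness + rng.normal(0, 0.5, n)

# OLS via least squares on [1, drought, illness].
X = np.column_stack([np.ones(n), drought, illness])
beta, *_ = np.linalg.lstsq(X, log_exp, rcond=None)
print("estimated shock effects:", beta[1:])

# Counterfactual: poverty rate with and without the estimated shock effects.
poverty_line = np.exp(2.7)
rate_actual = np.mean(np.exp(log_exp) < poverty_line)
rate_no_shock = np.mean(np.exp(log_exp - X[:, 1:] @ beta[1:]) < poverty_line)
print(f"poverty rate: {rate_actual:.3f} with shocks, {rate_no_shock:.3f} without")
```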
42

MULTI-STATE MODELS WITH MISSING COVARIATES

Lou, Wenjie 01 January 2016
Multi-state models have been widely used to analyze longitudinal event history data obtained in medical studies. The tools and methods developed recently in this area require completely observed datasets. However, in many applications measurements on certain components of the covariate vector are missing for some study subjects. In this dissertation, several likelihood-based methodologies are proposed to deal efficiently with different types of missing covariates when applying multi-state models. First, a maximum observed-data likelihood method is proposed for data with a univariate missing pattern where the missing covariate is categorical; the observed-data likelihood function is constructed from a model for the joint distribution of the longitudinal event history response and the discrete covariate with missing values. Second, we propose a maximum simulated likelihood method for a missing continuous covariate, approximating the observed-data likelihood function by Monte Carlo simulation. Finally, an EM algorithm is used to handle multiple missing covariates when estimating the parameters of a multi-state model; it can efficiently accommodate multiple missing discrete covariates with a general missing pattern. All the proposed methods are evaluated through simulation studies and through applications to datasets from the SMART project, a consortium of 11 high-quality longitudinal studies of aging and cognition.
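A minimal sketch of the observed-data likelihood idea for a single missing binary covariate follows: subjects with the covariate observed contribute P(y | z)P(z), while subjects missing it contribute the mixture over both possible values of z. The toy logistic response model, data, and parameterization are illustrative assumptions, not the dissertation's multi-state formulation.

```python
# Maximum observed-data likelihood with one missing binary covariate Z:
# for subjects with Z unobserved, the likelihood contribution sums
# P(y | z) P(Z = z) over the possible values of z.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(1)
n = 2000
z = rng.binomial(1, 0.4, n)                      # true covariate values
y = rng.binomial(1, expit(-0.5 + 1.2 * z))       # binary response
observed = rng.random(n) > 0.3                   # ~30% of z treated as missing

def neg_loglik(theta):
    b0, b1, logit_pi = theta
    pi = expit(logit_pi)                         # P(Z = 1)
    def p_y_given_z(zv):
        p = expit(b0 + b1 * zv)
        return np.where(y == 1, p, 1 - p)
    # Complete cases: joint density of (y, z).
    ll_obs = np.log(p_y_given_z(z)) + np.log(np.where(z == 1, pi, 1 - pi))
    # Missing cases: marginalize z out of the joint density.
    ll_mis = np.log(p_y_given_z(1) * pi + p_y_given_z(0) * (1 - pi))
    return -(ll_obs[observed].sum() + ll_mis[~observed].sum())

fit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], method="BFGS")
print("estimates (b0, b1, pi):", fit.x[0], fit.x[1], expit(fit.x[2]))
```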
43

Improved Methods and Selecting Classification Types for Time-Dependent Covariates in the Marginal Analysis of Longitudinal Data

Chen, I-Chen 01 January 2018
Generalized estimating equations (GEE) are widely used for the marginal analysis of longitudinal data. To obtain consistent regression parameter estimates, these estimating equations must be unbiased. However, when certain types of time-dependent covariates are present, the equations can be biased unless an independence working correlation structure is employed. Moreover, in this case regression parameter estimation can be very inefficient, because not all valid moment conditions are incorporated within the corresponding estimating equations. Approaches using the generalized method of moments or quadratic inference functions have therefore been proposed to utilize all valid moment conditions. However, we have found that such methods do not always provide valid inference and can also be improved upon in terms of finite-sample regression parameter estimation. We therefore propose a modified GEE approach and a selection method that together ensure the validity of inference and improve regression parameter estimation. These modified approaches assume the data analyst knows the type of time-dependent covariate, which is unlikely in practice. Whereas hypothesis testing has been used to determine covariate type, we propose a novel strategy for selecting a working covariate type that avoids the potentially high type II error rates of these testing procedures. Parameter estimates resulting from our proposed method are consistent and have overall improved mean squared error relative to hypothesis testing approaches. Finally, because mean regression models can be sensitive to skewness and outliers in real-world examples, we extend our approaches to marginal quantile regression, modeling the conditional quantiles of the response variable. Existing and proposed methods are compared in simulation studies and application examples.
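For concreteness, here is the standard GEE fit this abstract builds on: with time-dependent covariates, an independence working correlation keeps the estimating equations unbiased, at a possible cost in efficiency. The sketch uses statsmodels on synthetic longitudinal data; the data-generating model is an assumption for illustration only.

```python
# Marginal GEE fit with an independence working correlation structure.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_subj, n_time = 200, 4
groups = np.repeat(np.arange(n_subj), n_time)
x_time = rng.normal(size=n_subj * n_time)              # a time-varying covariate
subj_eff = np.repeat(rng.normal(0, 0.7, n_subj), n_time)
y = 1.0 + 0.5 * x_time + subj_eff + rng.normal(0, 1.0, n_subj * n_time)

exog = sm.add_constant(x_time)
model = sm.GEE(y, exog, groups=groups,
               family=sm.families.Gaussian(),
               cov_struct=sm.cov_struct.Independence())
result = model.fit()
print(result.params)     # intercept and slope estimates
print(result.bse)        # robust (sandwich) standard errors
```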
44

Advanced Designs of Cancer Phase I and Phase II Clinical Trials

Cui, Ye 13 May 2013
The clinical trial is the most important study for the development of successful novel drugs. The aim of this dissertation is to develop innovative statistical methods to overcome three main obstacles in clinical trials: (1) lengthy trial duration and inaccurate maximum tolerated dose (MTD) estimation in phase I trials; (2) heterogeneity in drug effect when patients are given the same prescription at the same dose; and (3) high failure rates of expensive phase III confirmatory trials due to the discrepancy between the endpoints adopted in phase II and III trials. To overcome the first obstacle, we develop a hybrid design for the time-to-event dose escalation method with overdose control using a normalized equivalent toxicity score (NETS) system. This hybrid design can substantially reduce sample size, shorten study length, and estimate the MTD accurately by employing a parametric model and an adaptive Bayesian approach. To overcome the second obstacle, we propose a phase I design that incorporates patients' characteristics, using personalized information on the patients who participate in the trial. To conquer the third obstacle, we propose a novel two-stage screening design for phase II trials, whereby the endpoint of percent change in tumor size is used in an initial screening to select potentially effective agents within a short time interval, followed by a second screening stage in which progression-free survival is estimated to confirm the efficacy of those agents. These research projects will substantially benefit both cancer patients and researchers by improving clinical trial efficiency and reducing cost and trial duration. Moreover, they are of great practical importance, since cancer medicine development is paramount to human health care.
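The following toy sketch illustrates the overdose-control logic in a Bayesian dose-escalation design: each dose carries a Beta-Binomial toxicity model, and escalation is permitted only while the posterior probability of exceeding the target toxicity rate stays below a bound. The prior, bound, and cohort data are hypothetical, and the sketch omits the NETS scoring and time-to-event components of the proposed hybrid design.

```python
# Toy Bayesian dose escalation with overdose control (Beta-Binomial model).
import numpy as np
from scipy.stats import beta

target_tox, overdose_bound = 0.30, 0.25
doses = [0, 1, 2, 3]
tox = {d: [1, 1] for d in doses}     # Beta(1, 1) prior: [alpha, beta]

def prob_overdose(d):
    a, b = tox[d]
    return 1 - beta.cdf(target_tox, a, b)   # P(tox rate > target | data)

def update(d, n_patients, n_tox):
    # Conjugate update: add toxicities to alpha, non-toxicities to beta.
    tox[d][0] += n_tox
    tox[d][1] += n_patients - n_tox

# Simulated cohorts: observe toxicities, then check whether escalation is safe.
update(0, 3, 0)
update(1, 3, 1)
for d in doses:
    print(f"dose {d}: P(overdose) = {prob_overdose(d):.3f}, "
          f"escalation allowed: {prob_overdose(d) < overdose_bound}")
```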
45

Learning under differing training and test distributions

Bickel, Steffen January 2008
One of the main problems in machine learning is to train a predictive model from training data and to make predictions on test data. Most predictive models are constructed under the assumption that the training data are governed by exactly the same distribution to which the model will later be exposed. In practice, control over the data collection process is often imperfect. A typical scenario arises when labels are collected by questionnaire and one does not have access to the test population: parts of the test population are underrepresented in the survey, out of reach, or do not return the questionnaire. In many applications, training data from the test distribution are scarce because they are difficult or very expensive to obtain, while data from auxiliary sources drawn from similar distributions are cheaply available. This thesis centers on learning under differing training and test distributions and covers several problem settings with different assumptions on the relationship between the two distributions, including multi-task learning and learning under covariate shift and sample selection bias. Several new models are derived that directly characterize the divergence between training and test distributions, without the intermediate step of estimating each distribution separately. The integral part of these models is a set of rescaling weights that match the rescaled or resampled training distribution to the test distribution. Integrated models are studied in which only one optimization problem needs to be solved for learning under differing distributions. With a two-step approximation to the integrated models, almost any supervised learning algorithm can be adapted to biased training data. In case studies on spam filtering, HIV therapy screening, targeted advertising, and other applications, the performance of the new models is compared to state-of-the-art reference methods.
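The core reweighting idea, estimating the density ratio between test and training inputs discriminatively rather than by estimating either density, can be sketched as follows, with scikit-learn as a stand-in for the thesis's integrated models; the shifted Gaussian data are assumptions for illustration.

```python
# Discriminative density-ratio reweighting for covariate shift: a classifier
# that separates training from test inputs yields importance weights
# p_test(x)/p_train(x) without estimating either density directly.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X_train = rng.normal(0.0, 1.0, size=(1000, 1))          # training inputs
X_test = rng.normal(0.8, 1.0, size=(1000, 1))           # shifted test inputs
y_train = (X_train[:, 0] + rng.normal(0, 0.5, 1000) > 0).astype(int)

# Discriminate train (label 0) vs test (label 1); odds give the weights.
disc = LogisticRegression().fit(
    np.vstack([X_train, X_test]),
    np.concatenate([np.zeros(1000), np.ones(1000)]))
p = disc.predict_proba(X_train)[:, 1]
weights = p / (1 - p)                 # approx p_test(x) / p_train(x)

# Weighted training now targets the test distribution.
clf = LogisticRegression().fit(X_train, y_train, sample_weight=weights)
print("reweighted coefficients:", clf.coef_, clf.intercept_)
```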
46

Development and Evaluation of Nonparametric Mixed Effects Models

Baverel, Paul January 2011
A nonparametric population approach is now accessible to a broader community of modelers thanks to its recent implementation in the popular NONMEM application, which had previously been limited to standard parametric approaches for the analysis of pharmacokinetic and pharmacodynamic data. The aim of this thesis was to assess the relative merits and downsides of nonparametric models in a nonlinear mixed effects framework, in comparison with a set of parametric models developed in NONMEM based on real datasets and on simple experimental settings, and to develop new diagnostic tools adapted to nonparametric models. Nonparametric models as implemented in NONMEM VI showed better overall simulation properties and predictive performance than standard parametric models, with significantly less bias and imprecision in the outcomes of numerical predictive checks (NPC) across 25 real data designs. This evaluation was extended by a simulation study comparing the relative predictive performance of nonparametric and parametric models across three different validation procedures assessed by NPC. The usefulness of a nonparametric estimation step in diagnosing distributional assumptions about parameters was then demonstrated through the development and application of two bootstrapping techniques for estimating the imprecision of nonparametric parameter distributions. Finally, a novel covariate modeling approach intended for nonparametric models was developed, with good statistical properties for the identification of predictive covariates. In conclusion, by relaxing the classical normality assumption on the distribution of model parameters, and given the set of diagnostic tools developed, the nonparametric approach in NONMEM constitutes an attractive alternative to the routinely used parametric approach and an improvement for efficient data analysis.
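A numerical predictive check can be sketched generically: simulate many replicate datasets from the fitted model and compare the observed coverage of the model-predicted intervals to their nominal levels. The lognormal model below is an assumed stand-in, not NONMEM output.

```python
# Minimal numerical predictive check (NPC): does observed data fall inside
# model-predicted intervals at their nominal coverage?
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_sim = 300, 1000
observed = rng.lognormal(mean=1.0, sigma=0.5, size=n_obs)

# Simulate replicate datasets from the (here, assumed-known) fitted model.
simulated = rng.lognormal(mean=1.0, sigma=0.5, size=(n_sim, n_obs))

for nominal in (0.50, 0.80, 0.95):
    lo = np.quantile(simulated, (1 - nominal) / 2, axis=0)
    hi = np.quantile(simulated, 1 - (1 - nominal) / 2, axis=0)
    cover = np.mean((observed >= lo) & (observed <= hi))
    print(f"nominal {nominal:.0%} interval: observed coverage {cover:.3f}")
```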
47

Bayesian and classical inference for extensions of the Geometric Exponential distribution with applications in survival analysis in the presence of covariates and random censoring

Gianfelice, Paulo Roberto de Lima. January 2020
Advisor: Fernando Antonio Moala / Abstract: This work presents a study of probabilistic modeling, with applications to survival analysis, based on a probabilistic model called Exponential Geometric (EG), which offers great flexibility for the statistical estimation of its parameters from samples of complete and censored lifetime data. In this study, the concepts of estimators and lifetime data under random censoring are explored in two extensions of the EG model: the Extended Geometric Exponential (EEG) and the Generalized Extreme Geometric Exponential (GE2). Exclusively for the EEG model, the work also considers covariates indexed in the rate parameter as a second source of variation, adding even more flexibility to the model; exclusively for the GE2 model, an analysis of the convergence of its moments, hitherto ignored, is proposed. Statistical inference is carried out for these extensions in order to derive (in the classical context) their maximum likelihood estimators and asymptotic confidence intervals, and (in the Bayesian context) their prior and posterior distributions, in both cases to estimate their parameters under random censoring, with covariates in the case of the EEG. The Bayesian estimators are developed under the assumptions that the priors are vague, follow a Gamma distribution, and are independent across the unknown parameters. The results of this work are drawn from a detailed statistical simulation study applied to... (Complete abstract: click electronic access below) / Master
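A sketch of maximum-likelihood estimation for the base EG model under random censoring is shown below, assuming the common Adamidis-Loukas parameterization f(t) = b(1-p)exp(-bt)/(1 - p exp(-bt))^2 with survival S(t) = (1-p)exp(-bt)/(1 - p exp(-bt)); the simulation settings and the exponential censoring mechanism are illustrative assumptions.

```python
# MLE for the Exponential Geometric (EG) model under random censoring.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
b_true, p_true, n = 1.0, 0.5, 1000

# Simulate EG lifetimes by inverting S(t), then apply exponential censoring.
u = rng.uniform(1e-12, 1, n)                      # u = S(t)
t = -np.log(u / (1 - p_true + p_true * u)) / b_true
c = rng.exponential(3.0, n)
time, delta = np.minimum(t, c), (t <= c).astype(float)   # delta = 1: event seen

def neg_loglik(theta):
    # Reparameterize to enforce b > 0 and 0 < p < 1.
    b, p = np.exp(theta[0]), 1 / (1 + np.exp(-theta[1]))
    e = np.exp(-b * time)
    log_f = np.log(b) + np.log(1 - p) - b * time - 2 * np.log(1 - p * e)
    log_S = np.log(1 - p) - b * time - np.log(1 - p * e)
    # Events contribute the density; censored times contribute survival.
    return -np.sum(delta * log_f + (1 - delta) * log_S)

fit = minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
b_hat, p_hat = np.exp(fit.x[0]), 1 / (1 + np.exp(-fit.x[1]))
print(f"b_hat = {b_hat:.3f}, p_hat = {p_hat:.3f}")
```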
48

Considerations for Identifying and Conducting Cluster Randomized Trials

Al-Jaishi, Ahmed January 2021
Background: The cluster randomized trial (CRT) design randomly assigns groups of people to different treatment arms. This dissertation aimed to (1) develop machine learning algorithms to identify cluster trials in bibliographic databases, (2) assess the reporting of methodological and ethical elements in hemodialysis-related cluster trials, and (3) assess how well two covariate-constrained randomization methods balance baseline characteristics compared with simple randomization. Methods: In study 1, we developed three machine learning algorithms that classify whether a bibliographic citation is a CRT report, using only the information available in the citation: title, abstract, keywords, and subject headings. In study 2, we conducted a systematic review of CRTs in the hemodialysis setting to review the reporting of key methodological and ethical issues, covering CRTs published in English between 2000 and 2019 and indexed in MEDLINE or EMBASE. In study 3, we compared two covariate-constrained randomization methods against simple randomization with respect to balance on baseline characteristics. Results: In study 1, we developed high-performance algorithms that identified whether a citation was a CRT, with greater than 97% sensitivity and 77% specificity. In study 2, we found suboptimal conduct and reporting of methodological issues in hemodialysis-setting CRTs, and incomplete reporting of key ethical issues. In study 3, in which we randomized 72 clusters, constraining the randomization using historical information achieved better balance on baseline characteristics than simple randomization; however, the magnitude of benefit was modest. Conclusions: The results of this dissertation will help researchers quickly identify cluster trials in bibliographic databases (study 1) and inform the design and analysis of future Canadian trials conducted within the hemodialysis setting (studies 2 and 3). / Thesis / Doctor of Philosophy (PhD) / The cluster trial design randomly assigns groups of people, rather than individuals, to different treatment arms. Cluster trials are commonly used in research areas such as education, public health, and health services research; examples of clusters include villages and communities, worksites, schools, hospitals, hospital wards, and physicians. This dissertation aimed to (1) develop machine learning algorithms to identify cluster trials in bibliographic databases, (2) assess the reporting of methodological and ethical elements in hemodialysis-related cluster trials, and (3) identify best practices for randomly assigning hemodialysis centers in cluster trials. We conducted three studies to address these aims. Their results will help researchers quickly identify cluster trials in bibliographic databases (study 1) and inform the design and analysis of future Canadian trials conducted within the hemodialysis setting (studies 2 and 3).
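Study 3's covariate-constrained randomization can be sketched as follows: sample many candidate allocations of clusters to two arms, score each by covariate imbalance, and randomize within the best-balanced subset. The imbalance metric, covariates, and constraint fraction below are hypothetical choices, not the dissertation's exact procedure.

```python
# Covariate-constrained randomization of 72 clusters to two arms.
import numpy as np

rng = np.random.default_rng(6)
n_clusters = 72
covs = rng.normal(size=(n_clusters, 3))   # e.g., size, age mix, baseline rate

def imbalance(alloc):
    # Sum of squared standardized differences in covariate means between arms.
    diff = covs[alloc == 1].mean(axis=0) - covs[alloc == 0].mean(axis=0)
    return float(np.sum((diff / covs.std(axis=0)) ** 2))

candidates = []
for _ in range(10_000):
    alloc = np.zeros(n_clusters, dtype=int)
    alloc[rng.choice(n_clusters, n_clusters // 2, replace=False)] = 1
    candidates.append((imbalance(alloc), alloc))

# Keep the 10% best-balanced allocations, then pick one at random.
candidates.sort(key=lambda t: t[0])
constrained_set = candidates[: len(candidates) // 10]
score, chosen = constrained_set[rng.integers(len(constrained_set))]
print(f"chosen allocation imbalance score: {score:.4f}")
```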
49

Bayesian Approaches for Synthesising Evidence in Health Technology Assessment

McCarron, Catherine Elizabeth 04 1900
ABSTRACT
Background and Objectives: Informed health care decision making depends on the available evidence base. Where the available evidence comes from different sources, methods are required that can synthesise all of it. The synthesis of different types of evidence poses various methodological challenges. The objective of this thesis is to investigate the use of Bayesian methods for combining evidence on effects from randomised and non-randomised studies, and for combining additional evidence from the literature with patient-level trial data.
Methods: Using a Bayesian three-level hierarchical model, an approach was proposed to combine evidence from randomised and non-randomised studies while adjusting for potential imbalances in patient covariates. The proposed approach was compared to four other Bayesian methods using a case study of endovascular versus open surgical repair for the treatment of abdominal aortic aneurysms. To assess the performance of the proposed approach beyond this single applied example, a simulation study was conducted examining a series of Bayesian approaches under a variety of scenarios. Subsequent research focussed on the use of informative prior distributions to integrate additional evidence with patient-level data in a Bayesian cost-effectiveness analysis comparing endovascular and open surgical repair in terms of incremental costs and life years gained.
Results and Conclusions: The shift in the estimated odds ratios towards those of the more balanced randomised studies, observed in the case study, suggested that the proposed Bayesian approach was capable of adjusting for imbalances; these results were reinforced in the simulation study. The impact of the informative priors in increasing the estimated mean life years in the control group demonstrated the potential importance of incorporating all available evidence in the context of an economic evaluation. In addressing these issues, this research contributes to comprehensive evidence-based decision making in health care. / Doctor of Philosophy (PhD)
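A toy version of the hierarchical-synthesis idea appears below: study-level log odds ratios are nested within design type (randomised vs non-randomised), and design-type means are nested around an overall effect. Variance components are fixed so that each Gibbs update is conjugate normal; the thesis's model is richer (covariate adjustment, estimated variances), and all numbers here are made up.

```python
# Three-level Gibbs sketch: studies within design type, types around an
# overall effect, with fixed variance components for conjugacy.
import numpy as np

rng = np.random.default_rng(7)
y = np.array([-0.3, -0.1, -0.4, -0.6, -0.5])      # study log odds ratios
se = np.array([0.15, 0.20, 0.25, 0.30, 0.25])     # their standard errors
dtype = np.array([0, 0, 1, 1, 1])                 # 0 = randomised, 1 = non-randomised
tau_w, tau_b = 0.2, 0.3                           # fixed within-type / between-type SDs

mu_t, mu = np.zeros(2), 0.0
draws = []
for it in range(5000):
    # Update each design-type mean given its studies and the overall mu.
    for t in (0, 1):
        m = dtype == t
        prec = np.sum(1 / (se[m] ** 2 + tau_w ** 2)) + 1 / tau_b ** 2
        mean = (np.sum(y[m] / (se[m] ** 2 + tau_w ** 2)) + mu / tau_b ** 2) / prec
        mu_t[t] = rng.normal(mean, prec ** -0.5)
    # Update the overall effect given the two type means (flat prior on mu).
    mu = rng.normal(mu_t.mean(), (tau_b ** 2 / 2) ** 0.5)
    draws.append(mu)

post = np.array(draws[1000:])
print(f"overall log-OR: mean {post.mean():.3f}, 95% CrI "
      f"({np.quantile(post, 0.025):.3f}, {np.quantile(post, 0.975):.3f})")
```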
50

Statistical Modeling and Predictions Based on Field Data and Dynamic Covariates

Xu, Zhibing 12 December 2014
Reliability analysis plays an important role in keeping manufacturers competitive. It can be applied in many areas, such as warranty prediction, maintenance scheduling, spare parts provisioning, and risk assessment. This dissertation focuses on statistical modeling and prediction based on lifetime data, degradation data, and recurrent event data. The datasets used in this dissertation come from the field and have complicated structures. The dissertation consists of three main chapters, in addition to Chapter 1, the introduction, and Chapter 5, the general conclusion. Chapter 2 addresses traditional time-to-failure data analysis. We propose a statistical method for failure data from a home appliance that accounts for retirement times and delayed reporting, and we develop a prediction method based on the proposed model. Using information on the retirement-time distribution and delayed reporting, the predictions are more accurate and more useful for decision making. In Chapter 3, we introduce a nonlinear mixed-effects general path model to incorporate dynamic covariates into degradation data analysis. Dynamic covariates include time-varying environmental variables and usage conditions. The shapes of the covariate effect functions may be constrained to be, for example, monotonically increasing (i.e., higher temperature is likely to cause more damage). Incorporating dynamic covariates with shape restrictions is challenging, so a modified alternating algorithm and a corresponding prediction method are proposed. In Chapter 4, we introduce a multi-level trend-renewal process (MTRP) model to describe component-level events in multi-level repairable systems. In particular, we consider two-level repairable systems in which events can occur at the subsystem level or at the component (within-subsystem) level. The main goal is to develop a method for estimating the model parameters and a procedure for predicting future component-level replacement events while accounting for the effects of subsystem replacement events. To explain unit-to-unit variability, time-dependent covariates as well as random effects are introduced into the heterogeneous MTRP model (HMTRP). A Metropolis-within-Gibbs algorithm is used to estimate the unknown parameters of the HMTRP model. The proposed method is illustrated on a simulated dataset. / Ph. D.
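The Metropolis-within-Gibbs machinery used for the HMTRP can be illustrated on a much simpler event-history model: a power-law nonhomogeneous Poisson process with two parameters updated one at a time via random-walk Metropolis steps on the log scale. The process, priors, proposal scales, and data are illustrative assumptions, not the MTRP itself.

```python
# Metropolis-within-Gibbs skeleton for a power-law event process with
# intensity lambda(t) = (beta/eta) * (t/eta)^(beta-1) on (0, T].
import numpy as np

rng = np.random.default_rng(8)
events = np.sort(rng.uniform(0, 10, 40))   # toy event times
T = 10.0

def log_post(eta, beta):
    if eta <= 0 or beta <= 0:
        return -np.inf
    # NHPP log-likelihood plus vague log-normal priors on both parameters.
    ll = (np.sum(np.log(beta / eta) + (beta - 1) * np.log(events / eta))
          - (T / eta) ** beta)
    lp = -0.5 * (np.log(eta) ** 2 + np.log(beta) ** 2)
    return ll + lp

eta, beta, chain = 5.0, 1.0, []
for it in range(10_000):
    for name in ("eta", "beta"):
        prop_eta = eta * np.exp(0.1 * rng.normal()) if name == "eta" else eta
        prop_beta = beta * np.exp(0.1 * rng.normal()) if name == "beta" else beta
        # Multiplicative proposal: include the log-scale Jacobian correction.
        log_acc = (log_post(prop_eta, prop_beta) - log_post(eta, beta)
                   + np.log(prop_eta * prop_beta) - np.log(eta * beta))
        if np.log(rng.random()) < log_acc:
            eta, beta = prop_eta, prop_beta
    chain.append((eta, beta))

samples = np.array(chain[2000:])
print("posterior means (eta, beta):", samples.mean(axis=0))
```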
