Global ETD Search

31	Prediction of recurrent events Fredette, Marc January 2004 In this thesis, we study issues related to prediction problems, with an emphasis on those arising when recurrent events are involved. The first chapter defines the basic concepts of frequentist and Bayesian statistical prediction. In the second chapter, we study frequentist prediction intervals and their associated predictive distributions. We then present an approach based on asymptotically uniform pivotals that is shown to dominate the plug-in approach under certain conditions. The following three chapters consider the prediction of recurrent events. The third chapter presents different prediction models when these events can be modeled using homogeneous Poisson processes. Amongst these models, those using random effects are shown to possess interesting features. In the fourth chapter, the time homogeneity assumption is relaxed and we present prediction models for non-homogeneous Poisson processes. The behavior of these models is then studied for prediction problems with a finite horizon. In the fifth chapter, we apply the concepts discussed previously to a warranty dataset from the automobile industry. Because the number of processes in this dataset is very large, we focus on methods providing computationally rapid prediction intervals. Finally, we discuss possibilities for future research in the last chapter. Statistics Prediction methods random effects models longitudinal study nonhomogeneous poisson processes coverage probability calibration approximate pivotals empirical bayes
32	Gene-Environment Interaction and Extension to Empirical Hierarchical Bayes Models in Genome-Wide Association Studies Viktorova, Elena 17 June 2014 No description available. 610 gene-environment interaction genome-wide association studies empirical Bayes lung cancer
33	Comparing survival from cancer using population-based cancer registry data - methods and applications Yu, Xue Qin January 2007 Doctor of Philosophy / Over the past decade, population-based cancer registry data have been used increasingly worldwide to evaluate and improve the quality of cancer care. The utility of the conclusions from such studies relies heavily on the data quality and the methods used to analyse the data. Interpretation of comparative survival from such data, examining either temporal trends or geographical differences, is generally not easy. The observed differences could be due to methodological and statistical approaches or to real effects. For example, geographical differences in cancer survival could be due to a number of real factors, including access to primary health care, the availability of diagnostic and treatment facilities and the treatment actually given, or to artefact, such as lead-time bias, stage migration, sampling error or measurement error. Likewise, a temporal increase in survival could be the result of earlier diagnosis and improved treatment of cancer; it could also be due to artefact after the introduction of screening programs (adding lead time), changes in the definition of cancer, stage migration or several of these factors, producing both real and artefactual trends. In this thesis, I report methods that I modified and applied, some technical issues in the use of such data, and an analysis of data from the State of New South Wales (NSW), Australia, illustrating their use in evaluating and potentially improving the quality of cancer care, showing how data quality might affect the conclusions of such analyses. This thesis describes studies of comparative survival based on population-based cancer registry data, with three published papers and one accepted manuscript (subject to minor revision). 
In the first paper, I describe a modified method for estimating spatial variation in cancer survival using empirical Bayes methods (published in Cancer Causes and Control 2004). I demonstrate in this paper that the empirical Bayes method is preferable to standard approaches and show how it can be used to identify cancer types where a focus on reducing area differentials in survival might lead to important gains in survival. In the second paper (published in the European Journal of Cancer 2005), I apply this method to a more complete analysis of spatial variation in survival from colorectal cancer in NSW and show that estimates of spatial variation in colorectal cancer can help to identify subgroups of patients for whom better application of treatment guidelines could improve outcome. I also show how estimates of the numbers of lives that could be extended might assist in setting priorities for treatment improvement. In the third paper, I examine time trends in survival from 28 cancers in NSW between 1980 and 1996 (published in the International Journal of Cancer 2006) and conclude that for many cancers, falls in excess deaths in NSW from 1980 to 1996 are unlikely to be attributable to earlier diagnosis or stage migration; thus, advances in cancer treatment have probably contributed to them. In the accepted manuscript, I describe an extension of the work reported in the second paper, investigating the accuracy of staging information recorded in the registry database and assessing the impact of error in its measurement on estimates of spatial variation in survival from colorectal cancer. The results indicate that misclassified registry stage can have an important impact on estimates of spatial variation in stage-specific survival from colorectal cancer. Thus, if cancer registry data are to be used effectively in evaluating and improving cancer care, the quality of stage data might have to be improved. 
Taken together, the four papers show that creative, informed use of population-based cancer registry data, with appropriate statistical methods and acknowledgement of the limitations of the data, can be a valuable tool for evaluating and possibly improving cancer care. Use of these findings to stimulate evaluation of the quality of cancer care should enhance the value of the investment in cancer registries. They should also stimulate improvement in the quality of cancer registry data, particularly that on stage at diagnosis. The methods developed in this thesis may also be used to improve estimation of geographical variation in other count-based health measures when the available data are sparse.
34	The Safety Impact of Raising Speed Limit on Rural Freeways In Ohio Olufowobi, Oluwaseun Temitope 01 September 2020 No description available. Transportation Transportation Planning Civil Engineering Speed Limit Traffic Safety Rural Freeways Empirical Bayes Negative Binomial Safety Performance Function
35	Calibration of the Highway Safety Manual Safety Performance Function and Development of Jurisdiction-Specific Models for Rural Two-Lane Two-Way Roads in Utah Brimley, Bradford Keith 17 March 2011 This thesis documents the results of the calibration of the Highway Safety Manual (HSM) safety performance function (SPF) for rural two-lane two-way roadway segments in Utah and the development of new SPFs using negative binomial and hierarchical Bayesian modeling techniques. SPFs estimate the safety of a roadway entity, such as a segment or intersection, in terms of number of crashes. The new SPFs were developed for comparison to the calibrated HSM SPF. This research was performed for the Utah Department of Transportation (UDOT). The study area was the state of Utah. Crash data from 2005-2007 on 157 selected study segments provided a 3-year observed crash frequency to obtain a calibration factor for the HSM SPF and develop new SPFs. The calibration factor for the HSM SPF for rural two-lane two-way roads in Utah is 1.16. This indicates that the HSM underpredicts the number of crashes on rural two-lane two-way roads in Utah by sixteen percent. The new SPFs were developed from the same data that were collected for the HSM calibration, with the addition of new data variables that were hypothesized to have a significant effect on crash frequencies. Negative binomial regression was used to develop four new SPFs, and one additional SPF was developed using hierarchical (or full) Bayesian techniques. The empirical Bayes (EB) method can be applied with each negative binomial SPF because the models include an overdispersion parameter used with the EB method. The hierarchical Bayesian technique is a newer, more mathematically intense method that accounts for high levels of uncertainty often present in crash modeling. 
Because the hierarchical Bayesian SPF produces a density function of a predicted crash frequency, a comparison of this density function with an observed crash frequency can help identify segments with significant safety concerns. Each SPF has its own strengths and weaknesses, which include its data requirements and predicting capability. This thesis recommends that UDOT use Equation 5-11 (a new negative binomial SPF) for predicting crashes, because it predicts crashes with reasonable accuracy while requiring much less data than other models. The hierarchical Bayesian process should be used for evaluating observed crash frequencies to identify segments that may benefit from roadway safety improvements. safety performance functions Highway Safety Manual crash modification factors negative binomial empirical Bayes hierarchical Bayes safety Civil and Environmental Engineering
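As a rough illustration of the arithmetic behind a calibration factor and the EB adjustment mentioned in the abstract above, the following sketch assumes the commonly used forms: the calibration factor is total observed crashes divided by total predicted crashes, and the EB estimate is a weighted average of predicted and observed crashes with weight 1 / (1 + k × predicted), where k is the overdispersion parameter. The function names, the sample counts, and the overdispersion value are hypothetical and are not taken from the thesis.

```python
def calibration_factor(observed, predicted):
    """Ratio of total observed crashes to total (uncalibrated) predicted crashes."""
    total_predicted = sum(predicted)
    if total_predicted <= 0:
        raise ValueError("predicted crashes must sum to a positive value")
    return sum(observed) / total_predicted


def eb_expected_crashes(observed, predicted, overdispersion):
    """Empirical Bayes estimate for one site over a single study period.

    The weight shrinks toward the observed count as the overdispersion
    parameter or the predicted count grows.
    """
    weight = 1.0 / (1.0 + overdispersion * predicted)
    return weight * predicted + (1.0 - weight) * observed


# Hypothetical three-year counts for three segments (not data from the thesis).
observed = [4, 1, 7]
predicted = [3.0, 2.0, 5.0]

c = calibration_factor(observed, predicted)
calibrated = [c * p for p in predicted]

# Assumed overdispersion parameter of 0.5, chosen only for illustration.
eb = [
    eb_expected_crashes(o, p, overdispersion=0.5)
    for o, p in zip(observed, calibrated)
]

print(f"calibration factor: {c:.2f}")
for o, p, e in zip(observed, calibrated, eb):
    print(f"observed={o}, calibrated prediction={p:.2f}, EB estimate={e:.2f}")
```

A factor above 1, as in the Utah result of 1.16, means the uncalibrated model predicts fewer crashes than were observed. Each EB estimate falls between the site's observed count and its calibrated prediction.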
36	Crash Prediction Modeling for Curved Segments of Rural Two-Lane Two-Way Highways in Utah Knecht, Casey Scott 01 December 2014 This thesis contains the results of the development of crash prediction models for curved segments of rural two-lane two-way highways in the state of Utah. The modeling effort included the calibration of the predictive model found in the Highway Safety Manual (HSM) as well as the development of Utah-specific models using negative binomial regression. The data for these models came from randomly sampled curved segments in Utah, with crash data coming from years 2008-2012. The total number of randomly sampled curved segments was 1,495. The HSM predictive model for rural two-lane two-way highways consists of a safety performance function (SPF), crash modification factors (CMFs), and a jurisdiction-specific calibration factor. For this research, two sample periods were used: a three-year period from 2010 to 2012 and a five-year period from 2008 to 2012. The calibration factor for the HSM predictive model was determined to be 1.50 for the three-year period and 1.60 for the five-year period. These factors are to be used in conjunction with the HSM SPF and all applicable CMFs. A negative binomial model was used to develop Utah-specific crash prediction models based on both the three-year and five-year sample periods. A backward stepwise regression technique was used to isolate the variables that would significantly affect highway safety. The independent variables used for negative binomial regression included the same set of variables used in the HSM predictive model along with other variables such as speed limit and truck traffic that were considered to have a significant effect on potential crash occurrence. The significant variables at the 95 percent confidence level were found to be average annual daily traffic, segment length, total truck percentage, and curve radius. 
The main benefit of the Utah-specific crash prediction models is that they provide a reasonable level of accuracy for crash prediction yet only require four variables, thus requiring much less effort in data collection compared to using the HSM predictive model. Highway Safety Manual safety performance functions crash modification factors negative binomial empirical Bayes safety horizontal curvature Civil and Environmental Engineering
37	Analyzing the Safety Effects of Edge Lane Roads for All Road Users Lamera, Marcial F 01 September 2020 This thesis is one of the first studies to analyze the safety effects of Edge Lane Roads (ELR) for all road users. This is important since ELRs can be a solution to many issues, such as alleviating congestion, increasing multimodality along roadways, and reducing maintenance costs. ELRs in both North America and Australia were observed. Starting with the North American ELRs, the following study designs were employed to estimate the safety of ELRs: (a) yoked comparison where each ELR installation was matched with at least two comparable 2-lane roads to serve as comparison sites and (b) an Empirical Bayes (EB) before/after analysis for ELR sites where requisite data on AADT and other relevant characteristics were available. Crash data were collected and compiled into four different groups: ELR before implementation, ELR after implementation, comparison site before ELR implementation, and comparison site after ELR implementation. The yoked comparison showed that 9 of the 13 sites had lower crash counts than their respective comparison sites. The EB analysis showed that all 11 ELRs observed demonstrated a reduction in crashes. Moving to the Australian ELRs, the following study designs were employed: (c) analysis of general crash counts/trends, and (d) reverse EB analysis. The analysis of general crash counts and trends showed that each of the Australian ELRs exhibited very low numbers of crashes over 5 years, which further suggests that these facilities are safe. In the reverse EB analysis, 5 of the 8 ELR sites demonstrated a reduction in crashes. Overall, the results were generally favorable and indicated that ELRs provided a safer experience for cyclists, drivers, and pedestrians. More analysis is recommended as more data becomes available on these ELRs. 
Examples of this include using pedestrian and bicycle data to better understand the safety effects that vulnerable road users (VRUs) experience on North American facilities or gathering enough crash data to conduct 3-year reverse EB analyses for ELRs that were expanded to 2-lane roads. Hence, a recommendation can be made to implement a few experimental ELRs in rural locations throughout the State of California to help it meet its SB-1 objectives. Edge Lane Roads Advisory Bike Lanes Before/After Evaluation Empirical Bayes Analysis Transportation Engineering Urban, Community and Regional Planning
38	TESTING FOR DIFFERENTIALLY EXPRESSED GENES AND KEY BIOLOGICAL CATEGORIES IN DNA MICROARRAY ANALYSIS SARTOR, MAUREEN A. January 2007 No description available. microarray empirical Bayes hierarchical Bayesian model splines gene set enrichment analysis microarray data analysis posterior predictive p-values
39	Deep Learning for Ordinary Differential Equations and Predictive Uncertainty Yijia Liu (17984911) 19 April 2024 Deep neural networks (DNNs) have demonstrated outstanding performance in numerous tasks such as image recognition and natural language processing. However, in dynamic systems modeling, the tasks of estimating and uncovering the potentially nonlinear structure of systems represented by ordinary differential equations (ODEs) pose a significant challenge. In this dissertation, we employ DNNs to enable precise and efficient parameter estimation of dynamic systems. In addition, we introduce a highly flexible neural ODE model to capture both nonlinear and sparse dependent relations among multiple functional processes. Nonetheless, DNNs are susceptible to overfitting and often struggle to accurately assess predictive uncertainty despite their widespread success across various AI domains. The challenge of defining meaningful priors for DNN weights and characterizing predictive uncertainty persists. In this dissertation, we present a novel neural adaptive empirical Bayes framework with a new class of prior distributions to address weight uncertainty.
In the first part, we propose a precise and efficient approach utilizing DNNs for estimation and inference of ODEs given noisy data. The DNNs are employed directly as a nonparametric proxy for the true solution of the ODEs, eliminating the need for numerical integration and resulting in significant computational time savings. We develop a gradient descent algorithm to estimate both the DNNs solution and the parameters of the ODEs by optimizing a fidelity-penalized likelihood loss function. This ensures that the derivatives of the DNNs estimator conform to the system of ODEs. Our method is particularly effective in scenarios where only a set of variables transformed from the system components by a given function are observed. We establish the convergence rate of the DNNs estimator and demonstrate that the derivatives of the DNNs solution asymptotically satisfy the ODEs determined by the inferred parameters. Simulations and real data analysis of COVID-19 daily cases are conducted to show the superior performance of our method in terms of accuracy of parameter estimates and system recovery, and computational speed.
In the second part, we present a novel sparse neural ODE model to characterize flexible relations among multiple functional processes. This model represents the latent states of the functions using a set of ODEs and models the dynamic changes of these states utilizing a DNN with a specially designed architecture and sparsity-inducing regularization. Our new model is able to capture both nonlinear and sparse dependent relations among multivariate functions. We develop an efficient optimization algorithm to estimate the unknown weights for the DNN under the sparsity constraint. Furthermore, we establish both algorithmic convergence and selection consistency, providing theoretical guarantees for the proposed method. We illustrate the efficacy of the method through simulation studies and a gene regulatory network example.
In the third part, we introduce a class of implicit generative priors to facilitate Bayesian modeling and inference. These priors are derived through a nonlinear transformation of a known low-dimensional distribution, allowing us to handle complex data distributions and capture the underlying manifold structure effectively. Our framework combines variational inference with a gradient ascent algorithm, which serves to select the hyperparameters and approximate the posterior distribution. Theoretical justification is established through both the posterior and classification consistency. We demonstrate the practical applications of our framework through extensive simulation examples and real-world datasets. Our experimental results highlight the superiority of our proposed framework over existing methods, such as sparse variational Bayesian and generative models, in terms of prediction accuracy and uncertainty quantification. Deep learning Statistics not elsewhere classified Ordinary Differential Equations (ODE) Stochastic Gradient Algorithm Empirical Bayes Deep Neural Networks
40	Statistiniai metodai lietuvių kalbos sudėtingumo analizėje / The statistical methods in the analysis of the Lithuanian language complexity Piaseckienė, Karolina 22 September 2014 The main aim of this work is to apply mathematical and statistical methods to the analysis of the Lithuanian language, identifying and taking into account its peculiarities, heterogeneity, complexity and variability. Mathematics Loglinear model Logistic regression Zipf's law Structural distribution Empirical Bayes approach
