331 |
A Predictive Time-to-Event Modeling Approach with Longitudinal Measurements and Missing Data
Zhu, Lili January 2019
An important practical problem in survival analysis is predicting the time to a future event such as the death or failure of a subject. For medical decision making, it is of great importance to investigate how predictor variables, including repeated measurements of the same subjects, affect the future time-to-event. Such a prediction problem is particularly challenging because the future values of the predictor variables are unknown and may vary dynamically over time. In this dissertation, we consider a predictive approach based on modeling the forward intensity function. To handle the practical difficulty due to missing data in longitudinal measurements, and to accommodate observations at irregularly spaced time points, we propose a smoothed composite likelihood approach for estimation. The forward intensity function approach intrinsically incorporates the future dynamics in the predictor variables that affect the stochastic occurrence of the future event. The proposed framework is therefore advantageous and parsimonious, because it requires no separate modeling step for the stochastic mechanism of the predictor variables. Our theoretical analysis establishes the validity of the forward intensity modeling approach and the smoothed composite likelihood method. To model the parameters as continuous functions of time, we introduce the penalized B-spline method into the proposed approach. Extensive simulations and real-data analyses demonstrate the promising performance of the proposed predictive approach. / Statistics
|
332 |
Improving the accuracy of statistics used in de-identification and model validation (via the concordance statistic) pertaining to time-to-event data
Caetano, Samantha-Jo January 2020
Time-to-event data is very common in medical research. Thus, clinicians and patients need analysis of this data to be accurate, as it is often used to interpret disease screening results, inform treatment decisions, and identify at-risk patient groups (e.g., by sex, race, or gene expression). This thesis tackles three statistical issues pertaining to time-to-event data.
The first issue arose from an Institute for Clinical and Evaluative Sciences lung cancer registry data set, which was de-identified by censoring patients at an earlier date. This resulted in an underestimate of the observed times of censored patients. Five methods were proposed to account for the underestimation incurred by de-identification. A subsequent simulation study was conducted to compare the effectiveness of each method in reducing bias and mean squared error, as well as improving coverage probabilities, for four different Kaplan-Meier (KM) estimates. The simulation results demonstrated that situations with relatively large numbers of censored patients required methodology with larger perturbation. In these scenarios, the fourth proposed method (which perturbed censored times such that they were censored in the final year of study) yielded estimates with the smallest bias and mean squared error and the largest coverage probability. Alternatively, when there were smaller numbers of censored patients, any manipulation of the altered data set worsened the accuracy of the estimates.
The second issue arises when investigating model validation via the concordance (c) statistic. Specifically, the c-statistic is intended for measuring the accuracy of statistical models which assess the risk associated with a binary outcome. The c-statistic estimates the proportion of patient pairs where the patient with a higher predicted risk had experienced the event. The definition of a c-statistic cannot be uniquely extended to time-to-event outcomes, thus many proposals have been made. The second project developed a parametric c-statistic which assumes that the true survival times are exponentially distributed in order to invoke the memoryless property. A simulation study was conducted which included a comparative analysis of two other time-to-event c-statistics. Three different definitions of concordance in the time-to-event setting were compared, as were three different c-statistics. The c-statistic developed by the authors yielded the smallest bias when censoring is present in the data, even when the exponential parametric assumption does not hold. The c-statistic developed by the authors appears to be the most robust to censored data. Thus, it is recommended to use this c-statistic to validate prediction models applied to censored data.
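As an illustration of the binary-outcome definition given above, the following is a minimal sketch and not the parametric c-statistic developed in the thesis. The function name `c_statistic`, the toy data, and the convention of counting tied risks as one half are assumptions made for this example.

```python
from itertools import product


def c_statistic(risks, events):
    """Concordance for a binary outcome.

    Among all (event, non-event) patient pairs, return the share in which
    the patient who experienced the event had the higher predicted risk.
    Tied risks count as half a concordant pair (an assumed convention).
    """
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for risk_case, risk_control in product(cases, controls):
        if risk_case > risk_control:
            score += 1.0
        elif risk_case == risk_control:
            score += 0.5
    return score / (len(cases) * len(controls))


# Hypothetical predicted risks and observed binary outcomes.
risks = [0.9, 0.7, 0.4, 0.2]
events = [1, 0, 1, 0]
print(c_statistic(risks, events))
```

With time-to-event data, censoring makes some pairs incomparable, which is why several competing definitions exist.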
The third project in this thesis developed and assessed the appropriateness of an empirical time-to-event c-statistic that is derived by estimating the survival times of censored patients via the EM algorithm. A simulation study was conducted for various sample sizes, censoring levels and correlation rates. A non-parametric bootstrap was employed and the mean and standard error of the bias of 4 different time-to-event c-statistics were compared, including the empirical EM c-statistic developed by the authors. The newly developed c-statistic yielded the smallest mean bias and standard error in all simulated scenarios. The c-statistic developed by the authors appears to be the most appropriate when estimating concordance of a time-to-event model. Thus, it is recommended to use this c-statistic to validate prediction models applied to censored data. / Thesis / Doctor of Philosophy (PhD)
|
333 |
Automatic Question Answering and Knowledge Discovery from Electronic Health Records
Wang, Ping 25 August 2021
Electronic Health Records (EHR) data contain comprehensive longitudinal patient information, which is usually stored in databases in the form of either multi-relational structured tables or unstructured texts, e.g., clinical notes. EHRs provide a useful resource to assist doctors' decision making; however, they also present many unique challenges that limit the efficient use of this valuable information, such as large data volume, heterogeneous and dynamic information, medical term abbreviations, and noise caused by misspelled words.
This dissertation focuses on the development and evaluation of advanced machine learning algorithms to solve the following research questions: (1) How to seek answers from EHR for clinical activity related questions posed in human language without the assistance of database and natural language processing (NLP) domain experts, (2) How to discover underlying relationships of different events and entities in structured tabular EHRs, and (3) How to predict when a medical event will occur and estimate its probability based on previous medical information of patients.
First, to automatically retrieve answers for natural language questions from the structured tables in EHR, we study the question-to-SQL generation task by generating the corresponding SQL query of the input question. We propose a translation-edit model driven by a language generation module and an editing module for the SQL query generation task. This model helps automatically translate clinical activity related questions to SQL queries, so that the doctors only need to provide their questions in natural language to get the answers they need. We also create a large-scale dataset for question answering on tabular EHR to simulate a more realistic setting. Our performance evaluation shows that the proposed model is effective in handling the unique challenges about clinical terminologies, such as abbreviations and misspelled words.
Second, to automatically identify answers for natural language questions from unstructured clinical notes in EHR, we propose to achieve this goal by querying a knowledge base constructed based on fine-grained document-level expert annotations of clinical records for various NLP tasks. We first create a dataset for clinical knowledge base question answering with two sets: clinical knowledge base and question-answer pairs. An attention-based aspect-level reasoning model is developed and evaluated on the new dataset. Our experimental analysis shows that it is effective in identifying answers and also allows us to analyze the impact of different answer aspects in predicting correct answers.
Third, we focus on discovering underlying relationships of different entities (e.g., patient, disease, medication, and treatment) in tabular EHR, which can be formulated as a link prediction problem in graph domain. We develop a self-supervised learning framework for better representation learning of entities across a large corpus and also consider local contextual information for the down-stream link prediction task. We demonstrate the effectiveness, interpretability, and scalability of the proposed model on the healthcare network built from tabular EHR. It is also successfully applied to solve link prediction problems in a variety of domains, such as e-commerce, social networks, and academic networks.
Finally, to dynamically predict the occurrence of multiple correlated medical events, we formulate the problem as a temporal (multiple time-points) and multi-task learning problem using tensor representation. We propose an algorithm to jointly and dynamically predict several survival problems at each time point and optimize it with the Alternating Direction Method of Multipliers (ADMM) algorithm. The model allows us to consider both the dependencies between different tasks and the correlations of each task at different time points. We evaluate the proposed model on two real-world applications and demonstrate its effectiveness and interpretability. / Doctor of Philosophy / Healthcare is an important part of our lives. Due to recent advances in data collection and storage techniques, a large amount of medical information is generated and stored in Electronic Health Records (EHR). By comprehensively documenting the longitudinal medical history of a large patient cohort, EHR data form a fundamental resource for assisting doctors' decision making, including the optimization of treatments for patients and the selection of patients for clinical trials. However, EHR data also present a number of unique challenges, such as (i) large-scale and dynamic data, (ii) heterogeneity of medical information, and (iii) medical term abbreviation. It is difficult for doctors to effectively utilize such complex data collected in typical clinical practice. Therefore, it is imperative to develop advanced methods that support the efficient use of EHR and further benefit doctors in their clinical decision making.
This dissertation focuses on automatically retrieving useful medical information, analyzing complex relationships of medical entities, and detecting future medical outcomes from EHR data. In order to retrieve information from EHR efficiently, we develop deep learning based algorithms that can automatically answer various clinical questions on structured and unstructured EHR data. These algorithms can help us understand more about the challenges in retrieving information from different data types in EHR. We also build a clinical knowledge graph based on EHR and link the distributed medical information and further perform the link prediction task, which allows us to analyze the complex underlying relationships of various medical entities. In addition, we propose a temporal multi-task survival analysis method to dynamically predict multiple medical events at the same time and identify the most important factors leading to the future medical events. By handling these unique challenges in EHR and developing suitable approaches, we hope to improve the efficiency of information retrieval and predictive modeling in healthcare.
|
334 |
Two Applied Economics Essays: Trade Duration in U.S. Fresh Fruit and Vegetable Imports & Goods-Time Elasticity of Substitution in Household Food Production for SNAP participants and nonparticipants
Rudi, Jeta 08 August 2012
The first study investigates the factors that impact the duration of U.S. fresh fruit and vegetable imports. We employ both survival analysis (Kaplan-Meier estimates and the Cox proportional hazards model) and count data models. Our results indicate that SPS treatment requirements positively impact the duration of trade while new market access has the opposite effect. Other factors typically included in trade duration models (such as GDP, transportation costs, and tariff rates) were also investigated. We also employ a probit model to understand the factors impacting the probability that a country selects into exporting fresh fruits and vegetables to the United States.
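To make the Kaplan-Meier step concrete, here is a minimal sketch of the product-limit estimator applied to trade spells. The data are invented for illustration, and the function name `kaplan_meier` and the coding of a still-active spell as censored are assumptions, not details taken from the study.

```python
def kaplan_meier(durations, observed):
    """Product-limit estimate of the survival function.

    durations: spell lengths (e.g., years a trade relationship lasted).
    observed:  True if the spell was seen to end, False if censored.
    Returns a list of (time, S(time)) at each time with an observed end.
    """
    n_at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Spells that ended at time t (events) and all spells leaving the risk set at t.
        ended = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if ended:
            survival *= 1.0 - ended / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical import spells in years; False marks a spell still active
# at the end of the sample period (right-censored).
spells = [1, 1, 2, 3, 3, 5]
ended = [True, True, False, True, False, True]
for t, s in kaplan_meier(spells, ended):
    print(f"year {t}: estimated share of spells surviving = {s:.3f}")
```

Censored spells contribute to the risk set up to their last observed year but never count as an ended spell.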
The second study estimates the goods-time elasticity of substitution for Food Stamp/SNAP participants versus nonparticipants. We find that the elasticity of substitution for SNAP participants is not statistically different from zero. This indicates that SNAP participants have a Leontief production function in household food production, implying that increasing the amount of SNAP benefits paid to participants will not lead to more food production if the time households dedicate to food preparation remains unchanged. This finding extends the analysis done by Baral, Davis and You (2011) and offers insights for policies related to the SNAP program. / Master of Science
|
335 |
Cure Rate Models with Nonparametric Form of Covariate Effects
Chen, Tianlei 02 June 2015
This thesis focuses on development of spline-based hazard estimation models for cure rate data. Such data can be found in survival studies with long term survivors. Consequently, the population consists of the susceptible and non-susceptible sub-populations with the latter termed as "cured". The modeling of both the cure probability and the hazard function of the susceptible sub-population is of practical interest. Here we propose two smoothing-splines based models falling respectively into the popular classes of two component mixture cure rate models and promotion time cure rate models.
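For orientation, the two model classes named above are commonly written as follows. This is the standard textbook form, not the thesis's specific spline-based specification, and the symbols $\pi(x)$, $S_u$, $\theta(x)$ and $F$ are notation assumed here.

```latex
% Two-component mixture cure rate model:
% \pi(x) is the cure probability, S_u the survival function of the susceptible sub-population.
S_{\mathrm{pop}}(t \mid x) = \pi(x) + \{1 - \pi(x)\}\, S_u(t \mid x)

% Promotion time cure rate model:
% \theta(x) > 0 and F is a proper cumulative distribution function,
% so the cure probability is the limit as t \to \infty.
S_{\mathrm{pop}}(t \mid x) = \exp\{-\theta(x)\, F(t)\},
\qquad
\lim_{t \to \infty} S_{\mathrm{pop}}(t \mid x) = \exp\{-\theta(x)\}
```

In both forms the population survival function levels off at a positive value, which is what distinguishes cure rate data from ordinary survival data.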
Under the framework of the two component mixture cure rate model, Wang, Du and Liang (2012) have developed a nonparametric model where the covariate effects on both the cure probability and the hazard component are estimated by smoothing splines. Our first development falls under the same framework but estimates the hazard component based on the accelerated failure time model, instead of the proportional hazards model in Wang, Du and Liang (2012). Our new model is easier to interpret in practice.
The promotion time cure rate model, motivated from a simplified biological interpretation of cancer metastasis, was first proposed only a few decades ago. Nonetheless, it has quickly become a competitor to the mixture models. Our second development aims to provide a nonparametric alternative to the existing parametric or semiparametric promotion time models. / Ph. D.
|
336 |
Three Essays on Agricultural Trade Policy
Ning, Xin 27 November 2019
This dissertation consists of three essays examining the impacts of Sanitary and Phytosanitary (SPS) Measures on agricultural trade. The first essay estimates the impact of the 2003 Bovine Spongiform Encephalopathy (BSE) outbreak in the US on Japanese beef imports. I develop a source-differentiated demand system of fresh/chilled and frozen beef imports augmented with endogenous smooth transition functions. Results suggest that over one-half of the estimated income, own-price, and cross-price elasticities reached a new regime in the post-BSE period of Japanese beef imports where the competitive relationship and substitutability between US and Australian beef exports changed significantly. The second essay develops a product-line structural gravity model to estimate the trade flow effects of SPS measures that have been flagged as specific trade concerns in the World Trade Organization's (WTO's) SPS Committee meetings for the top 30 agricultural trading countries covering four major product sectors. Our findings are striking and call attention to the need for a deeper understanding of the impacts of SPS measures on WTO members' agricultural trade. Results show that the trade effects of SPS trade concern measures reduce exporters' agricultural trade by 67%, on average, during periods in which concerns were active. Significant heterogeneity in the trade effect of SPS measures exists with average estimated ad valorem equivalent tariffs ranging from 33% to 106%. The AVE effect of SPS concern measures maintained by the US is estimated at 42%, less than a half (a third) of the AVE effects of SPS concern measures imposed by the European Union (China). China's restrictions on Avian Influenza and ractopamine restrictions in pork exports are estimated to be the most prohibitive, causing an AVE effect of 120.3% and 88.9%, respectively. 
The third essay develops a discrete-time duration model to examine the extent to which these SPS concern measures affect the hazard rate of US agri-food exports during the 1995-2016 period. Results show that SPS concern measures raise the hazard rate of US agri-food exports by a range of 2.1%~15.3%, causing the predicted hazard rate to increase from 21.8% to a range of 23.6%~27.9%. This effect is heterogeneous across different agricultural sectors, with the most substantial effects occurring in US exports of meat, fruits, and vegetables. / Doctor of Philosophy / This dissertation consists of three essays on the examination of Sanitary and Phytosanitary (SPS) Measures and their impacts on agricultural trade. The first essay estimates the impact of the US 2003 Bovine Spongiform Encephalopathy (BSE) outbreaks on Japanese beef imports. Using a source-differentiated demand system of fresh/chilled and frozen beef imports embedded with endogenous smooth transition functions, we find that over one-half of the estimated income, own-price, and cross-price elasticities have changed remarkably, causing the Japanese beef import market to reach a new regime in the post-BSE period where the substitution and/or competition relationships between the US and Australia have changed. The second essay develops a product-line structural gravity model to estimate the trade effects of SPS measures flagged as concerns in the WTO's SPS Committee meetings for the top 30 agricultural trading countries covering four major product sectors. Results show that the trade effects of SPS concern measures are negative and significant, with the average estimated AVE tariffs ranging 33%~106%. The AVE effect of SPS concern measures maintained by the US is estimated to be 42%, less than a half (a third) of the AVE effects of SPS concern measures imposed by the European Union (China). 
China's restrictions on Avian Influenza and various ractopamine restrictions in the production and export of pork products are estimated to be the most prohibitive, causing an AVE effect of 120.3% and 88.9%, respectively. The third essay applies a discrete-time duration model to examine the extent to which SPS concern measures affect the hazard rate of US agri-food exports in 1995-2016. Results show that SPS concern measures raise the hazard rate of US agri-food exports by a range of 2.1%~15.3%, causing the predicted hazard rate to increase from 21.8% to a range of 23.6%~27.9%. This effect is heterogeneous across different agricultural sectors, with the most substantial effects occurring in US exports of meat, fruits, and vegetables.
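In a discrete-time duration model, the hazard rate is the probability that an export spell ends in a given period, conditional on having survived to that period. The sketch below shows how a per-period hazard translates into the probability that a spell is still active after several periods. It uses the 21.8% and 27.9% hazard rates quoted above, but holding the hazard constant across years is a simplifying assumption made only for this example, and the function name is hypothetical.

```python
def survival_from_hazards(hazards):
    """Probability that a spell is still active after all listed periods.

    hazards: per-period conditional exit probabilities h_1, ..., h_T.
    Discrete-time identity: S(T) = product over k of (1 - h_k).
    """
    survival = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        survival *= 1.0 - h
    return survival


years = 3
baseline = survival_from_hazards([0.218] * years)      # predicted hazard without SPS concerns
with_concern = survival_from_hazards([0.279] * years)  # upper end of the reported range
print(f"share of spells active after {years} years: "
      f"{baseline:.3f} (baseline) vs {with_concern:.3f} (with SPS concern)")
```

Because the survival probability is a product over periods, a modest rise in the per-period hazard compounds into a larger gap in long-run survival.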
|
337 |
A study of the effects of winter climate and atmospheric icing on high-speed passenger trains
Granlöf, Markus January 2020
Harsh winter climate causes various problems for both the public and private sectors in Sweden, especially in the northern part, and the railway industry is no exception. This master thesis project investigates the effects of the winter climate and a phenomenon called atmospheric icing on the performance of trains in the Botnia-Atlantica region. The investigation was done with data over a short period, January-February 2017, with simulated weather data from the Weather Research and Forecasting model, which was compared with the period October-December 2016. The investigation only included high-speed trains.
The trains have been analysed based on two different performance measurements: the cumulative delay, which is the increment in delay over a section, and the current delay, which is the current delay compared to the schedule. Cumulative delays are investigated with survival analysis, and the current delay is investigated with a multi-state Markov model.
The results show that the weather could have an effect on the trains' performance: the survival analysis detected a connection between the weather and cumulative delays. The Markov model also showed a connection between the weather and delayed trains, including that the presence of atmospheric icing had a negative effect on remaining in a state of non-delay.
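The relationship between the two performance measurements can be sketched in a few lines. The station readings below are invented, and the function name `cumulative_delays` is hypothetical. The sketch only illustrates the definitions given above, in which the cumulative delay over a section is the change in current delay between its two ends.

```python
def cumulative_delays(current_delays):
    """Increment in delay over each section between consecutive stations.

    current_delays: minutes behind schedule at successive stations.
    A negative value means the train recovered time on that section.
    """
    return [after - before
            for before, after in zip(current_delays, current_delays[1:])]


# Hypothetical current delay (minutes) at four consecutive stations.
current = [0, 2, 5, 4]
print(cumulative_delays(current))  # [2, 3, -1]
```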
|
338 |
Assessment of the Potential of Concentrating Cancer Care in Hospitals With Certification Through Survival Analysis
Bierbaum, Veronika, Schmitt, Jochen, Klinkhammer-Schalke, Monika, Schoffer, Olaf 11 September 2024
Background: Certification programs seek to improve the quality of complex interdisciplinary models of care such as cancer treatment through structuring the process of care in accordance with evidence-based guidelines. In Germany, the German Cancer Society (Deutsche Krebsgesellschaft, DKG) provides a certification program for cancer care that covers more than one thousand centers. A recent retrospective cohort study, using a large nationwide data set based on data from a statutory health insurance and selected clinical cancer registries, showed a survival benefit for cancer patients who received their initial treatment in hospitals certified by the DKG. Here, we deduce two absolute measures from the relative benefit in survival, with the aim of quantifying this benefit if all patients had been treated in a certified center.
Methods: The WiZen study analysed survival of adult patients insured by the AOK with a cancer diagnosis between 2009 and 2017 in certified hospitals vs. non-certified hospitals. Besides Kaplan-Meier estimators, Cox regression with shared frailty was used for 11 types of cancer in total, adjusting for patient-specific information such as demographic characteristics and comorbidities as well as hospital characteristics and temporal trend. Based on this regression, we predict adjusted survival curves that directly address the certification effect. From the adjusted survival curves, we calculated years of life lost (YLL) and number needed to treat (NNT), along with a difference in deaths 5 years after diagnosis.
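The step from adjusted survival curves to a number needed to treat can be illustrated with the usual definition, NNT = 1 / (absolute difference in survival probability at the chosen time point). The survival probabilities and patient count below are hypothetical placeholders and not results from the WiZen study, and the function names are assumptions made for this sketch.

```python
def number_needed_to_treat(surv_certified, surv_noncertified):
    """NNT at a fixed time point from two adjusted survival probabilities."""
    benefit = surv_certified - surv_noncertified
    if benefit <= 0:
        raise ValueError("NNT is only defined here for a positive survival benefit")
    return 1.0 / benefit


def avoidable_deaths(n_patients, surv_certified, surv_noncertified):
    """Expected deaths avoided if n_patients had the certified-hospital survival."""
    return n_patients * (surv_certified - surv_noncertified)


# Hypothetical 5-year adjusted survival probabilities (not study results).
s_cert, s_noncert = 0.60, 0.55
print(f"NNT: {number_needed_to_treat(s_cert, s_noncert):.1f}")
print(f"avoidable deaths per 1000 patients: {avoidable_deaths(1000, s_cert, s_noncert):.1f}")
```

Unlike a hazard ratio, both quantities are on an absolute scale, so they depend on the baseline survival as well as on the relative effect.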
Results: Based on our estimate for the 537,396 patients included in the WiZen study who were treated in a non-certified hospital, corresponding to 68.7% of the study population, we find a potential of 33,243 YLL per year in Germany based on the size of the German population as of 2017. The potential to avoid deaths 5 years from diagnosis totals 4,729 per year in Germany.
Conclusion: While Cox regression is an important tool to evaluate the benefit that arises from variables with a potential impact on survival such as certification, its direct results are not well suited to quantify this benefit for decision makers in health care. The estimated years of life lost and the number of deaths that could have been avoided 5 years from diagnosis avoid misinterpretation of the hazard ratios commonly used in survival analysis and should help to inform key stakeholders in health care without specialist background knowledge in statistics. Our measures, directly addressing the effect of certification, can furthermore be used as a starting point for health-economic calculations. Steering the care of cancer patients primarily to certified hospitals would have a high potential to improve outcomes. / Background:
Certification programs aim to improve the quality of complex interdisciplinary models of care such as cancer treatment by structuring the care process according to evidence-based guidelines. In Germany, the German Cancer Society (Deutsche Krebsgesellschaft, DKG) offers a certification program for cancer care that covers more than one thousand centers. A recent retrospective cohort study showed, using a large nationwide data set based on data from a statutory health insurer and selected clinical cancer registries, that there is a survival benefit for cancer patients who received their initial treatment in DKG-certified hospitals. Here, we derive two absolute measures from the relative survival benefit. The aim is to quantify the potential of this benefit under the assumption that all patients had been treated in a certified center.
Methods:
The WiZen study analysed the survival of adult AOK-insured patients with a cancer diagnosis between 2009 and 2017 in certified hospitals compared with non-certified hospitals. Besides Kaplan-Meier estimators, Cox regression with shared frailty was used for a total of 11 cancer types, adjusted for patient-specific information such as demographic characteristics and comorbidities as well as hospital characteristics and temporal trend. Based on this regression, we compute adjusted survival curves that directly account for the certification effect. From these adjusted survival curves, the years of life lost (YLL) are calculated. Also calculated are the number needed to treat (NNT) for survival 5 years after diagnosis and the resulting number of avoidable deaths.
Results:
Based on our estimate for the 537,396 patients who were treated in a non-certified hospital in the WiZen study, corresponding to 68.7% of the study population, we find a potential of 33,243 YLL per year in Germany, calculated on the basis of the German population in 2017. The potential to avoid deaths 5 years after diagnosis amounts to 4,729 cases per year in Germany.
Conclusion:
Although Cox regression is an important tool for assessing the benefit that arises from adjusting for variables with a potential influence on survival, such as certification, its direct results are not well suited to quantifying this benefit for decision makers in health care. The estimated years of life lost and the number of deaths 5 years after diagnosis that could have been avoided guard against misinterpretation of the hazard ratios commonly used in survival analysis and can help to present results to key stakeholders in health care who have no specialist background knowledge in statistics. The measures presented here, which relate directly to the effect of certification, can furthermore be used as a starting point for health-economic calculations. Steering cancer patients to certified hospitals would have a high potential to improve cancer survival.
|
339 |
Do seasoned offerings improve the performance of issuing firms? Evidence from China
Zhang, D., Wu, Yuliang, Ye, Q., Liu, J. 2018 August 1923
Yes / This study provides new evidence that the performance of issuing firms varies by issue type, based on survival analysis methods. Our non-parametric results show that firms raising capital through rights issues, and notably through cash offers, experience a greater risk of delisting following issuance, as compared to those issuing convertible bonds. Our Cox model analyses demonstrate that plain equity issues, in contrast to convertible issues, are subject to different degrees of regulatory discipline, obligations and incentives in shaping survival trajectory. Further, high ownership concentration, agency issues intrinsic to equity offerings, weak shareholders' protection, and corporate ownership and governance and corporate control development at the time of an offer markedly influence post-issue survival. Plain equity issues, notably cash offers, are strongly linked with the agency costs of free cash flows. A large and truly independent board, allied to a separation of CEO and chairman powers, acts as a primary restraint on managers' self-interested behaviour. Such a cohesive governance mechanism can restrain rent-seeking in the firm's fundraising initiative. These observations hold when we take into account information available before an issue, at the time of an issue, and after an issue, demonstrating the robustness of our findings.
|
340 |
Adoption of Credit-Hour Reductions in Master of Divinity Programs at the Association of Theological Schools Member Institutions: An Event History Analysis
McKanna, Nathan Jay 12 1900
Seminaries in the United States have for more than two centuries sought to equip ministerial leaders for service within the community of faith. And yet these institutions have traditionally been the focus of very little quantitative research. This lack of data is particularly noteworthy given the existential crises many seminaries currently face, especially regarding their flagship Master of Divinity (MDiv) programs. Among seminary leadership, a common response to declining MDiv enrollment has been to decrease the length of the program, which historically required at least 90 credit hours. The purpose of this quantitative study was to explore change at the Association of Theological Schools member institutions (AMIs) from 2000 to 2019 through the lens of these credit-hour reductions. Longitudinal data from 113 AMIs were analyzed to examine the relationship between a variety of financial, enrollment, and institutional characteristics and the likelihood that an AMI would reduce its required MDiv credit hours. Results from an event history analysis revealed that, all else being equal, experiencing an increase in total revenues reduced an AMI's likelihood of making a reduction, while being a middle-age institution (founded 1870–1959) and having a higher percentage of peer institutions that made a change increased the likelihood of making a reduction. These findings, which fit well with existing theory on resource dependence and institutionalization, provide seminary leaders with objective data by which they can better understand the financial and cultural pressures impacting change at their institutions.
|