Statistical methods for the study of etiologic heterogeneity

Traditionally, cancer epidemiologists have investigated the causes of disease under the premise that patients with disease at a given site can be treated as a single entity. Risk factors are then identified through case-control or cohort studies of the disease as a whole. However, with the rise of molecular and genomic profiling, biologic subtypes have increasingly been identified in recent years. Once subtypes are known, it is natural to ask whether they share a common etiology or instead arise from distinct sets of risk factors, a concept known as etiologic heterogeneity. This dissertation evaluates methods for the study of etiologic heterogeneity in the context of cancer research, with a focus on methods for case-control studies. First, a number of existing regression-based methods for the study of etiologic heterogeneity among pre-defined subtypes are compared using a data example and simulation studies. This work found that a standard polytomous logistic regression approach performs at least as well as more complex methods and is easy to implement in standard software. Next, simulation studies investigate the statistical properties of an approach that combines the search for the most etiologically distinct subtype solution from high-dimensional tumor marker data with estimation of risk factor effects. The method performs well when appropriate up-front selection of tumor markers is performed, even in the presence of confounding structure or high-dimensional noise. Finally, an application to a breast cancer case-control study demonstrates the usefulness of the novel clustering approach for identifying a more risk-heterogeneous class solution in breast cancer, based on a panel of gene expression data and known risk factors.

Identifier: oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/d8-22xy-kf52
Date: January 2019
Creators: Zabor, Emily Craig
Source Sets: Columbia University
Language: English
Detected Language: English
Type: Theses