Predicting patient-specific outcome based on machine learning algorithms using genomic data of patients with locally advanced head and neck squamous cell carcinoma

Due to heterogeneous tumour biology, treatment response in locally advanced head and neck squamous cell carcinoma varies widely between patients, resulting in an average 5-year survival rate of about 50%. To adapt treatment to the properties of the tumour, its therapy resistance must be assessed before treatment. In this thesis, machine learning methods were applied to gene expression data to identify novel gene signatures and models that stratify patients into groups with low and high risk of loco-regional tumour recurrence. For patients treated with postoperative radiochemotherapy, a 7-gene signature was developed and successfully validated. Furthermore, it was shown that several models based on different gene signatures may be equally suitable for patient stratification. A method is therefore presented that combines these distinct prognostic models. In addition, gene expression-based biomarkers were transferred between different gene expression measurement methods; in the resulting patient stratifications, signature-based biomarkers showed less variability than single-gene biomarkers.

Abbreviations
Figures
Tables
1 Introduction
2 Biological & Statistical Background
2.1 Head and Neck Squamous Cell Carcinoma
2.1.1 Tumorigenesis
2.1.2 Biomarkers
2.2 Statistics
2.2.1 Survival analysis
2.2.2 Model and data evaluation
2.2.3 Data sampling methods
2.3 Machine learning algorithms
2.3.1 Feature selection algorithms
2.3.2 Prognostic models
2.4 Gene expression measurement methods
2.4.1 Real-time polymerase chain reaction (RT-PCR)
2.4.2 nCounter® gene expression
2.4.3 In situ-synthesized oligonucleotide microarrays
3 Material and methods
3.1 Patient cohorts
3.1.1 Primary radiochemotherapy (pRCTx) cohorts
3.1.2 Postoperative radio(chemo)therapy (PORT-C) cohorts
3.1.3 Clinical endpoints
3.2 Gene expression analyses
3.2.1 HPV status
3.2.2 Immunohistochemical staining
3.2.3 RT-PCR measurements
3.2.4 nCounter® measurements
3.2.5 GeneChip® analyses (only training cohorts)
3.3 Machine learning framework
3.3.1 Pre-processing of gene expression data
3.3.2 Determination of the ensemble gene signature
3.3.3 Expanding the ensemble signature by highly correlated genes
3.3.4 Independent validation and patient stratification
4 Identification of gene expression signatures as prognostic biomarkers
4.1 Hypoxia classification
4.2 nCounter® gene expression-based signatures
4.2.1 Patients treated with primary radiochemotherapy
4.2.2 Clinical features
4.2.3 Signature extension using clinical features
4.2.4 Patients treated with postoperative radiochemotherapy
4.2.5 Signature extension using clinical features – PORT-C
4.3 GeneChip® gene expression-based signatures
4.3.1 Pre-selection
4.3.2 Patients treated with primary radiochemotherapy
4.3.3 Patients treated with postoperative radiochemotherapy
4.4 Combined models for PORT-C
4.4.1 Creation of a consensus model
4.4.2 Consensus model based on 2 models
4.4.3 Consensus model based on more than 2 models
4.4.4 Discussion and summary of model combination
5 Stability of gene expression-based biomarkers
5.1 Reproducibility depending on time of nCounter®
5.2 Comparison of nCounter® and GeneChip® gene expression
5.2.1 Introduction
5.2.2 Correlation analyses
5.2.3 Model and biomarker transfer
6 Conclusion and outlook
Zusammenfassung (summary in German)
Summary
Appendix
A. Supplementary Figures
B. Supplementary Tables
Bibliography
Acknowledgements
Erklärungen (declarations)

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:36498
Date: 09 December 2019
Creators: Schmidt, Stefan
Contributors: Löck, Steffen; Alsner, Jan; Technische Universität Dresden
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: German
Detected Language: English
Type: info:eu-repo/semantics/publishedVersion, doc-type:doctoralThesis, info:eu-repo/semantics/doctoralThesis, doc-type:Text
Rights: info:eu-repo/semantics/openAccess
Relation: 10.1016/j.ctro.2019.03.002, 10.1158/1078-0432.CCR-17-2345
