Global ETD Search

11	Classification of brain tumors in weakly annotated histopathology images with deep learning Hrabovszki, Dávid January 2021 Brain and nervous system tumors were responsible for around 250,000 deaths worldwide in 2020. Correctly identifying different tumors is very important, because treatment options largely depend on the diagnosis. This is an expert task, but recently machine learning, and especially deep learning, models have shown huge potential in tumor classification problems and can provide fast and reliable support for pathologists in the decision-making process. This thesis investigates classification of two brain tumors, glioblastoma multiforme and lower grade glioma, in high-resolution H&E-stained histology images using deep learning. The dataset is publicly available from TCGA, and 220 whole slide images were used in this study. Ground truth labels were only available at the whole slide level, but the whole slide images were too large to be processed by convolutional neural networks. Therefore, patches were extracted from the whole slide images in two sizes and fed into separate networks for training. Preprocessing steps ensured that irrelevant background information was excluded and that the images were stain normalized. The patch-level predictions were then combined to the slide level, and the classification performance was measured on a test set. Experiments were conducted on the usefulness of pre-trained CNN models and data augmentation techniques, and the best method was selected after statistical comparisons. Following the patch-level training, five slide aggregation approaches were studied and compared to build a whole slide classifier model. The best performance was achieved when using small patches (336 x 336 pixels), a pre-trained CNN model without frozen layers, and mirroring data augmentation. The majority voting slide aggregation method resulted in the best whole slide classifier, with 91.7% test accuracy and 100% sensitivity. 
In many comparisons, however, statistical significance could not be shown because of the relatively small size of the test set. medical imaging deep learning glioblastoma TCGA brain tumor weakly supervised histopathology digital pathology histology Medical Image Processing
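The majority voting aggregation mentioned in the abstract above (patch-level predictions combined into one slide-level label) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names, the label strings "GBM" and "LGG", the tie-breaking rule, and the choice of "GBM" as the positive class for sensitivity are all hypothetical.

```python
from collections import Counter


def majority_vote(patch_labels, tie_break="GBM"):
    """Return the most frequent patch label for one slide.

    tie_break is an assumed rule; the abstract does not say how ties are handled.
    """
    if not patch_labels:
        raise ValueError("a slide needs at least one patch prediction")
    counts = Counter(patch_labels)
    top = max(counts.values())
    winners = sorted(label for label, n in counts.items() if n == top)
    if len(winners) == 1:
        return winners[0]
    # Several labels share the top count: fall back to the assumed tie-break label.
    return tie_break if tie_break in winners else winners[0]


def accuracy_and_sensitivity(true_labels, predicted_labels, positive="GBM"):
    """Compute slide-level accuracy and sensitivity (recall of the positive class)."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(1 for t, p in pairs if t == p)
    positives = [(t, p) for t, p in pairs if t == positive]
    if not positives:
        raise ValueError("sensitivity is undefined without positive slides")
    true_positives = sum(1 for t, p in positives if p == positive)
    return correct / len(pairs), true_positives / len(positives)


if __name__ == "__main__":
    # Hypothetical patch predictions for three slides.
    slides = {
        "slide_a": ["GBM", "GBM", "LGG"],
        "slide_b": ["LGG", "LGG", "GBM"],
        "slide_c": ["GBM", "LGG"],
    }
    for name, patches in slides.items():
        print(name, majority_vote(patches))
```

Voting on hard labels discards each patch's confidence; averaging predicted probabilities is a common alternative, though the abstract does not say which of the five studied approaches worked that way.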
12	Identification of Key Biomarkers in Bladder Cancer: Evidence from a Bioinformatics Analysis Zhang, Chuan, Berndt-Paetz, Mandy, Neuhaus, Jochen 18 April 2023 Bladder cancer (BCa) is one of the most common malignancies and has a relatively poor outcome worldwide. However, the molecular mechanisms and processes of BCa development and progression remain poorly understood. Therefore, the present study aimed to identify candidate genes in the carcinogenesis and progression of BCa. Five GEO datasets and TCGA-BLCA datasets were analyzed using the statistical software R, FUNRICH, Cytoscape, and online tools to identify differentially expressed genes (DEGs), to construct protein‒protein interaction networks (PPIs), and to perform functional enrichment and survival analyses. In total, we found 418 DEGs. We found 14 hub genes, and gene ontology (GO) analysis revealed DEG enrichment in networks and pathways related to cell cycle and proliferation, but also in cell movement, receptor signaling, and viral carcinogenesis. Compared with noncancerous tissues, TPM1, CRYAB, and CASQ2 were significantly downregulated in BCa, and the other hub genes were significantly upregulated. Furthermore, MAD2L1 and CASQ2 potentially play a pivotal role in lymph node metastasis. CRYAB and CASQ2 were both significantly correlated with overall survival (OS) and disease-free survival (DFS). The present study highlights a previously unrecognized possible role of CASQ2 in BCa. Furthermore, CRYAB has never been described in BCa, but our study suggests that it may also be a candidate biomarker in BCa. info:eu-repo/classification/ddc/610 ddc:610
13	Biomarker Identification for Breast Cancer Types Using Feature Selection and Explainable AI Methods La Rosa Giraud, David E 01 January 2023 This paper investigates the impact of the LASSO, mRMR, SHAP, and Reinforcement Feature Selection techniques on random forest models for the breast cancer subtype markers ER, HER2, PR, and TN. It also identifies a small subset of biomarkers that could potentially cause the disease and explains them using explainable AI techniques. This matters because in areas such as healthcare it is important to understand why a model makes a specific decision: the output is a diagnosis of an individual, which requires reliable AI. Another contribution is the use of feature selection methods to identify a small subset of biomarkers capable of predicting whether a specific RNA sequence will be positive for one of the cancer labels. The study begins by obtaining a baseline accuracy metric using a random forest model on The Cancer Genome Atlas's breast cancer database. It then explores the effects of feature selection, showing that selecting different numbers of features significantly influences model accuracy, and selects a small number of potential biomarkers that may produce a specific type of breast cancer. Once the biomarkers were selected, the explainable AI techniques SHAP and LIME were applied to the models and provided insight into influential biomarkers and their impact on predictions. The main results are threefold: some biomarkers with high influence over the model prediction are shared between some of the subsets; the LASSO and Reinforcement Feature Selection sets scored the highest accuracy of all sets; and SHAP and LIME gave some insight into how the selected features affect the model's predictions. 
Machine Learning TCGA Explainable AI Biomarkers Breast Cancer Subtypes Feature Selection Artificial Intelligence and Robotics Computer Sciences Genetics
14	Learning Genetic Networks Using Gaussian Graphical Model and Large-Scale Gene Expression Data Zhao, Haitao 25 August 2020 No description available. Bioinformatics Computer Science Gaussian Graphical Model RNA-seq Expression TCGA Genetic Networks
15	Modelagem e implementação de banco de dados clínicos e moleculares de pacientes com câncer e seu uso para identificação de marcadores em câncer de pâncreas / Database design and implementation of clinical and molecular data of cancer patients and its application for biomarker discovery in pancreatic cancer Bertoldi, Ester Risério Matos 20 October 2017 Pancreatic ductal adenocarcinoma (PDAC) is a cancer that is difficult to diagnose early, and its treatment has not advanced significantly over the last decade. Next generation sequencing (NGS) technologies may bring important advances in the search for new diagnostic markers for PDAC and may also contribute to the development of personalised therapies. Databases are powerful tools for integrating, standardizing, and storing large volumes of information. The objective of this study was to design and implement a relational database, CaRDIGAn (Cancer Relational Database for Integration and Genomic Analysis), that integrates publicly available data from NGS experiments on samples of different histopathological types of PDAC with data generated by our group at IQ-USP, facilitating comparison between the two sources. CaRDIGAn's functionality was demonstrated by retrieving clinical and gene expression data of patients from lists of candidate genes that are either associated with mutation in the KRAS oncogene or differentially expressed in tumors, as identified in RNA-seq data generated by our group. The retrieved data were used for survival curve analysis, which identified 11 genes with prognostic potential in pancreatic cancer. This illustrates the tool's potential to facilitate the analysis, organization, and prioritization of new biomarker targets for the molecular diagnosis of PDAC. 
Cancer CaRDIGAn Database Database design Ensembl Entity-relationship model ICGC NGS Pancreatic cancer Pancreatic ductal adenocarcinoma Relational database TCGA
17	Angiogenic gene signature in human pancreatic cancer correlates with TGF-beta and inflammatory transcriptomes Craven, Kelly E. 11 April 2016 Indiana University-Purdue University Indianapolis (IUPUI) / Pancreatic ductal adenocarcinoma (PDAC), which comprises 85% of pancreatic cancers, is the 4th leading cause of cancer death in the United States with a 5-year survival rate of 8%. While human PDACs (hPDACs) are hypovascular, they also overexpress a number of angiogenic growth factors and receptors. Additionally, the use of anti-angiogenic agents in murine models of PDAC leads to reduced tumor volume, tumor spread, and microvessel density (MVD), and improved survival. Nonetheless, clinical trials using anti-angiogenic therapy have been overwhelmingly unsuccessful in hPDAC. On the other hand, pancreatic neuroendocrine tumors (PNETs) account for only 2% of pancreatic tumors, yet they are very vascular and classically angiogenic, respond to anti-angiogenic therapy, and confer a better prognosis than PDAC even in the metastatic setting. In an effort to compare and contrast the angiogenic transcriptomes of these two tumor types, we analyzed RNA-Sequencing (RNA-Seq) data from The Cancer Genome Atlas (TCGA) and found that a pro-angiogenic gene signature is present in 35% of PDACs and that it is mostly distinct from the angiogenic signature present in PNETs. The pro-angiogenic PDAC subgroup also exhibits a transcriptome that reflects active TGF-β signaling, less frequent SMAD4 inactivation than PDACs without the signature, and up-regulation of several pro-inflammatory genes, including members of JAK signaling pathways. Consequently, targeting the TGF-β receptor type-1 kinase with SB505124 and JAK1/2 with ruxolitinib blocks proliferative crosstalk between human pancreatic cancer cells (PCCs) and human endothelial cells (ECs). 
Additionally, treatment of the KRC (oncogenic Kras, homozygous deletion of Rb1) and KPC (oncogenic Kras, mutated Trp53) genetically engineered PDAC mouse models with ruxolitinib suppresses murine PDAC (mPDAC) progression only in the KRC model, which shows superior enrichment and differential expression of the human pro-angiogenic gene signature as compared to KPC tumors. These findings suggest that targeting both TGF-β and JAK signaling in the 35% of PDAC patients whose cancers exhibit a pro-angiogenic gene signature should be explored in a clinical trial. TCGA TGF-beta Angiogenesis Inflammation Mouse model Pancreatic cancer Pancreas -- Diseases Vascular endothelial growth factors Antineoplastic agents Cancer -- Treatment Neovascularization Pancreas -- Tumors
18	Detection and Classification of Cancer and Other Noncommunicable Diseases Using Neural Network Models Gore, Steven Lee 07 1900 Here, we show that training with multiple noncommunicable diseases (NCDs) is both feasible and beneficial to modeling this class of diseases. We first use data from The Cancer Genome Atlas (TCGA) to train a pan-cancer model, and then characterize the information the model has learned about the cancers. In doing this we show that the model has learned concepts that are relevant to the task of cancer classification. We also test the model on datasets derived independently of the TCGA cohort and show that the model is robust to data outside of its training distribution, such as precancerous lesions and metastatic samples. We then use the cancer model as the basis of a transfer learning study in which we retrain it on other, non-cancer NCDs. In doing so we show that NCDs with very different underlying biology contain extractable information relevant to each other, allowing a broader model of NCDs to be developed with existing datasets. We then test the importance of the samples' source tissue in the model and find that the NCD class and tissue source may not be independent in our model. To address this, we use the tissue encodings to create augmented samples. We test how successfully we can use these augmented samples to remove or diminish the importance of tissue source to NCD class by retraining the model. In doing this we make key observations about the nature of concept importance and its usefulness in future neural network explainability efforts. Cancer Neural network VAE generative augmented data methylation variational autoencoder CpG island TCGA schizophrenia asthma arthritis transfer learning TCAV Biology, Bioinformatics Computer Science
