  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Enhancing preprocessing and clustering of single-cell RNA sequencing data

Wang, Zhe 04 October 2021
Single-cell RNA sequencing (scRNA-seq) is the leading technique for characterizing cellular heterogeneity in biological samples. Various scRNA-seq protocols have been developed that can measure the transcriptome from thousands of cells in a single experiment. With these methods readily available, the ability to transform raw data into biological understanding of complex systems is now a rate-limiting step. In this dissertation, I introduce novel computational software and tools which enhance preprocessing and clustering of scRNA-seq data and evaluate their performance compared to existing methods. First, I present scruff, an R/Bioconductor package that preprocesses data generated from scRNA-seq protocols including CEL-Seq or CEL-Seq2 and reports comprehensive data quality metrics and visualizations. scruff rapidly demultiplexes, aligns, and counts the reads mapped to genomic features with deduplication of unique molecular identifier (UMI) tags and provides novel and extensive functions to visualize both pre- and post-alignment data quality metrics for cells from multiple experiments. Second, I present Celda, a novel Bayesian hierarchical model that can perform simultaneous co-clustering of genes into transcriptional modules and cells into subpopulations for scRNA-seq data. Celda identified novel cell subpopulations in a publicly available peripheral blood mononuclear cell (PBMC) dataset and outperformed a PCA-based approach for gene clustering on simulated data. Third, I extend the application of Celda by developing a multimodal clustering method that utilizes both mRNA and protein expression information generated from single-cell sequencing datasets with multiple modalities, and demonstrate that Celda multimodal clustering captured meaningful biological patterns which are missed by transcriptome- or protein-only clustering methods. 
Collectively, this work addresses limitations present in the computational analyses of scRNA-seq data by providing novel methods and solutions that enhance scRNA-seq data preprocessing and clustering.
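The UMI deduplication step mentioned in this abstract can be illustrated with a minimal sketch. It assumes reads have already been demultiplexed and assigned to genes, and it collapses reads that share the same cell barcode, gene, and UMI into a single molecule. The record layout, the example values, and the `count_umis` helper are hypothetical; this is not scruff's actual implementation, which is an R/Bioconductor package.

```python
from collections import defaultdict

def count_umis(reads):
    """Count distinct UMIs per (cell barcode, gene) pair.

    `reads` is an iterable of (cell, gene, umi) tuples; this layout is an
    assumption made for illustration only.
    """
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        # Reads with an identical (cell, gene, UMI) are treated as PCR duplicates.
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}

# Toy input: the second read duplicates the first.
reads = [
    ("cell_1", "GeneA", "AACGT"),
    ("cell_1", "GeneA", "AACGT"),
    ("cell_1", "GeneA", "TTGCA"),
    ("cell_2", "GeneA", "AACGT"),
    ("cell_2", "GeneB", "GGATC"),
]
counts = count_umis(reads)
```

This sketch matches UMIs exactly; it does not attempt to correct sequencing errors within UMI tags.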
2

Inferring the Origin of Cells at the Maternal-Fetal Interface (FEMO)

Varley, Thomas 23 May 2022
No description available.
3

Celltyper: A Single-Cell Sequencing Marker Gene Tool Suite

Paisley, Brianna Meadow 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Single-cell RNA-sequencing (scRNA-seq) has enabled researchers to study interindividual cellular heterogeneity, to explore disease impact on cellular composition of tissue, and to identify novel cell subtypes. However, a major challenge in scRNA-seq analysis is to identify the cell type of individual cells. Accurate cell type identification is crucial for any scRNA-seq analysis to be valid as incorrect cell type assignment will reduce statistical robustness and may lead to incorrect biological conclusions. Therefore, accurate and comprehensive cell type assignment is necessary for reliable biological insights into scRNA-seq datasets. With over 200 distinct cell types in humans alone, the concept of cell identity is large. Even within the same cell type there exists heterogeneity due to cell cycle phase, cell state, cell subtypes, cell health and the tissue microenvironment. This makes cell type classification a complicated biological problem requiring bioinformatics. One approach to classify cell type identity is using marker genes. Marker genes are genes specific for one or a few cell types. When coupled with bioinformatic methods, marker genes show promise of improving cell type classification. However, current scRNA-seq classification methods and databases use marker genes that are non-specific across sources, samples, and/or species leading to bias and errors. Furthermore, many existing tools require manual intervention by the user to provide training datasets or the expected number and name of cell types, which can introduce selection bias. The selection bias negatively impacts the accuracy of cell type classification methods as the model cannot extrapolate outside of the user inputs even when it is biologically meaningful to do so. In this dissertation I developed CellTypeR, a suite of tools to explore the biology governing cell identity in a “normal” state for humans and mice. 
The work presented here accomplishes three aims: 1. Develop an ontology standardized database of published marker gene literature; 2. Develop and apply a marker gene classification algorithm; and 3. Create user interface and input data structure for scRNA-seq cell type prediction.
4

Building an analytical framework for quality control and meta-analysis of single-cell data to understand heterogeneity in lung cancer cells

Hong, Rui 20 March 2024
Single-cell RNA sequencing (scRNA-seq) has been a powerful technique for characterizing transcriptional heterogeneity related to tumor development and disease pathogenesis. Despite the advances of technology, there is still a lack of software to systematically and easily assess the quality and different types of artifacts present in scRNA-seq data and a statistical framework for understanding heterogeneity in the gene programs of cancer cells. In this dissertation, I first introduced novel computational software to enhance and streamline the process of quality control for scRNA-seq data called SCTK-QC. SCTK-QC is a pipeline that performs comprehensive quality control (QC) of scRNA-seq data and runs a multitude of tools to assess various types of noise present in scRNA-seq data as well as quantification of general QC metrics. These metrics are displayed in a user-friendly HTML report and the pipeline has been implemented in two cloud-based platforms. Most scRNA-seq studies only profiled a small number of tumors and provided a narrow view of the transcriptome in tumor tissue. Next, I developed a novel framework to perform a large-scale meta-analysis of cancer cells from 12 studies with scRNA-seq data from patients with non-small-cell lung cancer (NSCLC). I discovered interpretable gene co-expression modules with celda and demonstrated that the activity of gene modules accounted for both inter- and intra-tumor heterogeneity of NSCLC samples. Furthermore, I used CaDRa to determine that the levels of some gene modules were significantly associated with combinations of underlying genetic alterations. I also showed that other gene modules are associated with immune cell signatures and may be important for communication with the cancer cells and the immune microenvironment. Finally, I presented a novel computational method to study the association between copy number variation (CNV) and gene expression at the single-cell level. 
The diversity of the CNV profile was identified in tumor subclones within each sample and I discovered cis and trans gene signatures which have expression values associated with specific somatic CNV status. This study helped us prioritize the potential cancer driver genes within each CNV region. Collectively, this work addressed the limitation in the quality control of scRNA-seq data and provided insights for understanding the heterogeneity of NSCLC samples.
5

Structured Bayesian methods for splicing analysis in RNA-seq data

Huang, Yuanhua January 2018
In most eukaryotes, alternative splicing is an important regulatory mechanism of gene expression that results in a single gene coding for multiple protein isoforms, thus greatly increasing the diversity of the proteome. RNA-seq is widely used for genome-wide splicing isoform quantification, and several effective and powerful methods have been developed for splicing analysis with RNA-seq data. However, quantification remains problematic for genes with low coverage or a large number of isoforms. These difficulties may in principle be ameliorated by exploiting correlations encoded in the structured data sources. This thesis contributes to the development of Bayesian methods for splicing analysis by leveraging additional information in multiple datasets with structured prior distributions. First, we developed DICEseq, the first isoform quantification method tailored to time-series RNA-seq experiments. DICEseq explicitly models the correlations between experiments at different time points to aid the quantification of isoforms across experiments. Numerical experiments on both simulated and real datasets show that DICEseq yields more accurate results than state-of-the-art methods, an advantage that can become considerable at low coverage levels. Furthermore, DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and depth of sequencing, frequently an important choice when planning experiments. Second, we developed BRIE (Bayesian Regression for Isoform Estimation), a Bayesian hierarchical model which resolves the difficulties in splicing analysis in single-cell RNA-seq (scRNA-seq) data by learning an informative prior distribution from sequence features. This method combines quantification and imputation for splicing analysis in a Bayesian framework, which is particularly useful in scRNA-seq data due to its extremely low coverage and high technical noise. 
We validated BRIE on several scRNA-seq data sets, showing that BRIE yields reproducible estimates of exon inclusion ratios in single cells. Third, we provided an effective tool that uses the Bayes factor to sensitively detect differential splicing between different single cells. When applying BRIE to a few real datasets, we found interesting heterogeneity patterns in splicing events across cell populations, for example alternative exons in DNMT3B. In summary, this thesis proposes structured Bayesian methods to integrate multiple datasets to improve splicing analysis and study its biological functions.
6

Network analysis of human vitiligo scRNA-seq data reveals complex mechanisms of immune activation

Gellatly, Kyle 22 November 2021
The advent of scRNA-seq has rapidly advanced our understanding of complex systems by enabling the researcher to look at the full transcriptional profile within each cell, with the potential to reveal intercellular communications within a tissue. To map these communications, I created SignallingSingleCell, an R package that provides an end-to-end approach for the analysis of scRNA-seq data, with a particular focus on building ligand and receptor signaling networks. Using these powerful techniques, we sought to dissect the heterogeneous population of cells recently reported within the BMDC culture system. From this data we were able to determine the cell type composition, identify the different myeloid responses to similar stimuli, and unify recent conflicting studies about the populations within this system. We then applied these tools to study vitiligo, an autoimmune disease of the skin, to answer fundamental questions about the initiation and progression of disease. We found signatures of increased antigen presentation through MHC-I, loss of immunotolerance cytokines such as TGFB1 and IL-10, and changes in the complex chemokine circuits that influence T cell localization, including an essential role for CCR5 in Treg function. In order to identify and characterize the autoreactive T cells that are responsible for the targeted destruction of melanocytes, we then paired scRNA-seq with TCR-seq and MHC-II complexes loaded with melanocyte antigen. From this data we contrast the transcriptional state of melanocyte-specific T cells to bystanders found within the skin and circulation.
7

Utilizing unlabeled data in cell type identification : A semi-supervised learning approach to classification

Quast, Thijs January 2020
Recent research in bioinformatics has presented multiple cell type identification methodologies using single-cell RNA sequencing (scRNA-seq) data. However, a consensus on which cell typing methodology consistently demonstrates superior performance remains absent. Additionally, very few studies approach cell type identification through a semi-supervised learning study, whereby the information in unlabeled data is leveraged to train an enhanced classifier. This paper presents cell annotation methodologies through self-learning and graph-based semi-supervised learning, in both raw count scRNA-seq data as well as in a latent embedding. I find that a self-learning framework enhances performance compared to a solely supervised learning classifier. Additionally, modelling on the latent data representations consistently outperforms modelling on the original data. The results show an overall accuracy of 96.12%, whereas additional models achieve an average precision rate of 95.12% and an average recall rate of 94.40%. The semi-supervised learning approaches in this thesis compare favourably to scANVI in terms of accuracy, average precision rate, average recall rate and average f1-score. Moreover, results for alternative scenarios, in which cell types among training and test data do not perfectly overlap, are reported in this thesis.
8

Reconstruction of Cell and Tissue-specific Immune-protein Interactomes Using Single-cell RNA Sequencing Data

Althobaiti, Atheer 04 1900
Protein molecules and their interactions via protein-protein interactions (PPIs) are at the core of cellular functions. While such global PPI networks have been useful for analyzing gene function and effects of genetic variants, they do not resolve tissue and cell-type-specific interactions. Here we leverage recent advances in single-cell RNA sequencing (scRNA-seq) to reconstruct cell-type-specific PPI networks across different tissues to enable a context-sensitive analysis of immune cells’ gene-protein pathways. Targeting B cells, T cells, and macrophage cells as a proof-of-principle, we used scRNA-seq data across different tissues from the Tabula Muris mouse consortium. We mapped the protein-coding DEGs to a protein-protein interaction network database (STRING v.11). Topological and global similarity analysis of the networks revealed distinct properties between tissues highlighting tissue-specific behaviors for each cell type. For example, we found that degree and clustering coefficient distributions were tissue-specific. Different cell types and tissues displayed specific characteristics, and in particular, the splenic PPI networks were different compared to other analyzed tissues for all the immune cell types examined. For example, the pairwise comparison of the Jaccard index for node similarity and the Mantel test correlation analysis showed that the spleen’s nodes and PPI networks are more different than those of any other tissue for each cell type examined. The physiological and anatomical properties that distinguish the spleen from other examined tissues might explain why the splenic PPI networks tend to be less similar compared to other tissues. The cell-type-specific network analyses using the different distance measures between the adjacency matrices on the hub nodes such as Euclidean, Manhattan, Jaccard, and Hamming distances showed a macrophage-specific behavior not observed in B cells and T cells, confirming their lineage differences. 
Finally, we explored the rewiring of selected hub nodes and transcription factors in the PPI networks along with their biological enrichments to validate our observations. The suggested biological validity of our results confirms the relevance of data-driven reconstruction of these context-sensitive networks using more advanced network inference algorithms. In conclusion, scRNA-seq enables the reconstruction of global unspecific PPI networks into cell and tissue-specific networks, thereby providing an increased resolution of the biological context.
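The node-similarity and adjacency-matrix comparisons named in this abstract can be illustrated with a minimal sketch. It assumes two small, made-up undirected networks given as edge lists; the edge lists and the helpers `jaccard_index`, `adjacency`, and `distances` are hypothetical and are not taken from the thesis.

```python
from math import sqrt

def jaccard_index(a, b):
    """Jaccard index |A & B| / |A | B| of two sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def adjacency(edges, nodes):
    """Binary symmetric adjacency matrix over a fixed node order."""
    index = {n: i for i, n in enumerate(nodes)}
    m = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        m[index[u]][index[v]] = 1
        m[index[v]][index[u]] = 1
    return m

def distances(m1, m2):
    """Distances between two equally sized binary adjacency matrices."""
    x = [v for row in m1 for v in row]
    y = [v for row in m2 for v in row]
    diff = [abs(p - q) for p, q in zip(x, y)]
    both = sum(1 for p, q in zip(x, y) if p and q)
    either = sum(1 for p, q in zip(x, y) if p or q)
    return {
        "euclidean": sqrt(sum(d * d for d in diff)),
        "manhattan": sum(diff),
        "hamming": sum(1 for d in diff if d) / len(diff),  # fraction of differing entries
        "jaccard": 1 - both / either if either else 0.0,   # Jaccard distance
    }

# Toy networks (illustrative only, not data from the thesis).
spleen_edges = [("Cd19", "Cd79a"), ("Cd79a", "Cd79b"), ("Cd19", "Ptprc")]
lung_edges = [("Cd19", "Cd79a"), ("Cd79a", "Ptprc")]

spleen_nodes = {n for edge in spleen_edges for n in edge}
lung_nodes = {n for edge in lung_edges for n in edge}
node_similarity = jaccard_index(spleen_nodes, lung_nodes)

all_nodes = sorted(spleen_nodes | lung_nodes)
result = distances(adjacency(spleen_edges, all_nodes),
                   adjacency(lung_edges, all_nodes))
```

Both matrices are built over the union of nodes so that their entries line up position by position before any distance is computed.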
9

Network inference from sparse single-cell transcriptomics data: Exploring, exploiting, and evaluating the single-cell toolbox

Steinheuer, Lisa Maria 04 April 2022
Large-scale transcriptomics data studies revolutionised the fields of systems biology and medicine, allowing to generate deeper mechanistic insights into biological pathways and molecular functions. However, conventional bulk RNA-sequencing results in the analysis of an averaged signal of many input cells, which are homogenised during the experimental procedure. Hence, those insights represent only a coarse-grained picture, potentially missing information from rare or unidentified cell types. Allowing for an unprecedented level of resolution, single-cell transcriptomics may help to identify and characterise new cell types, unravel developmental trajectories, and facilitate inference of cell type-specific networks. Besides all these tempting promises, there is one main limitation that currently hampers many downstream tasks: single-cell RNA-sequencing data is characterised by a high degree of sparsity. Due to this limitation, no reliable network inference tools allowed to disentangle the hidden information in the single-cell data. Single-cell correlation networks likely hold previously masked information and could allow inferring new insights into cell type-specific networks. To harness the potential of single-cell transcriptomics data, this dissertation sought to evaluate the influence of data dropout on network inference and how this might be alleviated. However, two premisses must be met to fulfil the promise of cell type-specific networks: (I) cell type annotation and (II) reliable network inference. Since any experimentally generated scRNA-seq data is associated with an unknown degree of dropout, a benchmarking framework was set up using a synthetic gold data set, which was subsequently affected with different defined degrees of dropout. Aiming to desparsify the dropout-afflicted data, the influence of various imputations tools on the network structure was further evaluated. 
The results highlighted that for moderate dropout levels, a deep count autoencoder (DCA) was able to outperform the other tools and the unimputed data. To fulfil the premiss of cell type annotation, the impact of data imputation on cell-cell correlations was investigated using a human retina organoid data set. The results highlighted that no imputation tool intervened with cell cluster annotation. Based on the encouraging results of the benchmarking analysis, a window of opportunity was identified, which allowed for meaningful network inference from imputed single-cell RNA-seq data. Therefore, the inference of cell type-specific networks subsequent to DCA-imputation was evaluated in a human retina organoid data set. To understand the differences and commonalities of cell type-specific networks, those were analysed for cones and rods, two closely related photoreceptor cell types of the retina. Comparing the importance of marker genes for rods and cones between their respective cell type-specific networks exhibited that these genes were of high importance, i.e. had hub-gene-like properties in one module of the corresponding network but were of less importance in the opposing network. Furthermore, it was analysed how many hub genes in general preserved their status across cell type-specific networks and whether they associate with similar or diverging sub-networks. While a set of preserved hub genes was identified, a few were linked to completely different network structures. One candidate was EIF4EBP1, a eukaryotic translation initiation factor binding protein, which is associated with a retinal pathology called age-related macular degeneration (AMD). These results suggest that given very defined prerequisites, data imputation via DCA can indeed facilitate cell type-specific network inference, delivering promising biological insights. 
Referring back to AMD, a major cause for the loss of central vision in patients older than 65, neither the defined mechanisms of pathogenesis nor treatment options are at hand. However, light can be shed on this disease through the employment of organoid model systems since they resemble the in vivo organ composition while reducing its complexity and ethical concerns. Therefore, a recently developed human retina organoid system (HRO) was investigated using the single-cell toolbox to evaluate whether it provides a useful base to study the defined effects on the onset and progression of AMD in the future. In particular, different workflows for a robust and in-depth annotation of cell types were used, including literature-based and transfer learning approaches. These allowed to state that the organoid system may reproduce hallmarks of a more central retina, which is an important determinant of AMD pathogenesis. Also, using trajectory analysis, it could be detected that the organoids in part reproduce major developmental hallmarks of the retina, but that different HRO samples exhibited developmental differences that point at different degrees of maturation. Altogether, this analysis allowed to deeply characterise a human retinal organoid system, which revealed in vivo-like outcomes and features as pinpointing discrepancies. These results could be used to refine culture conditions during the organoid differentiation to optimise its utility as a disease model. In summary, this dissertation describes a workflow that, in contrast to the current state of the art in the literature enables the inference of cell type-specific gene regulatory networks. The thesis illustrated that such networks indeed differ even between closely related cells. Thus, single-cell transcriptomics can yield unprecedented insights into so far not understood cell regulatory principles, particularly rare cell types that are so far hardly reflected in bulk-derived RNA-seq data.
10

Mapping the Immune Landscape in Endemic Burkitt Lymphoma Tumors and Developing a Humanized Mouse Model for Exploring Inter-Patient Tumor Variation

Saikumar Lakshmi, Priya 29 November 2021
Endemic Burkitt lymphoma (eBL) is the leading pediatric cancer in sub-Saharan Africa and is associated with Epstein-Barr virus (EBV) and Plasmodium falciparum malaria co-infections. Current treatment options in Africa are combination chemotherapy with a survival rate hovering around 50%. Relapsed or refractory eBL patients have failed to receive any targeted treatments in the clinic. Our focus was to delineate immune responses in eBL, interrogate the tumor variation in responses to targeted treatments and develop mouse models that can be used to target essential mediators of tumor pathogenesis. Immune-based treatments including immune checkpoint inhibition have recently become an effective therapeutic modality in oncology. However, some B cell lymphomas, such as Hodgkin lymphoma (HL), are more receptive to checkpoint inhibition than others, suggesting a need to understand the efficacy of checkpoint inhibition on different lymphoma subtypes. Checkpoint inhibitors act by blocking inhibitory receptors on T cells and improving anti-tumor responses. One of the goals of this thesis was to characterize checkpoint inhibitors on tumor-infiltrating lymphocytes (TILs) in eBL tumors and to identify T cell subsets that exhibit increased expression of inhibitory receptors, poor cytokine production, poor proliferation and express transcription factors associated with exhaustion. Using scRNA-seq, we identified T cell clusters that co-expressed inhibitory receptors, poor proliferative markers but also sustained costimulatory signals, as well as cytokine expression, suggesting a pre-dysfunctional state rather than a terminally exhausted state. Furthermore, we quantified the dominant co-inhibitory receptors PD1 and TIGIT that are upregulated in the tumor microenvironment via immunohistochemistry (IHC) and in peripheral blood of eBL patients via flow cytometry. 
We compared eBL patients with healthy pediatric cohorts with a history of persistent malaria exposure to those who had little to no malaria infections, to understand uniquely T cell mediated responses in BL children. Tumors had high co-expression of PD1 and TIGIT but fewer PD1 only populations, suggesting that both ligands may play a role in restraining immune activation via IHC. Next, we investigated if PD1 ligands or TIGIT ligands were overexpressed in eBL tumors. Nectin-2, a TIGIT ligand, was highly expressed in eBL tumors but was not highly correlated with TIGIT expression. These studies provide insights for PD1/TIGIT blockade in Burkitt lymphoma patients. Additionally, we established new patient-derived cell lines from eBL tumors to study tumor variation and to study targeted treatments. We established five new patient-derived eBL lines, BL717, BL719, BL720, BL725, and BL740, that were interrogated for their inter-patient variation by studying their gene expression profiles. Further, we developed a patient cell-line derived xenograft (CDX) mouse model by injecting newly patient-derived BL cell lines in immunodeficient mice (NSG BL) and studying BL tumorigenesis. Having successfully established NSG BL tumors, we observed differences in tumor growth sensitivity and survival. We tested the efficacy of rituximab, one of the most established treatments for B cell lymphomas, in our mouse model. We also identified pathways associated with unfolded protein response (UPR) and the mammalian target of rapamycin (mTOR) signaling, as well as apoptosis in one of the cell line xenografts, BL740, in response to rituximab. The BL717 and BL720 cell line xenografts failed to control tumor growth and were enriched in IFN-α signature genes. This mouse model will prove to be useful to study combination therapy against eBL tumors as well as mechanisms of resistance to drug targets. 
Collectively, these studies provide insights into intratumoral variation including subtypes during tumor progression and expression profiles of TILs in eBL tumors. This will be important in designing new therapeutic strategies as well as help pose novel therapeutic targets.
