501 |
Molecular and biochemical analysis of water stress induced responses in grape
Katam, Ramesh, January 2008
Thesis (Ph.D.)--Mississippi State University. Department of Plant and Soil Science. / Title from title screen. Includes bibliographical references.
|
502 |
Análise da heterocedasticidade em bubalinos utilizando informações genômicas / Analysis of heteroscedasticity in buffaloes using genomic information
Goes, Túlio José de Freitas [UNESP] 14 July 2014
It is extremely important to understand the nature of the variances and how they affect production traits and estimates. Approximately 3,500 lactating buffalo cows, predominantly of the Murrah breed, genotyped with the Illumina Infinium Bovine HD BeadChip and drawn from the database of the Jaboticabal campus of UNESP, were used to evaluate cumulative 305-day milk yield (PL), protein yield (PPRO) and fat yield (PGOR), estimating heritabilities and variances. A mixed model was fitted under two distinct approaches: a traditional one, and one that accounts for heterogeneity of variances, both additive and residual, by dividing the animals into two groups according to production level. Genotyping information was used in both models. The heritability estimates were moderate and within the range reported in the literature for all models; however, the models that account for heterogeneity of variances yielded different estimates. The heritabilities for the traditional models were 0.22 (without genomics) and 0.23 (with genomics) for PL, 0.32 (without genomics) and 0.30 (with genomics) for PGOR, and 0.27 (without genomics) and 0.34 (with genomics) for PPRO. When residual heterogeneity was considered, the heritabilities for the high and low production levels were, respectively, 0.28 and 0.17 for PL, 0.28 and 0.35 for PGOR, and 0.30 and 0.42 for PPRO. When additive heterogeneity was also considered, the heritabilities for the high and low production levels were, respectively, 0.38 and 0.25 for PL, 0.29 and 0.31 for PGOR, and 0.33 and ...
/ It is extremely important to know the nature of the variability and how it affects production traits and population estimates such as heritability and correlations. This research used approximately 3,500 lactating buffaloes, most of them Murrah, genotyped with the Illumina Infinium Bovine HD BeadChip and taken from the database of the Jaboticabal Campus, UNESP, to evaluate milk yield at 305 days (MY), protein yield (PY) and fat yield (FY), estimating heritabilities and variances. A mixed model was used with two different approaches: one that does not account for heterogeneity of variances, and another that accounts for it, both residual and additive, by splitting the animals into two groups by production level. Genomic information was included in both models: by replacing the relationship matrix with the matrix H, which includes genomic information, the same results were studied in models accounting for genomic information. The heritabilities found were all moderate and within the intervals estimated in other studies. However, the models that consider different variances gave different heritabilities for all traits. For the traditional models, the heritabilities found without and with genomic information were 0.22 and 0.23 for MY, 0.32 and 0.30 for FY, and 0.27 and 0.34 for PY. When only residual heterogeneity was taken into account, the heritabilities for the high and low production levels were 0.28 and 0.17 for MY, 0.28 and 0.35 for FY, and 0.30 and 0.42 for PY. When additive heterogeneity was also taken into account, the heritabilities for the high and low production levels were 0.28 and 0.17 for MY, 0.29 and 0.31 for FY, and 0.33 and 0.37 for PY. From the results of this study it is possible to conclude that taking into account ...
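The group-specific heritabilities in this abstract can be read as the ratio of additive variance to total (additive plus residual) variance, estimated separately for each production level. The sketch below illustrates only that ratio, assuming a simple model with just these two variance components; the function name and the variance values are hypothetical and are not taken from the thesis.

```python
def heritability(additive_var: float, residual_var: float) -> float:
    """Narrow-sense heritability: additive variance over total variance.

    Assumes the phenotypic variance is just additive + residual.
    """
    total = additive_var + residual_var
    if additive_var < 0 or residual_var < 0 or total <= 0:
        raise ValueError("variances must be non-negative with a positive total")
    return additive_var / total


# Hypothetical variance components for two production-level groups
# (illustrative numbers only, not estimates from the thesis).
groups = {
    "high": {"additive": 30.0, "residual": 70.0},
    "low": {"additive": 20.0, "residual": 80.0},
}

h2_by_group = {
    name: heritability(v["additive"], v["residual"]) for name, v in groups.items()
}

for name, h2 in h2_by_group.items():
    print(f"{name} production level: h2 = {h2:.2f}")
```

Under a homogeneous-variance model a single pair of components would be shared by all animals; letting each group carry its own pair is what allows the two heritabilities to differ.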
|
503 |
The genomic epidemiology of Campylobacter from the Republic of South Africa
van Rensburg, Melissa Jansen January 2015
As the leading cause of bacterial gastroenteritis, Campylobacter represents a significant public health burden; however, our knowledge of its epidemiology in low- and middle-income countries remains limited. Recent studies have demonstrated the power of whole-genome sequencing (WGS) for public health microbiology. The primary aim of this thesis was to exploit WGS to improve our understanding of the epidemiology of Campylobacter from the Republic of South Africa, a middle-income country. In the first half of this thesis, in silico approaches were developed to evaluate diagnostic assays and methods of species identification. Large-scale analyses of publicly available WGS data identified a robust real-time PCR assay for the detection of Campylobacter jejuni and Campylobacter coli, the primary causes of human campylobacteriosis. Evaluation of in silico speciation methods demonstrated that the atpA gene and ribosomal multilocus sequence typing can be used to identify Campylobacter from WGS data. The second half of this thesis extended concepts developed in the first half to investigate the epidemiology of Campylobacter from animals and humans from South Africa. Isolates from a study of Campylobacter from free-range broiler carcasses belonged to the agriculture-associated ST-828 lineage, but were atypically homogenous and differed at only 46/1,513 (3%) loci, providing novel insights into clonal infections in chickens. Analyses of human disease isolates collected in Cape Town in 1991, 2011, and 2012 confirmed that the local epidemiology of Campylobacter is distinct from that of high-income countries: in addition to major agriculture-associated C. jejuni and C. coli lineages, a putative novel C. jejuni subsp. jejuni/C. jejuni subsp. doylei hybrid clade and genetically diverse C. jejuni subsp. doylei and C. upsaliensis isolates were identified. 
This work delivers further evidence of the utility of WGS for clinical microbiology, presents approaches that address general problems in Campylobacter diagnostics and public health microbiology, and provides insights into the epidemiology of this important group of pathogens in South Africa.
|
504 |
Estudo de seleção genômica para características de produção e qualidade do leite de búfalas / Genomic selection study for milk production and quality traits in buffaloes
Barros, Camila da Costa. January 2017
Advisor: Humberto Tonhati / Co-advisor: Rusbel Raul Aspilcueta-Borquis / Co-advisor: Daniel Jordan de Abreu Santos / Committee: Roberto Carvalheiro / Committee: Francisco Ribeiro de Araujo Neto / Committee: Leonardo de Oliveira Seno / Committee: Guilherme Costa Venturini / Resumo: The aim of this work was to compare different Bayesian methods of genomic prediction for milk yield (PL) and the percentages of fat (%G) and protein (%P) in buffalo milk, and to carry out a genome-wide association study to identify chromosomal regions and genes possibly related to these traits, using information from genotyped and non-genotyped individuals. The number of animals with phenotypes was 3,355; the pedigree file contained 15,495 animals, of which 322 were genotyped with the 90K Axiom® Buffalo Genotyping Array. The following SNP quality control criteria were used: MAF < 0.05, call rate < 0.95 and Hardy-Weinberg equilibrium p-value < 10⁻⁶. For samples, a call rate < 0.90 was considered. For the genomic predictions, the following Bayesian models were used: Bayes A (BA), Bayes B (BB), Bayes C (BC) and Bayes LASSO (BL). The phenotype corrected for fixed effects (Y*) was used as the response variable in the genomic analyses. The predictive ability of the different models was assessed using leave-one-out cross-validation. Prediction accuracies were calculated as the Pearson correlation between the genomic estimated breeding value (GEBV) and the response variable (Y*) for each model and trait evaluated. For the genome-wide association study, an iterative process was carried out to calculate marker weights as a function of the square...
/ Abstract: The aim of this study was to compare different Bayesian methods of genomic prediction for milk yield (MY) and fat (%F) and protein (%P) percentages in dairy buffaloes in Brazil, and to perform a genome-wide association study to identify chromosomal regions and genes possibly related to these traits, using information from genotyped and non-genotyped individuals. The number of animals with phenotypes was 3,355; the pedigree file contained 15,495 animals, of which 322 were genotyped. The animals were genotyped using a 90K SNP panel (Axiom® Buffalo Genotyping Array). The following criteria for quality control of SNPs were used: MAF < 0.05, call rate < 0.95 and Hardy-Weinberg equilibrium p-value < 10⁻⁶. For samples, a call rate < 0.90 was used. Four methods for genomic prediction were used: Bayes A (BA), Bayes B (BB), Bayes C (BC) and Bayes LASSO (BL). Phenotypes corrected for fixed effects (Y*) were used as response variables. The predictive ability of the different models was evaluated using a leave-one-out cross-validation approach. The prediction accuracy was calculated as Pearson's correlation between the genomic estimated breeding value (GEBV) and the response variable (Y*) for each model. For the genome-wide association study, an iterative process was performed to derive SNP weights as a function of the squares of SNP effects and allele frequencies (ssGWAS). In general, all Bayesian models showed similar prediction accuracy, ranging from 0.41 to 0.42, 0.3... / Doctorate
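Two steps named in this abstract lend themselves to a short illustration: the SNP quality-control filter and the prediction accuracy computed as a Pearson correlation between GEBV and Y*. The sketch below is a minimal version under stated assumptions: the listed thresholds are read as exclusion criteria (a SNP is dropped if it falls below any of them), the function names are hypothetical, and the GEBV and Y* values are made-up numbers standing in for predictions that would come from leave-one-out cross-validation.

```python
from math import sqrt


def passes_snp_qc(maf: float, call_rate: float, hwe_p: float,
                  min_maf: float = 0.05, min_call_rate: float = 0.95,
                  min_hwe_p: float = 1e-6) -> bool:
    """Keep a SNP only if it is not below any exclusion threshold (assumed reading)."""
    return maf >= min_maf and call_rate >= min_call_rate and hwe_p >= min_hwe_p


def pearson(x: list, y: list) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical held-out predictions: each GEBV is assumed to have been
# predicted with that animal left out of the training data.
gebv = [0.8, -0.3, 1.1, 0.2, -0.9]
y_star = [1.0, -0.1, 0.7, 0.5, -1.2]

accuracy = pearson(gebv, y_star)
print(f"prediction accuracy (Pearson r): {accuracy:.2f}")
```

The same accuracy calculation would be repeated per model (BA, BB, BC, BL) and per trait to obtain figures comparable to the ranges quoted in the abstract.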
|
505 |
Mapping and functional characterisation of the Atlantic salmon genome and its regulation of pathogen response
Gonen, Serap January 2015
Atlantic salmon is a species of both scientific and economic importance, and Atlantic salmon farming is a highly profitable industry worldwide. One of the biggest challenges being faced by farms, which affects production efficiency and results in severe economic loss, is disease. In livestock production, one of the approaches taken to limit the impact of disease outbreaks is to selectively breed for improved resistance within farmed populations. Although traditional family-based resistance breeding programs have shown improvements in resistance to a variety of bacterial, viral and parasitic diseases on Atlantic salmon farms, response to selection can be slow. One way of increasing selection efficiency is through the incorporation of genetic markers into breeding programs, for marker-assisted or genomic selection. However, genomic resources for cultured aquatic species are sparse, and the generation of new and denser resources for use in selective breeding programs would be advantageous. The main focus of this thesis is the development of genomic resources in Atlantic salmon and the application of those resources to gain a better understanding of the salmon genome, particularly in the genetic basis of host resistance to infectious diseases. The first aim of this thesis was to develop improved genomic resources for Atlantic salmon, and to characterise the Atlantic salmon genome via construction and analysis of a SNP linkage map derived from RAD-Sequencing (RAD-Seq). Approximately 6,500 SNPs were assigned to 29 linkage groups, and ~1,800 male-segregating, and ~1,400 female-segregating SNPs were ordered and positioned. Overall map lengths and recombination ratios were relatively consistent between the sexes and across the linkage groups (~1:1.5, male:female). However, a substantial difference in the degree of marker clustering was seen between males and females, which is reflective of the difference in the positions of chiasmata between the two sexes. 
Using this map, ~4,000 Atlantic salmon reference genome contigs were assigned to a linkage group, and 112 contigs were assigned to multiple linkage groups, highlighting regions of homeology (large sections of duplicated chromosomal regions) within the salmon genome. Alignment of SNP-flanking sequences to the stickleback and rainbow trout genomes identified putative gene-associated SNPs and cross-species chromosomal orthologies, and provided evidence in support of the salmonid-specific genome duplication. In addition, based on this and other publicly available RAD-Seq datasets, the utility of RAD-Seq-derived data from different species and laboratories for population genetics analyses was tested. Short RAD-Seq contigs in Atlantic salmon and nine other teleost fish were used to identify cross-species orthologous genomic relationships. Several thousand orthologous RAD loci were identified across the species, with the number of RAD loci decreasing with evolutionary distance, as expected. Previously published broad-level relationships between orthologous chromosomes were confirmed. The identified cross-species orthologous RAD loci were used to estimate evolutionary relationships between the ten teleost fish species. Previously published relationships were recovered, suggesting that RAD-Seq data derived from different laboratories are useful for this purpose. The second aim was to characterise the genetic architecture of resistance to two viral diseases affecting Atlantic salmon production on farms: pancreas disease (PD), and infectious pancreatic necrosis (IPN). Using data and samples collected from a large population of salmon fry challenged with PD, a high heritability for resistance was estimated (h² ~0.5), and four QTL were identified, on chromosomes 3, 4, 7 and 23. The QTL explaining the highest within-family variation for resistance was located on chromosome 3. 
This QTL has been confirmed in a population of post-smolts by an independent research group, highlighting the potential for its incorporation into breeding programs to improve PD resistance. For IPN, the major resistance QTL had previously been mapped to linkage group 21. However, the mutation(s) underlying this QTL effect and the consequences of these mutation(s) on the affected genes and relevant biological resistance mechanisms are unknown. To generate a list of candidate genes within the vicinity of the IPN QTL, QTL-linked DNA sequences were aligned to four model fish genomes. This identified two QTL-orthologous regions in each of the species, and gene order within these regions was highly conserved across species. Analysis of gene expression patterns between IPN resistant and susceptible salmon in a viral challenge experiment revealed that the five most significantly differentially-expressed genes mapped to the QTL-orthologous region on linkage group II of stickleback. Pathway enrichment analysis across all differentially-expressed genes suggests that biological pathways influencing viral infection stress response/entry/replication, cellular energy production and apoptosis may be involved in resistance during the initial stages of IPN virus (IPNV) infection. These results have provided the basis for further study of the putative involvement of these candidate genes and pathways in genetic resistance to IPNV. In summary, the results and resources presented in this thesis extend our current understanding of the salmon genome and the genetic basis of resistance to two viral diseases, and provide resources with the potential to be used in Atlantic salmon selective breeding programs to tackle disease outbreaks.
|
506 |
Integrative analysis of complex genomic and epigenomic maps
Sharma, Supriya 20 February 2018
Modern healthcare research demands collaboration across disciplines to build preventive measures and innovate predictive capabilities for curing diseases. Along with the emergence of cutting-edge computational and statistical methodologies, data generation and analysis have become cheaper in the last ten years. However, the complexity of big data due to its variety, volume, and velocity creates new challenges for biologists, physicians, bioinformaticians, statisticians, and computer scientists. Combining data from multiple complex profiles is useful for better understanding cellular functions and the pathways that regulate cell function, providing insights that could not have been obtained using the individual profiles alone. However, current normalization and artifact correction methods are platform- and data-type-specific, and may require both the training and test sets for any application (e.g. biomarker development). This often leads to over-fitting and reduces the reproducibility of genomic findings across studies. In addition, many bias correction and integration approaches require renormalization or reanalysis if additional samples are later introduced. The motivation behind this research was to develop and evaluate strategies for addressing data integration issues across data types and profiling platforms, which should improve healthcare-informatics research and its application in personalized medicine. We have demonstrated a comprehensive and coordinated framework for data standardization across tissue types and profiling platforms. This allows easy integration of data from multiple data-generating consortiums. The main goal of this research was to identify regions of genetic-epigenetic co-ordination that are independent of tissue type and consistent across epigenomics profiling data platforms. 
We developed multi-‘omic’ therapeutic biomarkers for epigenetic drug efficacy by combining our biomarker regions with drug perturbation data generated in our previous studies. We used an adaptive Bayesian factor analysis approach to develop biomarkers for multiple HDACs simultaneously, allowing for predictions of comparative efficacy between the drugs. We showed that this approach leads to different predictions across breast cancer subtypes compared to profiling the drugs separately. We extended this approach to patient samples from multiple public data resources containing epigenetic profiling data from cancer and normal tissues (The Cancer Genome Atlas, TCGA; NIH Roadmap epigenomics data).
|
507 |
Characterization of smoking-associated transcriptomic alterations to the human bronchial epithelium
Duclos, Grant Edward 24 October 2018
The human bronchial epithelium is composed of multiple, discrete cell types that cooperate to perform mucociliary clearance. While previous studies have shown that cigarette smoke can alter bronchial epithelial gene expression, the underlying effects of this exposure on specific cell types are not well understood. In this thesis, single-cell RNA sequencing was used to profile bronchial epithelial cells from six current smokers and six never smokers. Thirteen cell clusters were identified that were defined by expression of unique combinations of nineteen distinct gene sets. This clustering revealed that smoke exposure induced expression of a toxin metabolism program that specifically associated with ciliated cells. Extensive airway remodeling was also observed, in which smoking was associated with loss of club cells as well as goblet cell expansion and hyperplasia. Additionally, a previously uncharacterized CEACAM5+ KRT8+ epithelial subpopulation was identified in the airways of smokers. While it has been shown that most smoking-associated gene expression alterations can be reversed upon smoking cessation, a subset of these alterations persists in former smokers. The basal layer of the bronchial epithelium is comprised of a multipotent progenitor subpopulation. When abnormalities persist in the bronchial epithelium despite normal tissue turnover, the source of these abnormalities may be traced to this progenitor population and its program of differentiation. Therefore, basal cells were procured from three current smokers and three never smokers, differentiated in vitro, and profiled by RNA sequencing at eight time points spanning the differentiation procedure. Twenty-seven unique sets of co-expressed genes associated with differentiation were identified and functionally characterized, a subset of which were abnormally expressed in smoker cells. Robust expression of genes involved with the unfolded protein response was specifically detected in smoker basal cells. 
Additionally, a smoking-associated delay in the onset of expression of genes involved with ciliogenesis was observed. These data therefore indicate that smoking has long-term consequences on the differentiated state of the airway epithelium. Collectively, the observations outlined in this thesis demonstrate that smoking drives a complex landscape of alterations that affects the function and composition of the human bronchial epithelium. / 2020-10-24T00:00:00Z
|
508 |
Étude bioinformatique des génomes de Porphyromonas / Bioinformatic study of Porphyromonas genomes
Acuña Amador, Luis Alberto 20 December 2017
Bacteria of the phylum Bacteroidetes, class Bacteroidia, are among the most important in the gastrointestinal microbiota of humans and other mammals. The mouth, the entrance to the digestive tract, is an environment with varied anatomical sites, each associated with a microbiota of different composition. The junction of the gum and the teeth, the gingival sulcus, is a site where a complex biofilm called dental plaque is deposited. One bacterium of this phylum, Porphyromonas gingivalis, is able to disrupt the human immune system and produce an imbalance of the oral biofilm, also called dysbiosis. This triggers the formation of the periodontal pocket, a pathological deepening of the sulcus, and the onset of periodontitis. Other species of the genus Porphyromonas are also associated with periodontitis, notably in canids. P. gingivalis populations are panmictic and their genomes are highly plastic. Bioinformatics can help to identify the causes of the mosaic nature of this bacterium's genomes, to study virulence factors at the genus level in order to explain the existence of both pathogenic and commensal species, and to describe the dysbiosis linked to periodontitis. Comparative genomics of P. gingivalis demonstrated a correlation between the number of contigs in the draft genomes of this species and genomic repeats, notably insertion sequences. We resequenced, reassembled and reannotated three reference strains of this bacterium that had complete genomes, using long-read sequencing. We identified assembly errors in the three published genomes, which we corrected. A pangenome study of these three strains shows a large core genome. The plasticity of the species would therefore lie more in genome organisation than in differing coding capacities. 
A subset of the core genome, whose genes have a lower percentage of nucleotide identity than most (the variant core genome), is of interest for explaining the phenotypic differences between these bacteria. We studied the distribution of one virulence factor, fimbriae (adhesion structures), within the genus Porphyromonas and linked the loci to the phylogeny and pathogenicity of the species. Finally, the dysbiosis that occurs during periodontitis is described through an analysis of the microbiota of patients with periodontitis and of healthy individuals. The predominant genera in the two states are highlighted. Throughout this work, we show the importance of biocuration and its added value in genomics and bioinformatics work in general. Only by doing this slow and laborious biocuration work will the answers given to biological questions be relevant. / Bacteria of the phylum Bacteroidetes, class Bacteroidia, are among the most important in the gastrointestinal microbiota of humans and other mammals. The mouth, the entry to the digestive tract, is an environment with varied anatomical sites, each having a particular microbiota of different composition. The junction between gingiva and teeth, the gingival sulcus, is a site for biofilm (dental plaque) formation and accumulation. Porphyromonas gingivalis, a bacterium from this phylum, can modulate the immune system and produce an oral biofilm disequilibrium called dysbiosis. This triggers the formation of a periodontal pocket, a pathological deepening of the gingival sulcus, and the emergence of periodontitis. Other Porphyromonas species are also associated with periodontitis, mainly in canids. P. gingivalis populations are panmictic and their genomes are highly plastic. 
Bioinformatics can help to identify the causes of this genomic mosaicism, to study Porphyromonas virulence factors in order to explain why some species are pathogens and others are commensals, and to describe the dysbiosis linked to periodontitis. Comparative genomics of P. gingivalis showed a correlation between the number of contigs in draft genomes and genomic repeats, mainly insertion sequences. We resequenced, reassembled and reannotated three reference strains of this bacterium that already had complete published genomes, using long-read sequencing. We showed that misassemblies were present in the three published genomes, and we corrected them. A pangenome study of the three strains showed that the core genome is preponderant. The plasticity of the species might be related more to genome organization than to different coding capacities. A subpart of the core genome, with genes having a lower nucleotide identity percentage than the majority (variable core genome), is of interest for explaining the phenotypic differences between the bacteria. We analysed the repertoire of a virulence factor, fimbriae (adhesion structures), in the genus Porphyromonas to link the loci to the phylogeny and pathogenicity of its species. Finally, we described the dysbiosis occurring with periodontitis by analysing the gingival microbiota of patients with the disease and of healthy individuals. The preponderant genera in both states are highlighted. With this work, we demonstrate the importance of biocuration and its added value for genomic and bioinformatic studies in general. Only with this slow and arduous work will the answers to biological questions be relevant.
|
509 |
Genomic insights into the human population history of Australia and New Guinea
Bergström, Anders January 2018
The ancient continent of Sahul, encompassing Australia, New Guinea and Tasmania, contains some of the earliest archaeological evidence for humans outside of Africa, dating back to at least 50 thousand years ago (kya). New Guinea was also one of the sites where humans developed agriculture in the last 10 thousand years. Despite the importance of this part of the world to the history of humanity outside Africa, little is known about the population history of the people living here. In this thesis I present population-genetic studies using whole-genome sequencing and genotype array datasets from more than 500 indigenous individuals from Australia and New Guinea, as well as initial work on large-scale sequencing of other, worldwide, human populations in the Human Genome Diversity Project panel. Other than recent admixture after European colonization of Australia, and Southeast Asian admixture in the lowlands of New Guinea in the last few millennia, the populations of Sahul appear to have been genetically independent from the rest of the world since their divergence ∼50 kya. There is no evidence for South Asian gene flow to Australia, as previously suggested, and the highlands of Papua New Guinea (PNG) have remained unaffected by non-New Guinean gene flow until the present day. Despite Sahul being a single connected landmass until ∼8 kya, different groups across Australia are nearly equally related to Papuans, and vice versa, and the two appear to have separated genetically already ∼30 kya. In PNG, all highlanders strikingly appear to form a clade relative to lowlanders, and population structure seems to have been reshaped, with major population size increases, on the same timescale as the spread of agriculture. However, present-day genetic differentiation between groups is much stronger in PNG than in other parts of the world that have also transitioned to agriculture, demonstrating that such a lifestyle change does not necessarily lead to genetic homogenization. 
The results presented here provide detailed insights into the population history of Sahul, and suggest that its history can serve as an independent source of evidence for understanding human evolutionary trajectories, including the relationships between genetics, lifestyle, languages and culture.
|
510 |
Synthesising executable gene regulatory networks in haematopoiesis from single-cell gene expression data
Woodhouse, Steven January 2017
A fundamental challenge in biology is to understand the complex gene regulatory networks which control tissue development in the mammalian embryo, and maintain homoeostasis in the adult. The cell fate decisions underlying these processes are ultimately made at the level of individual cells. Recent experimental advances in biology allow researchers to obtain gene expression profiles at single-cell resolution over thousands of cells at once. These single-cell measurements provide snapshots of the states of the cells that make up a tissue, instead of the population-level averages provided by conventional high-throughput experiments. The aim of this PhD was to investigate the possibility of using this new high resolution data to reconstruct mechanistic computational models of gene regulatory networks. In this thesis I introduce the idea of viewing single-cell gene expression profiles as states of an asynchronous Boolean network, and frame model inference as the problem of reconstructing a Boolean network from its state space. I then give a scalable algorithm to solve this synthesis problem. In order to achieve scalability, this algorithm works in a modular way, treating different aspects of a graph data structure separately before encoding the search for logical rules as Boolean satisfiability problems to be dispatched to a SAT solver. Together with experimental collaborators, I applied this method to understanding the process of early blood development in the embryo, which is poorly understood due to the small number of cells present at this stage. The emergence of blood from Flk1+ mesoderm was studied by single cell expression analysis of 3934 cells at four sequential developmental time points. A mechanistic model recapitulating blood development was reconstructed from this data set, which was consistent with known biology and the bifurcation of blood and endothelium. 
Several model predictions were validated experimentally, demonstrating that HoxB4 and Sox17 directly regulate the haematopoietic factor Erg, and that Sox7 blocks primitive erythroid development. A general-purpose graphical tool was then developed based on this algorithm, which can be used by biological researchers as new single-cell data sets become available. This tool can deploy computations to the cloud in order to scale up to larger high-throughput data sets. The results in this thesis demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the gene regulatory networks that underpin organogenesis. Rapid technological advances in our ability to perform single-cell profiling suggest that my tool will be applicable to other organ systems and may inform the development of improved cellular programming strategies.
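The abstract treats single-cell expression profiles as states of an asynchronous Boolean network. The sketch below shows only the forward direction of that idea: given a set of Boolean update rules, it enumerates the states reachable when exactly one gene updates at a time. The thesis addresses the inverse problem (recovering rules from observed states, using a SAT solver), which is not attempted here. The gene names `A`, `B`, `C` and their rules are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, Set, Tuple

State = Tuple[bool, ...]

# Hypothetical three-gene network (not from the thesis).
GENES = ("A", "B", "C")

RULES: Dict[str, Callable[[Dict[str, bool]], bool]] = {
    "A": lambda s: s["A"],                 # A maintains its own state
    "B": lambda s: s["A"] and not s["C"],  # B is activated by A, repressed by C
    "C": lambda s: s["B"],                 # C is activated by B
}


def successors(state: State) -> Set[State]:
    """Asynchronous update: each successor differs from `state` in one gene."""
    named = dict(zip(GENES, state))
    result: Set[State] = set()
    for i, gene in enumerate(GENES):
        new_value = RULES[gene](named)
        if new_value != state[i]:
            result.add(state[:i] + (new_value,) + state[i + 1:])
    return result


def reachable(start: State) -> Set[State]:
    """All states reachable from `start` by repeated asynchronous updates."""
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


start = (True, False, False)  # A on, B and C off
for state in sorted(reachable(start)):
    print(dict(zip(GENES, state)))
```

In the synthesis setting described in the abstract, the observed single-cell states would play the role of the reachable set, and the unknowns would be the rules themselves.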
|