Global ETD Search

1	Wheat variety identification using genetic variations Synnergren, Jane January 2003 New crop varieties are developed continuously in the crop trade. Cultivated crops tend to become more and more alike, which requires an effective method for crop identification. Crop type and crop type purity have become quality measures in the crop trade, both nationally and internationally. A number of well-known quality attributes of interest in the crop trade can be correlated to the specific crop type, and it is therefore of great importance to be able to identify different crop varieties reliably. It is well known from the literature that genomic variations exist at the nucleotide level between different crop varieties, and these variations might be useful for automated variety identification. This project deals with crop variety identification and investigates the possibilities of distinguishing between different wheat varieties. Wheat variety identification at the protein level has given unsatisfactory results, and DNA-based techniques are therefore proposed instead. DNA-based techniques depend on the availability of sequence data from the wheat genome, and part of the work examined that availability. The focus of the work, however, has been on defining a method for computational detection of single nucleotide variations in ESTs from wheat and on testing that method experimentally. Results from these experiments show that the method defined in this project detects polymorphic variations that can be correlated to variety variations. single nucleotide polymorphism (SNP) wheat variety identification clustering alignment Computer science Datavetenskap
2	Wheat variety identification using genetic variations Synnergren, Jane January 2003 New crop varieties are developed continuously in the crop trade. Cultivated crops tend to become more and more alike, which requires an effective method for crop identification. Crop type and crop type purity have become quality measures in the crop trade, both nationally and internationally. A number of well-known quality attributes of interest in the crop trade can be correlated to the specific crop type, and it is therefore of great importance to be able to identify different crop varieties reliably. It is well known from the literature that genomic variations exist at the nucleotide level between different crop varieties, and these variations might be useful for automated variety identification. This project deals with crop variety identification and investigates the possibilities of distinguishing between different wheat varieties. Wheat variety identification at the protein level has given unsatisfactory results, and DNA-based techniques are therefore proposed instead. DNA-based techniques depend on the availability of sequence data from the wheat genome, and part of the work examined that availability. The focus of the work, however, has been on defining a method for computational detection of single nucleotide variations in ESTs from wheat and on testing that method experimentally. Results from these experiments show that the method defined in this project detects polymorphic variations that can be correlated to variety variations. single nucleotide polymorphism (SNP) wheat variety identification clustering alignment Computer Sciences Datavetenskap (datalogi)
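The computational step named in the two records above, detecting single nucleotide variations in clustered and aligned ESTs, can be illustrated with a minimal sketch. This is not the method defined in the thesis: the function name, the gap symbol, and the `min_minor_count` threshold are assumptions made for illustration, and the input is assumed to be one already-aligned EST cluster.

```python
from collections import Counter


def candidate_snp_columns(aligned_seqs, min_minor_count=2, gap="-"):
    """Return (column, base_counts) for alignment columns that look polymorphic.

    A column is reported when at least two different bases each occur
    min_minor_count times or more. Gaps and 'N' are ignored. The threshold
    is a hypothetical guard against single-read sequencing errors.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for col in range(length):
        counts = Counter(
            s[col].upper()
            for s in aligned_seqs
            if s[col] != gap and s[col].upper() != "N"
        )
        # Bases with enough supporting reads to be taken seriously.
        supported = [base for base, n in counts.items() if n >= min_minor_count]
        if len(supported) >= 2:
            hits.append((col, dict(counts)))
    return hits
```

Requiring each variant base to appear in several reads is one simple way to separate likely polymorphisms from isolated sequencing errors; relating a reported column to a wheat variety would additionally need the variety of origin of each EST.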
3	Sequence specific probe signals on SNP microarrays Glomb, Torsten 20 October 2017 Single nucleotide polymorphism (SNP) arrays are important tools widely used for genotyping and copy number estimation. This technology utilizes the specific affinity of fragmented DNA for binding to surface-attached oligonucleotide DNA probes. This thesis examines the variability of the probe signals of Affymetrix GeneChip SNP arrays as a function of the probe sequence, in order to identify relevant sequence motifs that potentially cause systematic biases in genotyping and copy number estimates. info:eu-repo/classification/ddc/000 ddc:000
4	Understanding the Relationship Between HERC2 and OCA2 Variants and Iris Pigmentation Genetics Wallpe, Clarissa 08 1900 Indiana University-Purdue University Indianapolis (IUPUI) / Externally visible characteristics (EVCs) predicted from an unknown sample of DNA are particularly useful in forensics as they can provide information beyond that of an STR profile. Current EVCs that are highly studied and well predicted include iris, hair, and skin color. Notably, models predicting iris color, such as IrisPlex, are the most accurate, with up to ~95% accuracy; however, some inaccurate predictions occur, as evidenced by the remaining ~5%. Often, these are due to green or hazel eyes, which are frequently viewed as intermediate. Some of the inaccurate predictions, however, are due to true blue being predicted as brown and vice versa. Previous research has theorized the possibility of two SNPs, rs12913832 and rs1800407, acting as a functional haplotype affecting iris color. rs12913832 is recognized as the most predictive SNP for iris color and is highly significant in other pigmentation phenotypes; presently, rs1800407 is the second-ranked SNP in the IrisPlex 6-SNP system. Both SNPs are highly variable in Europe, where the majority of variation in iris color originates. In the present study, we explore the SNP variation present in the genetic regions of OCA2-HERC2 as well as possible haplotypes. Our research centers around the functional haplotype and the addition of SNPs to the functional haplotype. In addition, three different ways of classifying the phenotype are assessed simultaneously: first, a 4-point categorical phenotype (blue/blue grey, blue/green yellow, hazel/light brown, and dark brown); second, a continuous scale calculated from a quantitative phenotype in which the percentage of each categorical color has been measured; third, the IrisPlex 6-SNP system, used to predict eye color and identify individuals who have been inaccurately predicted.
Exploration of the SNP and haplotype variation resulted in two SNPs for both the categorical and quantitative phenotypes which were significantly correlated with hazel/light brown—rs1448484 and rs61335644, both as independent SNPs and when assessed in a haplotype with rs1800407-rs12913832. SNP rs1448484 has been associated with skin pigmentation previously and is located in a possible transcription factor binding site. SNP rs61335644 is not presently associated with pigmentation but is in complete LD with two SNPs in and around regulatory regions present in HERC2. Finally, the addition of rs1448484 and rs61335644 into the current IrisPlex 6-SNP system slightly improved each of the tested performance metrics for hazel/light brown and dark brown. Within the inaccurately predicted phenotypes, rs1800407 is confirmed to affect both inaccurately predicted groups and is the most significant SNP. Additionally, rs121918166, a missense variant in OCA2, is the second most significant SNP in true blue predicted as brown. Both SNPs were also the two most significant haplotypes with at least one allele being derived. Therefore, the next steps should include the addition of the functional haplotype and rs121918166 into the current IrisPlex model, and further testing of rs1448484 and rs61335644 on a molecular level. Consequently, the current IrisPlex model should also be reassessed on an independent test set using the 4-point categorical scale rather than the present 3-point scale. Externally Visible Characteristics (EVC) Haplotype Single Nucleotide Polymorphism (SNP) Eye Color Prediction
5	IDH1/2 (isocitrate dehydrogenase 1/2) Mutations in Gliomas : genotype-Phenotype Correlation, Prognostic impact, and Response to Irradiation / Les mutations IDH1/2 (isocitrate déshydrogénase 1/2) dans les gliomes : Corrélation au profile génomique, facteur pronostique, et implication dans la réponse à l’irradiation Wang, Xiao Wei 26 July 2012 Since Parsons et al. discovered in 2008 that the isocitrate dehydrogenase 1 (IDH1) gene is frequently mutated in glioblastomas (12%), many teams have studied the prevalence and characteristics of mutations of the IDH1 and IDH2 genes in gliomas. IDH1 mutations are observed in about 40% of gliomas. The most frequent IDH1 mutation in gliomas (>90% of cases) is the R132H mutation. The frequency of IDH1 and IDH2 mutations is inversely correlated with glioma grade (grade II ~80%, III ~50%, and IV ~10%). IDH1/2 mutations have diagnostic as well as prognostic value (they are associated with better survival). In the first part of this thesis work, we analyzed the distribution of these IDH1/2 mutations across the different gliomas, their association with other genetic alterations, and their diagnostic and prognostic value in a cohort of 1332 glioma patients. In this very large cohort we confirm the data in the literature and refine the prognostic value of IDH1/2 mutations. In the second part, we identified in gliomas a polymorphism (SNP) of the IDH1 gene (SNP rs 11554137; C (cytosine) substituted by T (thymine)) previously observed in acute myeloid leukemias. This SNP, in codon 105, is located in the same exon as the frequently mutated codon 132, and we showed that it is associated with poorer survival in glioma patients. Mutations of codon 132 cause a decrease in the normal enzymatic activity of IDH1/2, which is replaced by the gain of a new one.
Instead of producing alpha-ketoglutarate in an NADP-dependent manner, the mutated IDH1/2 proteins reduce alpha-ketoglutarate to 2-hydroxyglutarate (2HG) in an NADPH-dependent manner. A high concentration of 2HG and a decrease in the amount of NADPH may sensitize tumors to oxidative stress and thus potentiate the effect of radiotherapy, which could explain the better survival of these patients. In the third part, we therefore studied in vitro the impact of the IDH1R132H mutation on survival after radiotherapy of tumor cells stably expressing this mutated gene. The results show that, under certain conditions, these cells may be more radiosensitive than the same cells expressing the non-mutated IDH1 gene. In this thesis work, we thus studied the IDH1 gene in patients' gliomas and attempted, through an in vitro functional approach, to evaluate the impact of the IDH1R132H mutation on the radiosensitivity of tumor cells. / Since Parsons et al. (2008) found frequent mutations of IDH1 (12%) in GBMs, various reports have studied the prevalence and characteristics of IDH1 and IDH2 mutations. Mutations in the isocitrate dehydrogenase 1 (IDH1) gene occur in nearly 40% of gliomas. The frequency of IDH1 mutations is inversely correlated with glioma grade: grade II (~80%), III (~50%), and IV (~10%). Importantly, IDH1 mutation status is associated with a better outcome and has diagnostic value. We also analyzed the distribution of these mutations, their association with other tumor-derived genetic alterations, and their diagnostic and prognostic value in a cohort of 1332 glioma patients. A synonymous single nucleotide polymorphism [SNP rs 11554137; C (cytosine) substituted by T (thymine)] was studied in glioma patients. SNP rs 11554137 (in codon 105) is located in the same exon as the IDH1 R132 mutations (in codon 132).
Glioma patients with SNP rs 11554137: C>T had a poorer outcome than patients without it. A similarly adverse effect on survival has been observed in patients with AML. Mutations in codon 132 can cause a decrease in IDH1/2 activity and a gain of a new enzyme function, the NADPH-dependent reduction of alpha-ketoglutarate to 2-hydroxyglutarate. High 2HG and low NADPH levels might sensitize tumors to oxidative stress, potentiating the response to radiotherapy, and may account for the prolonged survival of patients harboring the mutations. We therefore studied further the functional alterations in IDH1R132H mutant cells in vitro. Based on the decrease in defence factors and the increase in damaging factors in tumor cells, we found that tumors harbouring IDH1 mutations may have an elevated radiosensitivity. In the present study, we described the impact of IDH1 mutations in gliomas and searched for new perspectives for the treatment strategy. Gliomes Isocitrate déshydrogénase 1 (IDH1) Polymorphism nucléotide (SNP) Radiothérapie Radiosensibilité Gliomas Isocitrate dehydrogenase 1 (IDH1) Single nucleotide polymorphism (SNP) Irradiation
6	Genetic and Phylogenetic Studies of Toll-Like Receptor 5 (TLR5) in River Buffalo (Bubalus Bubalis) Jones, Brittany 14 March 2013 River buffalo are economically important to many countries, and only recently has their genome been explored for the purpose of mapping genetic variation in traits of economic and biologic interest. The purpose of this research is to characterize the genetic and evolutionary profile of Toll-like receptor 5 (TLR5), which mediates the mammalian innate immune response to bacterial flagellin. This study comprises three parts: 1) generating a radiation hybrid (RH) map of river buffalo chromosome 5 (BBU5), where the TLR5 gene is located, and building a comparative map with homologous cattle chromosomes; 2) conducting a single-nucleotide polymorphism (SNP) survey of the TLR5 gene to reveal variation within river buffalo and other species; and 3) performing an evolutionary study by inferring phylogenetic trees of TLR5 across multiple taxa and determining the possible evolutionary constraints within the TLR5 coding region. River buffalo chromosome 5 is a bi-armed chromosome with arms corresponding to cattle chromosomes 16 and 29. A BBU5 RH map was developed using the previously published river buffalo RH mapping panel and cattle-derived markers. The RH map developed in this study became an integral part of the first river buffalo whole genome RH map. Genetic variation of the TLR5 gene was evaluated in a small domestic herd of river buffalo. Sequencing of the TLR5 coding region and partial associated 5'- and 3'-untranslated regions yielded 16 novel SNPs. Six SNPs were identified as non-synonymous, with one predicted to potentially code for a functionally altered product. For the evolutionary study of the TLR5 coding region, phylogenetic trees were inferred from TLR5 variation, one across multiple orders and another within Artiodactyla.
Species that are closely related to river buffalo appear to have undergone negative selection in TLR5, while those that diverged from river buffalo earlier may be retaining alleles that river buffalo are removing from the population. In conclusion, putative chromosomal rearrangements were identified between river buffalo and cattle, the variation that was uncovered in the TLR5 coding region could potentially lead to differential immunity across species, and there appears to be some evolutionary flexibility in the DNA sequence of the TLR5 coding region. radiation hybrid (RH) mapping phylogenetics single-nucleotide polymorphism (SNP) toll-like receptor 5 (TLR5) innate immunity river buffalo
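The distinction between synonymous and non-synonymous SNPs used in the record above can be made concrete with a short sketch. It assumes the standard genetic code and an in-frame coding sequence on the coding strand; the function name and the toy sequence in the usage note are hypothetical and are not TLR5 data.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snp(cds, position, alt_base):
    """Classify a single-base substitution in an in-frame coding sequence.

    position is 0-based. Returns (kind, reference_aa, alternate_aa), where
    kind is "synonymous" or "non-synonymous" and '*' denotes a stop codon.
    """
    cds = cds.upper()
    alt_base = alt_base.upper()
    # Only positions inside complete codons can be classified.
    if not 0 <= position < len(cds) - len(cds) % 3:
        raise ValueError("position is outside the complete codons of the sequence")
    start = position - position % 3
    offset = position - start
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return kind, ref_aa, alt_aa
```

For the toy sequence `ATGGGTCGT`, changing the last base of the second codon (`GGT` to `GGC`) leaves glycine unchanged, whereas changing the middle base of the third codon (`CGT` to `CAT`) replaces arginine with histidine. Predicting whether such a change alters protein function, as the abstract mentions, requires additional tools beyond this classification.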
7	Computerised methods for selecting a small number of single nucleotide polymorphisms that enable bacterial strain discrimination Robertson, Gail Alexandra January 2006 The possibility of identifying single nucleotide polymorphisms (SNPs) that would be useful for rapid bacterial typing was investigated. Neisseria meningitidis was the organism chosen for modelling the approach since informative SNPs could be found amongst the sequence data available for multi-locus sequence typing (MLST) at http://www.mlst.net. The hypothesis tested was that a small number of SNPs located within the seven gene fragments sequenced for MLST provide information equivalent to MLST. Preliminary investigations revealed that a small number of SNPs could discriminate sequence types (STs) of clinical interest to a high degree. Laboratory procedures demonstrated that SNP fingerprinting of N. meningitidis isolates is achievable. Further tests showed that laboratory identification of a defining SNP in the genome of isolates was a practical method of obtaining relevant typing information. Identification of the most discriminating SNPs amongst the ever-increasing amount of MLST sequence data called for computer-based assistance. Two methods of SNP selection devised by the author of this thesis were translated into computer-based algorithms by contributing team members. Software for two computer programs was produced. The algorithms facilitate the optimal selection of SNPs useful for (1) distinguishing specific STs and (2) differentiating non-specific STs. Current input information can be obtained from the MLST database, and consequently the programs can be applied to any bacterial species for which MLST data have been entered. The two algorithms for the selection of SNPs were designed to serve contrasting purposes. The first of these was to determine the ST identity of isolates from an outbreak of disease.
In this case, isolates would be tested for their membership to any of the STs known to be associated with disease. It was shown that one SNP per ST could distinguish each of four hyperinvasive STs of N. meningitidis from between 92.5% and 97.5% of all other STs. With two SNPs per ST, between 96.7% and 99.0% discrimination is achieved. The SNPs were selected from MLST loci with the assistance of the first algorithm which scores SNPs according to the number of base mismatches in a sequence alignment between an allele of an ST of interest and alleles belonging to all other STs at a specified locus. The second purpose was to determine whether or not isolates from different sources belong to the same ST, regardless of their actual ST identity. It was shown that with seven SNPs, four sample STs of N. meningitidis could, on average, be discriminated from 97.1% of all other STs. The SNPs were selected with the aid of the second algorithm which scores SNPs at MLST loci for the relative frequency of each nucleotide base in a sequence alignment as a measure of the extent of their polymorphism. A third algorithm for selecting SNPs has been discussed. By altering the method of scoring SNPs, it is possible to overcome the limitations inherent in the two algorithms that were utilised for finding SNPs. In addition, the third approach caters for finding SNPs that distinguish members of a complex from non-members. bacterial typing single nucleotide polymorphism (SNP) multilocus sequence typing (MLST) sequence type (ST) neisseria meningitidis discrimination Simpsons index of diversity
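The two scoring ideas described in the record above can be sketched as follows. This is an illustration under stated assumptions, not the thesis software: it works on a single MLST locus with one aligned allele per ST, the function names are hypothetical, and the diversity score uses the form 1 - sum(p^2) of Simpson's index, which may differ from the exact formula used in the thesis.

```python
from collections import Counter


def mismatch_scores(target_allele, other_alleles):
    """First idea: for each alignment position, the fraction of other
    alleles whose base differs from the target allele's base there."""
    if not other_alleles:
        raise ValueError("need at least one other allele")
    if any(len(a) != len(target_allele) for a in other_alleles):
        raise ValueError("alleles must be aligned to the same length")
    n = len(other_alleles)
    return [
        sum(a[i] != base for a in other_alleles) / n
        for i, base in enumerate(target_allele)
    ]


def diversity_scores(alleles):
    """Second idea: for each alignment position, 1 - sum(p_b ** 2) over the
    relative frequencies p_b of the bases observed at that position."""
    if not alleles:
        raise ValueError("need at least one allele")
    length = len(alleles[0])
    if any(len(a) != length for a in alleles):
        raise ValueError("alleles must be aligned to the same length")
    n = len(alleles)
    scores = []
    for i in range(length):
        counts = Counter(a[i] for a in alleles)
        scores.append(1.0 - sum((c / n) ** 2 for c in counts.values()))
    return scores


def best_position(scores):
    """Index of the highest-scoring position (first one on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)
```

A position with a high mismatch score separates the ST of interest from many others, which suits the outbreak question; a position with a high diversity score splits the whole collection most evenly, which suits asking whether two isolates share an ST. Choosing a second and later SNPs would need to account for the STs already excluded by the first.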
8	Stock improvement of giant freshwater prawn (Macrobrachium rosenbergii) in Vietnam: Experimental evaluations of crossbreeding, the impact of domestication on genetic diversity and candidate genes Thanh Nguyen Unknown Date Aquaculture plays an important role in economic development and food security in many countries in the world. World aquaculture production in 2006 was 51.7 million tonnes with an estimated value of US$ 78.8 billion (FAO, 2009). World production will, however, need to increase by 30-40 million tonnes from its current level by 2030 to meet growing global demand for fish. In this context, aquaculture in Vietnam has developed rapidly over the past decade, and the fisheries sector ranked fourth in terms of export value in 2008 (Vietnamnet, 2008). Total fisheries production in Vietnam in 2007 was 4.149 million tonnes, of which fisheries production from catch and aquaculture were 2.064 and 2.085 million tonnes, respectively. A variety of aquatic species are cultured in Vietnam, but shrimps (mainly Black Tiger shrimp Penaeus monodon, and Pacific white shrimp Litopenaeus vannamei) and ‘tra’ or ‘basa’ catfish are the most common species used in aquaculture. The giant freshwater prawn (GFP), Macrobrachium rosenbergii, is one of the most important crustacean species in inland aquaculture in many countries across the world where this species is either native or exotic. GFP is suitable for culture in a variety of farming systems, including monoculture or polyculture in ponds, pens, and integrated or rotational rice-prawn culture models. The GFP industry worldwide relies totally on wild or unimproved stocks, a practice that threatens the long-term sustainability of GFP farming due to low productivity and vulnerability of farmed stocks to disease. The current status of GFP aquaculture highlights the need for initiation of a systematic stock improvement program for the species to improve economically important traits.
Large-scale selective breeding programs have been instigated for some finfish, salmonids and GIFT tilapia for example, and some selective breeding trials have been conducted on crustacean species, namely marine penaeid shrimp and freshwater crayfish. Examples of selective breeding programs on aquatic species have demonstrated that significant genetic gains can be achieved for growth rates with gains of around 10-20% per generation. While a selective breeding program is an option for GFP stock improvement, an alternative approach to improving GFP productivity, potentially with more immediate effect and one that is less expensive, is crossbreeding which may produce heterosis or hybrid vigour in crossbred offspring. Therefore, a crossbreeding strategy was trialed in the current study as a starting point for a stock improvement program for the GFP industry in Vietnam. The current study assessed the growth performance of three GFP strains (two wild Vietnamese strains from the Dong Nai and Mekong rivers, and a single domesticated Hawaiian strain) and their reciprocal crosses in a complete 3x3 diallel cross, i.e. three purebred and six crossbred strains. The diallel cross was carried out over two consecutive generations (G1 and G2). Juveniles for the experiments were produced using single-pair matings. Juveniles from each strain combination were stocked into three replicate hapas for 15 weeks. Growth data (body weight, carapace length, standard length) from the G1 and G2 were pooled for all subsequent analyses as there was no effect of generation on growth traits. Results showed that the Hawaiian strain performed best among purebred strains, and crosses with the Dong Nai or Mekong strains as dams and the Hawaiian strain as sires grew significantly faster than did the purebred Dong Nai or Mekong strains. These results suggest potential for heterosis among some crosses. 
Growth data were analyzed in depth by partitioning the strain combination (cross) effect into three components: strain additive genetic effects, heterotic effects, and strain reciprocal effects. Strain additive genetic and reciprocal effects were significant sources of variation for all growth traits measured. Strain additive genetic effects were highest for the Hawaiian strain and lowest for the Mekong strain for all growth traits. Reciprocal effects negatively influenced the growth rate of crosses with the Hawaiian (H) strain as dams and the Dong Nai (D) or Mekong (M) strains as sires compared with their reciprocal crosses (DH and MH). Heterotic effects for all growth traits were small and not significantly different from zero (P > 0.05). These results indicate that a crossbreeding approach based on the strains evaluated here provides only limited potential for improving growth rates based simply on heterotic outcomes and that a likely more productive option would be to trial artificial selection on a diverse synthetic stock. The current study also employed genetic markers (microsatellites) to characterize levels and patterns of genetic diversity in three purebred strains of GFP that originated from the diallel cross above. All three purebred strains showed relatively high levels of genetic diversity in terms of allele number and individual heterozygosity across the six marker loci screened. Levels of genetic diversity present in the three purebred strains combined into a single stock were compared with that from a combination of three wild river stocks to assess the impact of domestication on genetic diversity of a ‘synthetic’ population. Results demonstrated that there was no significant loss of genetic diversity in the three purebred strains combined compared with a reference set containing the three wild populations.
Therefore, a synthetic population formed from these purebred strains successfully captured the majority of genetic variation present in the wild broodstock. This synthetic population provides a potential stock for a future selective breeding program for GFP in Vietnam. The current study was also the first attempt to identify single nucleotide polymorphisms (SNPs) in key growth genes in GFP. Two key candidate genes were targeted, actin and crustacean hyperglycemic hormone (CHH), that are potentially linked to growth performance in GFP. The study screened SNPs in GFP females only, because growth performance of GFP males is influenced strongly by social rank. The study identified four SNPs in intron 3 of the CHH gene that were significantly correlated with individual body weight at harvest, while no SNPs detected in the actin gene were associated with growth traits in GFP. This finding however, needs to be confirmed using larger sample sizes and other GFP lines. The current study has produced important basic knowledge relevant to implementation of an effective stock improvement program for GFP in Vietnam. Results indicate that a selective breeding strategy rather than a crossbreeding approach is likely to be the best strategy for improving GFP culture stocks in Vietnam. In addition, the study demonstrates that application of modern molecular genetic technologies can be efficient in developing a genetically diverse, synthetic population for stock improvement and for identifying potential markers correlated with important commercial traits in GFP. Integration of DNA techniques with traditional breeding practices can facilitate GFP stock improvement in Vietnam and accelerate the industry development when improved lines are available. Some limitations of the current study and recommendations for further work are discussed.
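The heterotic and reciprocal effects discussed in the record above can be illustrated with simple descriptive quantities computed from cross means. This is a minimal sketch, not the statistical model fitted in the thesis: mid-parent heterosis and the reciprocal difference are standard descriptive measures, and the function name and data layout are assumptions.

```python
def mid_parent_heterosis(cross_means, strain_a, strain_b):
    """Describe one pair of strains in a diallel cross.

    cross_means maps (dam, sire) to the mean trait value of that cross,
    e.g. mean body weight. Returns (heterosis_pct, reciprocal_difference):
    heterosis_pct is the percentage by which the mean of the two reciprocal
    crosses exceeds the mean of the two purebred strains, and
    reciprocal_difference is mean(dam a x sire b) - mean(dam b x sire a).
    """
    parent_mean = (
        cross_means[(strain_a, strain_a)] + cross_means[(strain_b, strain_b)]
    ) / 2
    ab = cross_means[(strain_a, strain_b)]
    ba = cross_means[(strain_b, strain_a)]
    hybrid_mean = (ab + ba) / 2
    heterosis_pct = 100.0 * (hybrid_mean - parent_mean) / parent_mean
    return heterosis_pct, ab - ba
```

With made-up means of 10.0 and 14.0 for two purebred strains and 13.0 and 11.0 for their reciprocal crosses, the function returns 0.0% heterosis and a reciprocal difference of 2.0: one cross direction outperforms the other even though the crosses on average match the mid-parent value, which is the qualitative pattern the abstract reports. These numbers are invented for illustration and are not results from the study.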
9	Genetic Sequence Analysis by Microarray Technology Hultin, Emilie January 2007 Developments within the field of genetic analysis have been enormous during the last decade. Advances in DNA sequencing technology have increased throughput from a thousand bases to over a billion bases in a day and decreased the cost per base a thousandfold. Nevertheless, sequencing complex genomes like the human genome is still very expensive, and researchers and companies are working to attain even higher throughput for less money. Genotyping systems for single nucleotide polymorphism (SNP) analysis with whole genome coverage have also been developed, with low cost per SNP. There is, however, a need for genotyping assays that are more cost efficient per sample with considerably higher accuracy. This thesis focuses on a technology for genetic analysis based on competitive allele-specific extension and microarray detection. To increase specificity in allele-specific extension (ASE), a nucleotide-degrading enzyme, apyrase, was introduced to compete with the polymerase, only allowing the fast, perfectly matched primer extension to occur. The aim was to develop a method for analysis of around twenty loci in hundreds of samples in a high-throughput microarray format. A genotyping method for human papillomavirus has been developed, based on a combination of multiplex competitive hybridization (MUCH) and apyrase-mediated allele-specific extension (AMASE). Human papillomavirus (HPV), which is the causative agent in cervical cancer, exists in over a hundred different types. These types need to be determined in clinical samples. The developed assay can detect the twenty-three most common high-risk types, as well as semi-quantify multiple infections, which was demonstrated by analysis of ninety-two HPV-positive clinical samples. More stringent conditions can be obtained by increased reaction temperature.
To further improve the genotyping assay, a thermostable enzyme, protease, was introduced into the allele-specific extension reaction, denoted PrASE. Increased sensitivity was achieved with an automated magnetic system that facilitates washing. The PrASE genotyping of thirteen SNPs yielded higher conversion rates, as well as more robust genotype scoring, compared to ASE. Furthermore, a comparison with pyrosequencing, where 99.8 % of the 4,420 analyzed genotypes were in concordance, indicates high accuracy and robustness of the PrASE technology. Single cells have also been analyzed by the PrASE assay to investigate loss of alleles during skin differentiation. Single cell analysis is very demanding due to the limited amounts of DNA. The multiplex PCR and the PrASE assay were optimized for single cell analysis. Twenty-four SNPs were genotyped and an increased loss of genetic material was seen in cells from the more differentiated suprabasal layers compared to the basal layer. / QC 20100714 Genotyping single nucleotide polymorphism (SNP) microarray tag-array competitive hybridization human papillomavirus (HPV) single cell loss of alleles differentiation epidermis. Bioengineering Bioteknik
10	Bayesian Model Selection for High-dimensional High-throughput Data Joshi, Adarsh 2010 May 1900 Bayesian methods are often criticized on the grounds of subjectivity. Furthermore, misspecified priors can have a deleterious effect on Bayesian inference. Noting that model selection is effectively a test of many hypotheses, Dr. Valen E. Johnson sought to eliminate the need for prior specification by computing Bayes' factors from frequentist test statistics. In his pioneering work published in 2005, Dr. Johnson proposed using so-called local priors for computing Bayes' factors from test statistics. Dr. Johnson and Dr. Jianhua Hu used Bayes' factors for model selection in a linear model setting. In independent work, Dr. Johnson and another colleague, David Rossell, investigated two families of non-local priors for testing the regression parameter in a linear model setting. These non-local priors enable greater separation between the theories of null and alternative hypotheses. In this dissertation, I extend model selection based on Bayes' factors and use non-local priors to define Bayes' factors based on test statistics. With these priors, I have been able to reduce the problem of prior specification to setting just one scaling parameter. That scaling parameter can be easily set, for example, on the basis of frequentist operating characteristics of the corresponding Bayes' factors. Furthermore, the loss of information from basing a Bayes' factor on a test statistic is minimal. Along with Dr. Johnson and Dr. Hu, I used the Bayes' factors based on the likelihood ratio statistic to develop a method for clustering gene expression data. This method has performed well in both simulated examples and real datasets. An outline of that work is also included in this dissertation.
Further, I extend the clustering model to a subclass of the decomposable graphical model class, which is more appropriate for genotype data sets, such as single-nucleotide polymorphism (SNP) data. Efficient FORTRAN programming has enabled me to apply the methodology to hundreds of nodes. For problems that produce computationally harder probability landscapes, I propose a modification of the Markov chain Monte Carlo algorithm to extract information regarding the important network structures in the data. This modified algorithm performs well in inferring complex network structures. I use this method to develop a prediction model for disease based on SNP data. My method performs well in cross-validation studies. Bayes factors Bayes factors based on test statistics Bayesian Graphs MCMC Objective Bayesian Analysis Bayesian Model Selection Microarray data
