11 |
Diversité des composés terpéniques volatils au sein du genre Lavandula : aspects évolutifs et physiologiques
Guitton, Yann, 21 December 2010
Lavender production contributes to the reputation of the Rhône-Alpes region. Applications of lavender essential oil (EO) rely on the cultivation of three species (L. angustifolia, L. latifolia and L. stoechas) and one hybrid (L. x intermedia) with marked chemotypes. The genus Lavandula is an ideal model for understanding the structure and origin of the diversity of volatile organic compounds (VOCs), particularly terpenes. Lavenders have the advantage of a wide distribution range spanning different bioclimatic regions and a limited number of species (39) with varied morphological and ecological characteristics. To characterize the diversity of the VOCs accumulated in the species of the genus and to consider their evolution, we analyzed by GC-MS the VOCs of 29 species (some for the first time). As is often the case in plants, VOC production in lavender inflorescences is subject to spatio-temporal regulation. The differential emission of VOCs over time in L. angustifolia has been noted by farmers, who observed that EO quality differs with the maturity of the inflorescences at harvest. To model these variations and correlate them with stages of plant development, we analyzed the temporal variations of the main VOCs in leaves and inflorescences, over several years and cultivars, at the chemical (GC-FID) and molecular (qPCR) levels. Upstream of this research on the VOCs of the genus Lavandula, several bioinformatics tools were developed, in particular the "MSeasy" module, which automates the retrieval of GC-MS data. This is a prerequisite for using lavender as a model for studying VOCs in the Lamiaceae.
|
12 |
Diversité des composés terpéniques volatils au sein du genre Lavandula : aspects évolutifs et physiologiques / Diversity of the volatile terpenic compounds within the genus Lavandula : evolutary and physiological aspects
Guitton, Yann, 21 December 2010
Lavender production is of significant importance for the international visibility of the French Rhône-Alpes region. Uses of lavender essential oil (EO) are based on the cultivation of three species (L. angustifolia, L. latifolia and L. stoechas) and one hybrid (L. x intermedia) with marked chemotypes. The genus Lavandula is an ideal model for understanding the structure and origin of the diversity of volatile organic compounds (VOCs), especially terpenes. Lavenders have the advantage of a wide distribution range covering different bioclimatic regions and a limited number of species (39) with diverse morphological and ecological characteristics. To characterize the diversity of the VOCs accumulated in the genus and consider their evolution, we analyzed by GC-MS the VOCs accumulated by 29 species (some for the first time). As is often the case in plants, the production of VOCs in lavender inflorescences is subject to spatial and temporal regulation. The differential emission of VOCs over time in L. angustifolia is well known to farmers, who have observed that EO quality differs depending on the maturity of the inflorescences at harvest. To model these variations and correlate them with stages of plant development, we analyzed the temporal variations of the main VOCs in leaves and inflorescences (several years and cultivars) at the chemical level (GC-FID) and the molecular level (qPCR). Upstream of this research on the genus Lavandula, several bioinformatics tools were developed, in particular the "MSeasy" module, which automates GC-MS data retrieval. This is a prerequisite for using lavender in the future as a model for studying VOCs in the Lamiaceae.
|
13 |
Computation of High-Dimensional Multivariate Normal and Student-t Probabilities Based on Matrix Compression Schemes
Cao, Jian, 22 April 2020
The first half of the thesis focuses on the computation of high-dimensional multivariate normal (MVN) and multivariate Student-t (MVT) probabilities. Chapter 2 generalizes the bivariate conditioning method to a d-dimensional conditioning method and combines it with a hierarchical representation of the n × n covariance matrix. The resulting two-level hierarchical-block conditioning method requires Monte Carlo simulations to be performed only in d dimensions, with d ≪ n, and allows the dominant complexity term of the algorithm to be O(n log n). Chapter 3 improves the block reordering scheme from Chapter 2 and integrates it into quasi-Monte Carlo simulation under the tile-low-rank representation of the covariance matrix. Simulations up to dimension 65,536 suggest that this method can improve the run time by one order of magnitude compared with the hierarchical Monte Carlo method. The second half of the thesis discusses a novel matrix compression scheme with Kronecker products, an R package that implements the methods described in Chapter 3, and an application study with the probit Gaussian random field. Chapter 4 studies the potential of using the sum of Kronecker products (SKP) as a compressed covariance matrix representation. Experiments show that this new SKP representation can reduce the memory footprint by one order of magnitude compared with the hierarchical representation for covariance matrices from large grids, and that the Cholesky factorization in one million dimensions can be achieved within 600 seconds. Chapter 5 introduces an R package that implements the methods in Chapter 3 and shows how the package improves the accuracy of the computed excursion sets. Chapter 6 derives the posterior properties of the probit Gaussian random field, based on which model selection and posterior prediction are performed. With the tlrmvnmvt package, the computation becomes feasible in tens of thousands of dimensions, where the prediction errors are significantly reduced.
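To make the idea of a sum of Kronecker products (SKP) as a compressed representation concrete, here is a minimal Python sketch that reads single entries of the large matrix directly from the small factors, without ever forming it. The helper names and the toy factors are assumptions made for illustration; this is not the thesis code and not the tlrmvnmvt API.

```python
def skp_entry(factors, i, j):
    """Entry (i, j) of sum_k kron(A_k, B_k), computed from the factors only.

    `factors` is a list of (A, B) pairs given as nested lists; every A is
    m x m and every B is n x n. Hypothetical helper, for illustration only.
    """
    n = len(factors[0][1])  # side length of each B block
    # kron(A, B)[i][j] = A[i // n][j // n] * B[i % n][j % n]
    return sum(A[i // n][j // n] * B[i % n][j % n] for A, B in factors)


def storage_counts(m, n, k):
    """Numbers stored by the dense (m*n) x (m*n) matrix vs. k Kronecker pairs."""
    return (m * n) ** 2, k * (m * m + n * n)


# Toy example with two Kronecker terms (m = n = 2).
A1, B1 = [[1.0, 0.5], [0.5, 1.0]], [[2.0, 0.0], [0.0, 2.0]]
A2, B2 = [[0.1, 0.0], [0.0, 0.1]], [[1.0, 1.0], [1.0, 1.0]]
factors = [(A1, B1), (A2, B2)]

print(skp_entry(factors, 0, 2))
print(storage_counts(1000, 1000, 10))
```

The storage count shows where the saving comes from: the dense matrix grows with (m·n)², while the SKP form grows only with the number of terms times m² + n². How many terms are needed for a given accuracy is the question the thesis studies, and this sketch says nothing about it.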
|
14 |
The Linkage Disequilibrium LASSO for SNP Selection in Genetic Association Studies
Younkin, Samuel G., January 2011
No description available.
|
15 |
Statistical methods for analyzing sequencing data with applications in modern biomedical analysis and personalized medicine
Manimaran, Solaiappan, 13 March 2017
Sequencing technologies have advanced tremendously: the rate at which sequencing data can be generated has increased multifold while the cost of sequencing continues to fall. Sequencing data provide novel insights into the ecological environment of microbes as well as human health and disease status, but they challenge investigators with a variety of computational issues. This thesis focuses on three common problems in the analysis of high-throughput data. The goals of the first project are to (1) develop a statistical framework and a complete software pipeline for metagenomics that identifies microbes to the strain level and thus facilitates personalized drug treatment targeting the strain; and (2) estimate the relative content of microbes in a sample as accurately and as quickly as possible.
The second project focuses on the analysis of the microbiome variation across multiple samples. Studying the variation of microbiomes under different conditions within an organism or environment is the key to diagnosing diseases and providing personalized treatments. The goals are to (1) identify various statistical diversity measures; (2) develop confidence regions for the relative abundance estimates; (3) perform multi-dimensional and differential expression analysis; and (4) develop a complete pipeline for multi-sample microbiome analysis.
The third project focuses on batch effect analysis. When analyzing high-dimensional data, non-biological experimental variation, or "batch effects", confounds the true associations between the conditions of interest and the outcome variable. Batch effects persist even after normalization. Hence, unless they are identified and corrected, any downstream analysis is likely to be error prone and may lead to false positive results. The goals are to (1) analyze the effect of correlation in the batch-adjusted data and develop new techniques to account for correlation in a two-step hypothesis testing approach; and (2) develop a software pipeline that identifies whether batch effects are present in the data and adjusts for them in a suitable way.
In summary, we developed software pipelines called PathoScope, PathoStat and BatchQC as part of these projects and validated our techniques using simulation and real data sets.
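The second project mentions relative abundance estimates and statistical diversity measures. As a minimal illustration of what such quantities look like, the Python sketch below computes relative abundances and the Shannon diversity index from per-taxon read counts. The taxon names and counts are hypothetical, and this is a generic textbook measure, not the PathoStat implementation.

```python
import math


def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: c / total for taxon, c in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero abundance."""
    props = relative_abundance(counts)
    return -sum(p * math.log(p) for p in props.values() if p > 0)


# Hypothetical read counts for two samples.
even_sample = {"taxon_a": 50, "taxon_b": 50}
skewed_sample = {"taxon_a": 99, "taxon_b": 1}

print(shannon_index(even_sample))    # equals ln(2) for two equally abundant taxa
print(shannon_index(skewed_sample))  # lower, because one taxon dominates
```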
|
16 |
Expression and possible functions of circular RNAs
Glazar, Petar, 08 June 2020
Circular RNAs (circRNAs) are a large class of endogenous RNAs present in organisms that process RNA transcripts by splicing. They are products of backsplicing, a type of alternative splicing reaction in which the 3’ end of an exon is spliced to an upstream 5’ splice site. Despite their abundance and their tissue- and developmental-stage-specific expression patterns, their in vivo functions are largely unknown.
We systematized the existing knowledge on circRNAs and made it freely available by developing circBase - an online database where circRNA datasets can be accessed, downloaded and browsed within the genomic context. Another technical challenge was addressed by developing ciRcus - a software package for working with high-throughput circRNA data, which allowed us to routinely handle, explore, annotate, quantify and integrate circRNA data with external sources of biological data. To learn more about circRNA expression and potential functions, we explored the expression patterns of circRNAs in the mammalian brain. Using our own and public RNA-seq data, we discovered thousands of neural circRNAs in human and mouse. circRNAs were upregulated during neuronal differentiation and maturation, enriched in synapses, and often differentially expressed compared to their host mRNAs. Many circRNAs were conserved between human and mouse. Finally, we explored in vivo functions of Cdr1as - a conserved circRNA known to be highly expressed in the brain, heavily bound by microRNA (miRNA) effector complexes, and harbouring many binding sites for miR-7, as well as a single binding site for miR-671. Upon deletion of the Cdr1as locus, knockout animals displayed impaired sensorimotor gating and dysfunctional synaptic transmission. Expression of miR-7 and miR-671 was deregulated in different brain regions of Cdr1as knockout animals. Expression of immediate early genes, some of which are miR-7 targets, was increased, providing a possible molecular link to the behavioral phenotype.
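The backsplicing definition above can be expressed as a simple coordinate check: a junction is a backsplice when the end of an exon is joined to a start that lies upstream of it in transcript order. The Python sketch below encodes only that rule. The function name, argument names and coordinates are hypothetical, and this is not how circBase or ciRcus detect circRNAs.

```python
def classify_junction(exon_end, target_start, strand):
    """Label a splice junction as 'linear' or 'backsplice'.

    exon_end     -- genomic coordinate of the 3' end of the exon being joined
    target_start -- genomic coordinate of the exon start it is joined to
    strand       -- '+' or '-'; on '-' the transcript runs toward lower coordinates
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if exon_end == target_start:
        raise ValueError("junction ends must differ")
    if strand == "+":
        joins_downstream = target_start > exon_end
    else:
        joins_downstream = target_start < exon_end
    return "linear" if joins_downstream else "backsplice"


# Hypothetical coordinates on the plus strand.
print(classify_junction(1000, 2000, "+"))  # exon end joined to a later exon start
print(classify_junction(2000, 1000, "+"))  # exon end joined back to an earlier start
```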
|
17 |
Similarity Measures for Nominal Data in Hierarchical Clustering / Míry podobnosti pro nominální data v hierarchickém shlukování
Šulc, Zdeněk, January 2013
This dissertation deals with similarity measures for nominal data in hierarchical clustering that can cope with variables with more than two categories and that aspire to replace the simple matching approach commonly used in this area. These similarity measures take into account additional characteristics of a dataset, such as the frequency distribution of categories or the number of categories of a given variable. The thesis has three main aims. The first is to examine selected similarity measures for nominal data and evaluate their clustering performance in hierarchical clustering of objects and variables. To achieve this goal, four experiments covering both object and variable clustering were performed. They compare the clustering quality of the examined similarity measures with that of the commonly used similarity measures based on a binary transformation and with several alternative methods for nominal data clustering. The comparison and evaluation are performed on real and generated datasets. The results of these experiments show which similarity measures can be used generally, which perform well in particular situations, and which are not recommended for object or variable clustering. The second aim is to propose a theory-based similarity measure, evaluate its properties, and compare it with the other examined similarity measures. To this end, two novel similarity measures, Variable Entropy and Variable Mutability, are proposed; the former in particular performs very well on datasets with a lower number of variables. The third aim is to provide a convenient software implementation based on the examined similarity measures for nominal data, covering the whole clustering process from computation of a proximity matrix to evaluation of the resulting clusters. This goal was achieved by creating the nomclust package for R, which is freely available.
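For reference, the simple matching approach that the examined measures aim to replace can be sketched in a few lines of Python. It counts only whether two objects agree on each variable and ignores category frequencies. The function names and toy objects are assumptions for illustration; this is not the nomclust API, and the thesis's Variable Entropy and Variable Mutability measures are not reproduced here.

```python
def simple_matching(x, y):
    """Share of nominal variables on which two objects take the same category."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of variables")
    return sum(a == b for a, b in zip(x, y)) / len(x)


def dissimilarity_matrix(objects):
    """Proximity matrix of 1 - simple matching, as input for hierarchical clustering."""
    return [[1.0 - simple_matching(a, b) for b in objects] for a in objects]


# Three hypothetical objects described by three nominal variables.
objects = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]

for row in dissimilarity_matrix(objects):
    print(row)
```

Because every match counts the same, agreeing on a rare category is worth no more than agreeing on a common one, which is the kind of information the frequency-aware measures described above try to use.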
|
18 |
High Dimensional Financial Engineering: Dependence Modeling and Sequential Surveillance
Xu, Yafei, 07 February 2018
This dissertation focuses on high-dimensional financial engineering, in particular dependence modeling and sequential surveillance.
On dependence modeling, it presents an introduction to high-dimensional copulas that concentrates on state-of-the-art copula research.
A more complex application of high-dimensional copulas in financial engineering concerns the pricing of portfolio-like credit derivatives, i.e. credit default swap index (CDX) tranches. In this part, a convex combination of copulas is proposed for CDX tranche pricing, with components from the elliptical copula family (Gaussian and Student-t), the Archimedean copula family (Frank, Gumbel, Clayton and Joe) and the hierarchical Archimedean copula family used in some publications.
On financial surveillance, the chapter focuses on the monitoring of high-dimensional portfolios (in 5, 29 and 90 dimensions) by developing a nonparametric multivariate statistical process control chart, the energy test based control chart (ETCC).
To support further research on and practice with the nonparametric multivariate statistical process control chart devised in this dissertation, an R package, "EnergyOnlineCPM", was developed. The package has been accepted and published on the Comprehensive R Archive Network (CRAN) and is the first package that can monitor shifts in mean and covariance jointly online.
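Energy tests are built on the energy distance between two samples, 2·E|X − Y| − E|X − X'| − E|Y − Y'|, which is zero when both samples are identical and grows as their distributions move apart. The Python sketch below computes the plain sample version of that statistic. The function names and toy data are assumptions; this is neither the ETCC procedure from the dissertation nor the EnergyOnlineCPM API.

```python
import math


def mean_pairwise_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y) with x in xs, y in ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Sample energy distance: 2*E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (
        2.0 * mean_pairwise_distance(xs, ys)
        - mean_pairwise_distance(xs, xs)
        - mean_pairwise_distance(ys, ys)
    )


# Hypothetical two-dimensional observations: a reference window and a shifted one.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(x + 3.0, y) for x, y in reference]

print(energy_distance(reference, reference))
print(energy_distance(reference, shifted))
```

A monitoring scheme would compare such a statistic for incoming observations against a threshold calibrated on in-control data; how that threshold is chosen is left out of this sketch.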
|