1 |
Bayesian Analysis of Spatial Point Patterns
Leininger, Thomas Jeffrey, January 2014
We explore the posterior inference available for Bayesian spatial point process models. In the literature, discussion of such models is usually focused on model fitting and rejecting complete spatial randomness, with model diagnostics and posterior inference often left as an afterthought. Posterior predictive point patterns are shown to be useful in performing model diagnostics and model selection, as well as providing a wide array of posterior model summaries. We prescribe Bayesian residuals and methods for cross-validation and model selection for Poisson processes, log-Gaussian Cox processes, Gibbs processes, and cluster processes. These novel approaches are demonstrated using existing datasets and simulation studies. / Dissertation
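For readers unfamiliar with the term, the log-Gaussian Cox process named in this abstract is usually defined as below. This is the standard textbook definition, not a formula taken from the dissertation; the symbols (Z, Λ, μ, C, N, B) are notation assumed here for illustration.

```latex
% Log-Gaussian Cox process on a region D (standard definition, assumed notation)
\begin{align*}
  Z &\sim \mathcal{GP}\bigl(\mu(\cdot),\, C(\cdot,\cdot)\bigr)
      && \text{latent Gaussian random field on } D, \\
  \Lambda(s) &= \exp\{Z(s)\}
      && \text{random intensity surface}, \\
  N(B) \mid \Lambda &\sim \operatorname{Poisson}\!\left(\int_B \Lambda(s)\, ds\right)
      && \text{for each bounded } B \subseteq D, \\
  \mathbb{E}\,\Lambda(s) &= \exp\!\left\{\mu(s) + \tfrac{1}{2}\, C(s,s)\right\}
      && \text{marginal intensity}.
\end{align*}
```

Conditional on Λ the pattern is an inhomogeneous Poisson process, so the Poisson process is the special case in which the covariance C is identically zero.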
|
2 |
Modely kótovaných bodových procesů / Models of marked point processes
Héda, Ivan, January 2016
Title: Models of Marked Point Processes
Author: Ivan Héda
Department: Department of Probability and Mathematical Statistics
Supervisor: doc. RNDr. Zbyněk Pawlas, Ph.D.
Abstract: In the first part of the thesis, we present the necessary theoretical background and define the functional characteristics used to examine marked point patterns. The second part reviews some known marking strategies. The core of the thesis lies in the study of intensity-marked point processes. A general formula for the characteristics is proven for this marking strategy, and a general class of models with analytically computable characteristics is introduced. This class generalizes some known models. The theoretical results are applied to real data in the last part of the thesis.
Keywords: marked point process, marked log-Gaussian Cox process, intensity-marked point process
|
3 |
Relações entre fatores ambientais e espécies florestais por metodologias de processos pontuais / Relationship between environmental factors and forest species using points process methodologies
Frade, Djair Durand Ramalho, 31 January 2014
The spatial pattern of species in native forests may provide evidence about the structure of the plant community. Environmental factors, such as soil characteristics, and density-dependent processes, such as intraspecific and interspecific competition, may influence the spatial pattern of species. Studying the relationship between environmental characteristics and the spatial pattern of forest species may therefore help in understanding forest dynamics. The goal of this study was to apply point process techniques to assess the effect of environmental factors on the occurrence of forest species. The study area was the Assis Ecological Station (Estação Ecológica de Assis, AES), a conservation unit of the state of São Paulo, in permanent plots established under the project "Diversity, dynamics and conservation in forests of São Paulo state: 40 ha of permanent plots" of FAPESP's Biota program. The spatial pattern of the most abundant species in the study area was described using Ripley's K function and its extensions for inhomogeneous processes, based on the geographic coordinates of individuals with a circumference at breast height of 15 cm or more. Homogeneous Poisson process, inhomogeneous Poisson process, and log-Gaussian Cox process models were fitted to each species. The Akaike information criterion (AIC) was used to select the model that best fits the data. Model diagnostics were performed using the inhomogeneous K function under the hypothesis of complete spatial randomness.
The results indicated that the most abundant species in the AES show an aggregated distribution pattern, i.e., the expected number of individuals near any given event is larger than expected under a random distribution. As expected, environmental factors played a major role in explaining the spatial distribution of the species. However, the results indicated that there is spatially structured variation that was not included in the analysis and that is essential for a good model fit. The results therefore suggest that factors not included in the models and the available data may be shaping the spatial patterns beyond the measured covariates, and further studies are needed to identify them.
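The notion of aggregation used in this abstract can be illustrated with a small numerical sketch. The code below is not the thesis's analysis: it is a naive Ripley's K estimator without edge correction, applied to synthetic points on a unit square, and the function name `ripley_k`, the sample sizes, and the cluster spread are all assumptions chosen for illustration. Under complete spatial randomness K(r) is close to πr², while a clustered pattern gives larger values.

```python
import numpy as np


def ripley_k(points, radii, area):
    """Naive Ripley's K estimate (no edge correction) for a planar pattern."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # All pairwise Euclidean distances between points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A point is not its own neighbour, so exclude self-pairs.
    np.fill_diagonal(dist, np.inf)
    radii = np.asarray(radii, dtype=float)
    # Number of ordered pairs closer than each radius.
    counts = (dist[None, :, :] <= radii[:, None, None]).sum(axis=(1, 2))
    return area * counts / (n * (n - 1))


rng = np.random.default_rng(0)
n_points = 400

# Pattern 1: complete spatial randomness on the unit square.
csr = rng.uniform(0.0, 1.0, size=(n_points, 2))

# Pattern 2: clustered pattern (offspring scattered around 20 random parents),
# clipped so that every point stays inside the unit square.
parents = rng.uniform(0.0, 1.0, size=(20, 2))
offspring = parents[rng.integers(0, 20, size=n_points)]
clustered = np.clip(offspring + rng.normal(0.0, 0.02, size=(n_points, 2)), 0.0, 1.0)

radii = np.array([0.02, 0.05, 0.10])
k_csr = ripley_k(csr, radii, area=1.0)
k_clustered = ripley_k(clustered, radii, area=1.0)

print("radius   pi*r^2   K (CSR)   K (clustered)")
for r, a, b in zip(radii, k_csr, k_clustered):
    print(f"{r:6.2f}  {np.pi * r * r:7.4f}  {a:8.4f}  {b:14.4f}")
```

Because no edge correction is applied, the estimate is biased downward at larger radii; a real analysis would use an edge-corrected estimator and, for inhomogeneous patterns, the inhomogeneous K function mentioned in the abstract.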
|
5 |
Statistical methods for variant discovery and functional genomic analysis using next-generation sequencing data
Tang, Man, 03 January 2020
The development of high-throughput next-generation sequencing (NGS) techniques produces massive amounts of data, allowing the identification of biomarkers in early disease diagnosis and driving the transformation of most disciplines in biology and medicine. Greater effort is needed in developing novel, powerful, and efficient tools for NGS data analysis. This dissertation focuses on modeling "omics" data in various NGS applications, with the primary goal of developing novel statistical methods to identify sequence variants, find transcription factor (TF) binding patterns, and decode the relationship between TF binding and gene expression levels. Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in NGS applications. Existing methods for calling these variants often make the simplifying assumption of positional independence and fail to leverage the dependence of genotypes at nearby loci induced by linkage disequilibrium. We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. Simulation experiments show that, under various sequencing depths, vi-HMM outperforms existing methods in terms of sensitivity and F1 score. When applied to human whole-genome sequencing data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. One important NGS application is chromatin immunoprecipitation followed by sequencing (ChIP-seq), which characterizes protein-DNA relations through genome-wide mapping of TF binding sites. Multiple TFs binding to DNA sequences often show complex binding patterns, which indicate how TFs with similar functionalities work together to regulate the expression of target genes. To help uncover the transcriptional regulation mechanism, we propose a novel nonparametric Bayesian method to detect the clustering pattern of multiple-TF bindings from ChIP-seq datasets.
A simulation study demonstrates that our method performs best with regard to precision, recall, and F1 score in comparison with traditional methods. We also apply the method to real data and observe several TF clusters that have been recognized previously in mouse embryonic stem cells. Recent advances in ChIP-seq and RNA sequencing (RNA-Seq) technologies provide more reliable and accurate characterization of TF binding sites and gene expression measurements, which serve as a basis for studying the regulatory functions of TFs on gene expression. We propose a log-Gaussian Cox process with a wavelet-based functional model to quantify the relationship between TF binding site locations and gene expression levels. Through a simulation study, we demonstrate that our method performs well, especially with large sample sizes and small variance. It also shows a remarkable ability to distinguish real local features in the function estimates. / Doctor of Philosophy / The development of high-throughput next-generation sequencing (NGS) techniques produces massive amounts of data and brings about innovations in biology and medicine. Greater effort is needed in developing novel, powerful, and efficient tools for NGS data analysis. In this dissertation, we mainly focus on three problems closely related to NGS and its applications: (1) how to improve variant calling accuracy, (2) how to model transcription factor (TF) binding patterns, and (3) how to quantify the contribution of TF binding to gene expression. We develop novel statistical methods to identify sequence variants, find TF binding patterns, and explore the relationship between TF binding and gene expression. We expect our findings will be helpful in promoting a better understanding of disease causality and facilitating the design of personalized treatments.
|
6 |
Heterogeneous Sensor Data based Online Quality Assurance for Advanced Manufacturing using Spatiotemporal Modeling
Liu, Jia, 21 August 2017
Online quality assurance is crucial for elevating product quality and boosting process productivity in advanced manufacturing. However, the inherent complexity of advanced manufacturing, including nonlinear process dynamics, multiple process attributes, and low signal/noise ratio, poses severe challenges for both maintaining stable process operations and establishing efficacious online quality assurance schemes.
To address these challenges, four different advanced manufacturing processes, namely fused filament fabrication (FFF), binder jetting, chemical mechanical planarization (CMP), and the slicing process in wafer production, are investigated in this dissertation for applications of online quality assurance, using various sensors such as thermocouples, infrared temperature sensors, and accelerometers. The overarching goal of this dissertation is to develop innovative integrated methodologies that are tailored to these individual manufacturing processes but address their common challenges, in order to achieve satisfactory performance in online quality assurance based on heterogeneous sensor data. Specifically, three new methodologies are created and validated using actual sensor data, namely,
(1) Real-time process monitoring methods using a Dirichlet process (DP) mixture model for timely detection of process changes and identification of different process states for FFF and CMP. The proposed methodology is capable of handling non-Gaussian data from heterogeneous sensors in these advanced manufacturing processes for successful online quality assurance.
(2) Spatial Dirichlet process (SDP) for modeling complex multimodal wafer thickness profiles and exploring their clustering effects. The SDP-based statistical control scheme can effectively detect out-of-control wafers and achieve wafer thickness quality assurance for the slicing process with high accuracy.
(3) Augmented spatiotemporal log-Gaussian Cox process (AST-LGCP) quantifying the spatiotemporal evolution of porosity in binder jetting parts, capable of predicting high-risk areas on consecutive layers. This work addresses the long-standing lack of rigorous layer-wise porosity quantification for parts made by additive manufacturing (AM), and provides a basis for prognostic corrective actions that improve product quality.
These methodologies overcome some common challenges of advanced manufacturing that defeat traditional methods of online quality assurance, and they embody key components for implementing effective online quality assurance with various sensor data. They have promising potential to be extended to other manufacturing processes in the future. / Ph. D. / This dissertation develops novel online quality assurance methodologies for advanced manufacturing using various sensor data. Four advanced manufacturing processes, namely fused filament fabrication, binder jetting, chemical mechanical planarization, and wafer slicing, are investigated in this research. The developed methodologies address some common challenges in these processes, such as nonlinear process dynamics and the wide variety of sensor data dimensions, which have severely hindered the effectiveness of traditional online quality assurance methods. Consequently, the proposed research achieves satisfactory performance in defect detection and quality prediction for the advanced manufacturing processes.
In this dissertation, the research methodologies are constructed in both the space and time domains based on different types of sensor data. Sensor data representation and integration for a variety of data formats (e.g., online data stream, profile data, image data), with dimensionality covering a wide range (from ~10^0 to ~10^5), are researched to extract effective features that are sensitive to manufacturing process defects; the devised methods, based on the extracted features, use spatiotemporal analysis to achieve timely detection and accurate prediction of process defects. These integrated methodologies have promising potential to be extended to other advanced manufacturing processes for efficacious process monitoring and quality assurance.
The accomplished work in this dissertation is an effective effort towards sustainable operations of advanced manufacturing. The achieved performance not only enables improvement in defect detection and quality prediction, but also lays the foundation for future implementation of corrective actions that can automatically mitigate the process defects.
|