1 |
Nonparametric Belief Propagation and Facial Appearance Estimation
Sudderth, Erik B., Ihler, Alexander T., Freeman, William T., Willsky, Alan S. 01 December 2002
In many applications of graphical models arising in computer vision, the hidden variables of interest are most naturally specified by continuous, non-Gaussian distributions. There exist inference algorithms for discrete approximations to these continuous distributions, but for the high-dimensional variables typically of interest, discrete inference becomes infeasible. Stochastic methods such as particle filters provide an appealing alternative. However, existing techniques fail to exploit the rich structure of the graphical models describing many vision problems. Drawing on ideas from regularized particle filters and belief propagation (BP), this paper develops a nonparametric belief propagation (NBP) algorithm applicable to general graphs. Each NBP iteration uses an efficient sampling procedure to update kernel-based approximations to the true, continuous likelihoods. The algorithm can accommodate an extremely broad class of potential functions, including nonparametric representations. Thus, NBP extends particle filtering methods to the more general vision problems that graphical models can describe. We apply the NBP algorithm to infer component interrelationships in a parts-based face model, allowing location and reconstruction of occluded features.
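To give a feel for why kernel-based message updates call for sampling, the sketch below computes the exact product of two one-dimensional Gaussian mixtures. It is an illustration only, not the NBP algorithm from the paper: the function names and the `(weight, mean, variance)` tuple layout are assumptions made here. The product of an M-component and an N-component mixture has M×N components, so repeated exact products quickly become unmanageable.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, mixture):
    """Density of a mixture given as a list of (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in mixture)

def product_of_mixtures(mix_a, mix_b):
    """Normalized pointwise product of two 1-D Gaussian mixtures.

    Each pair of components contributes one Gaussian whose precision is
    the sum of the two precisions; its weight depends on how well the
    two component means agree.
    """
    components = []
    for wa, ma, va in mix_a:
        for wb, mb, vb in mix_b:
            var = 1.0 / (1.0 / va + 1.0 / vb)
            mean = var * (ma / va + mb / vb)
            weight = wa * wb * gaussian_pdf(ma, mb, va + vb)
            components.append((weight, mean, var))
    total = sum(w for w, _, _ in components)
    return [(w / total, m, v) for w, m, v in components]
```

Multiplying a 3-component mixture by a 4-component one already yields 12 components, which is the growth that a sampling-based update avoids by keeping a fixed number of kernels.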
|
2 |
Estimação de funções do redshift de galáxias com base em dados fotométricos / Galaxies redshift function estimation using photometric data
Ferreira, Gretta Rossi 18 September 2017
In a substantial number of astronomy problems, one wants to estimate, for several functions g, the value of g at some unknown quantity z ∈ ℝ, based on covariates x ∈ ℝ^d. This is done using a sample (X1, Z1), ..., (Xn, Zn). The two approaches usually taken to solve this problem are (1) estimating the regression of Z on x and plugging it into g, or (2) estimating the conditional density f(z | x) and plugging it into ∫ g(z) f(z | x) dz. Unfortunately, few studies present quantitative comparisons of these two approaches. Moreover, few conditional density estimation methods have had their performance compared on these problems. In view of this, the objective of this work is to present several comparisons of techniques for estimating functions of an unknown quantity, with emphasis on nonparametric methods. In addition to estimators (1) and (2), we propose a new approach that consists of directly estimating the regression function of g(Z) on x. These approaches were tested on different functions using the DEEP2 and Sheldon 2012 datasets. For almost all functions tested, estimator (1) gave the worst results, except when random forests were used. In several cases, the proposed new approach gave better results, as did estimator (2). In particular, methods based on random forests generally led to good results.
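The three estimators described in this abstract can be contrasted on simulated data. The sketch below is a toy illustration under assumptions made here, not the thesis's experiments: it uses a one-dimensional covariate, a k-nearest-neighbour smoother in place of the methods actually compared (such as random forests), a Gaussian kernel with hypothetical bandwidth `h` for the conditional density, and g(z) = z². Data are generated as Z = X + noise with standard deviation 0.5, so the true value of E[g(Z) | x] is x² + 0.25.

```python
import math
import random

def knn_indices(xs, x0, k):
    """Indices of the k sample points whose covariate is closest to x0."""
    return sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]

def plug_in_regression(xs, zs, x0, g, k):
    """Approach (1): estimate E[Z | x0], then apply g to that estimate."""
    idx = knn_indices(xs, x0, k)
    return g(sum(zs[i] for i in idx) / k)

def conditional_density_integral(xs, zs, x0, g, k, h=0.1, grid=400):
    """Approach (2): estimate f(z | x0) and integrate g(z) f(z | x0) dz.

    The density is a Gaussian kernel estimate over the neighbours' Z values;
    the integral uses the trapezoid rule on a regular grid.
    """
    idx = knn_indices(xs, x0, k)
    lo = min(zs[i] for i in idx) - 5.0 * h
    hi = max(zs[i] for i in idx) + 5.0 * h
    step = (hi - lo) / grid
    norm = k * h * math.sqrt(2.0 * math.pi)
    total = 0.0
    for j in range(grid + 1):
        z = lo + j * step
        f = sum(math.exp(-0.5 * ((z - zs[i]) / h) ** 2) for i in idx) / norm
        weight = 0.5 if j in (0, grid) else 1.0
        total += weight * g(z) * f * step
    return total

def direct_regression(xs, zs, x0, g, k):
    """Direct approach: regress g(Z) on x by averaging g(Z_i) over neighbours."""
    idx = knn_indices(xs, x0, k)
    return sum(g(zs[i]) for i in idx) / k

def toy_comparison(seed=0, n=4000, k=200, x0=0.5):
    """Simulate Z = X + N(0, 0.5^2) and evaluate all three estimators at x0."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    zs = [x + rng.gauss(0.0, 0.5) for x in xs]

    def g(z):
        return z * z

    return {
        "truth": x0 ** 2 + 0.25,
        "plug_in": plug_in_regression(xs, zs, x0, g, k),
        "density": conditional_density_integral(xs, zs, x0, g, k),
        "direct": direct_regression(xs, zs, x0, g, k),
    }
```

In this toy setting the plug-in estimator targets g(E[Z | x]) rather than E[g(Z) | x], so for a nonlinear g it misses the contribution of the noise variance, while the other two estimators do not.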
|
3 |
Análise do aerossol atmosférico em Acra, capital de Gana / Analysis of atmospheric aerosol in Accra, capital of Ghana
Verissimo, Thiago Gomes 10 June 2016
Cities in Sub-Saharan Africa (SSA) have been undergoing intense urbanization, with growth in economic activity in general and industrial activity in particular, as well as increases in vehicular traffic and waste generation, among other changes that directly affect the environment and the health of the inhabitants. In this setting, identifying air pollution sources is essential for grounding public policies that aim to secure the population's right to a good quality of life. This Master's research was part of an international project called Energy, air pollution, and health in developing countries, coordinated by Dr. Majid Ezzati, then a professor at the Harvard School of Public Health, and also involving researchers from the University of Ghana. The project aimed to assess air pollution levels in cities of developing countries, in this case Accra (the capital of Ghana and the main city of SSA) and two other cities in Gambia, where no substantive studies had previously been carried out, and to relate those levels to the specific socioeconomic conditions of the areas studied. We contributed the X-Ray Fluorescence (XRF) and Black Carbon (BC) analyses for all samples of the main project, the discussion and interpretation of the meteorological data, and the application of receptor models. For the in-depth study of air quality and source impact, however, this work concentrated on Nima, a neighborhood of Accra. Starting from the characterization of the local atmospheric aerosol, receptor models were used to identify the profiles and contributions of the major sources of fine (PM2.5) and coarse (PM2.5-10) atmospheric particulate matter.
Between November 2006 and August 2008, 791 samples (48 hours each) were collected at two sites 250 m apart: on the main avenue of the neighborhood (Nima Road) and in a residential area (Sam Road). The mean annual PM2.5 concentration in 2007 was 61.6 (1.0) µg/m³ on the avenue and 44.9 (1.1) µg/m³ in the residential area, about 6 and 4.5 times the annual mean guideline of 10 µg/m³ recommended by the World Health Organization (WHO). The WHO daily guideline of 25 µg/m³, which should not be exceeded in more than 1% of the samples collected in one year, was exceeded in 66.5% of the samples from the residential area and 92% of those from the avenue over the whole experiment. Elemental concentrations were obtained by XRF, and BC by reflectance intercalibrated with Thermal Optical Transmittance (TOT). We developed a methodology for XRF calibration and for the intercalibration between reflectance and TOT based on matrix least squares, which provides uncertainties for the fitted data and good precision in the absolute concentration values. Factor Analysis (FA) and Positive Matrix Factorization (PMF) were used to associate factors with sources and to estimate the source profiles. Local meteorological parameters, such as wind direction and intensity, together with the location of significant PM emission sources, helped in associating the factors obtained by these models with real sources. During the winter period in Ghana (January to March), the Harmattan, a wind from the Sahara desert to the northeast of the country, passes through Accra and increases the concentration of soil-dust-related pollutants by a factor of 10. Samples from Harmattan days were therefore analyzed separately, since they hindered the identification of other sources by PMF and FA. The major sources indicated by the two methods agreed: sea salt (Na, Cl), soil (Fe, Ti, Mn, Si, Al, Ca, Mg), vehicular emissions (BC, Pb, Zn, K), biomass burning (K, P, S, BC), and open burning of solid waste and other materials (Br, Pb).
Reducing air pollution in SSA cities such as Accra requires public policies on energy use, health, transport, and urban planning, with due attention to their impact on poor communities. Measures such as paving roads, covering bare soil with vegetation, and encouraging the use of cooking gas and public transport would help lower the high levels of ambient air pollution in these cities.
|
4 |
Nonparametric Statistics on Manifolds With Applications to Shape Spaces
Bhattacharya, Abhishek January 2008
This thesis presents certain recent methodologies and some new results for the statistical analysis of probability distributions on non-Euclidean manifolds. The notions of Fréchet mean and variation as measures of center and spread are introduced and their properties are discussed. The sample estimates from a random sample are shown to be consistent under fairly broad conditions. Depending on the choice of distance on the manifold, intrinsic and extrinsic statistical analyses are carried out. In both cases, sufficient conditions are derived for the uniqueness of the population means and for the asymptotic normality of the sample estimates. Analytic expressions for the parameters in the asymptotic distributions are derived. The manifolds of particular interest in this thesis are the shape spaces of k-ads. The statistical analysis tools developed on general manifolds are applied to the spaces of direct similarity shapes, planar shapes, reflection similarity shapes, affine shapes and projective shapes. Two-sample nonparametric tests are constructed to compare the mean shapes and variation in shapes for two random samples. The samples under consideration can be either independent of each other or the outcome of a matched pair experiment. The testing procedures are based on the asymptotic distribution of the test statistics, or on suitably constructed nonparametric bootstrap methods. Real life examples are included to illustrate the theory.
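The distinction between extrinsic and intrinsic analysis can be illustrated on the simplest curved manifold, the unit sphere, rather than on the shape spaces treated in the thesis. The sketch below is an assumption-laden illustration: the function names are invented here, the extrinsic mean is taken as the projection of the Euclidean average back onto the sphere, and the intrinsic (geodesic-distance) Fréchet mean is approximated by a commonly used fixed-point iteration with log and exp maps, which is not claimed to be the procedure used in the thesis.

```python
import math

def normalize(v):
    """Scale a nonzero vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def geodesic_distance(p, q):
    """Great-circle distance between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    return math.acos(dot)

def extrinsic_mean(points):
    """Euclidean average projected onto the sphere (undefined if it is zero)."""
    dim = len(points[0])
    avg = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    return normalize(avg)

def log_map(base, p):
    """Tangent vector at base that points to p, with geodesic length."""
    cos_t = max(-1.0, min(1.0, sum(b * c for b, c in zip(base, p))))
    perp = [c - cos_t * b for b, c in zip(base, p)]
    n = math.sqrt(sum(c * c for c in perp))
    if n < 1e-12:
        return tuple(0.0 for _ in base)
    theta = math.acos(cos_t)
    return tuple(theta * c / n for c in perp)

def exp_map(base, v):
    """Point reached by following the geodesic from base along v."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        return tuple(base)
    return normalize(tuple(math.cos(n) * b + math.sin(n) * c / n
                           for b, c in zip(base, v)))

def intrinsic_mean(points, iters=100, tol=1e-10):
    """Approximate minimizer of the mean squared geodesic distance."""
    mu = extrinsic_mean(points)
    for _ in range(iters):
        logs = [log_map(mu, p) for p in points]
        step = tuple(sum(l[i] for l in logs) / len(points)
                     for i in range(len(mu)))
        if math.sqrt(sum(c * c for c in step)) < tol:
            break
        mu = exp_map(mu, step)
    return mu

def frechet_variation(points, mu):
    """Mean squared geodesic distance from the points to mu."""
    return sum(geodesic_distance(p, mu) ** 2 for p in points) / len(points)
```

For data concentrated in a small region of the sphere the two means are close; they can differ noticeably for widely spread samples, which is one reason the choice of distance matters.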
|
7 |
Partition clustering of High Dimensional Low Sample Size data based on P-Values
Von Borries, George Freitas January 1900
Doctor of Philosophy / Department of Statistics / Haiyan Wang / This thesis introduces a new partitioning algorithm to cluster variables in high dimensional low sample size (HDLSS) data and high dimensional longitudinal low sample size (HDLLSS) data. HDLSS data contain a large number of variables with a small number of replications per variable, and HDLLSS data refer to HDLSS data observed over time.
Clustering techniques play an important role in analyzing high dimensional low sample size data, as commonly seen in microarray experiments, mass spectrometry data, and pattern recognition. Most current clustering algorithms for HDLSS and HDLLSS data are adaptations from traditional multivariate analysis, where the number of variables is not high and sample sizes are relatively large. Current algorithms show poor performance when applied to high dimensional data, especially in small sample size cases. In addition, available algorithms often exhibit poor clustering accuracy and stability for non-normal data. Simulations show that traditional clustering algorithms used in high dimensional data are not robust to monotone transformations.
The proposed clustering algorithm, PPCLUST, is a powerful tool for clustering HDLSS data. It uses p-values from nonparametric rank tests of homogeneous distribution as a measure of similarity between groups of variables. Inheriting the robustness of rank procedures, the new algorithm is robust to outliers and invariant to monotone transformations of the data. PPCLUSTEL is an extension of PPCLUST for clustering HDLLSS data. A nonparametric test of no simple effect of group is developed, and the p-value from the test is used as a measure of similarity between groups of variables.
PPCLUST and PPCLUSTEL are able to cluster a large number of variables in the presence of very few replications and, in the case of PPCLUSTEL, the algorithm requires neither a large number of time points nor equally spaced ones. PPCLUST and PPCLUSTEL do not suffer from loss of power due to distributional assumptions, general multiple comparison problems, or difficulty in controlling heteroscedastic variances. Applications with available data from previous microarray studies show promising results, and simulation studies reveal that the algorithm outperforms a series of benchmark algorithms applied to HDLSS data, exhibiting high clustering accuracy and stability.
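The idea of using a rank-test p-value as a similarity measure, and the invariance to monotone transformations that follows from working with ranks, can be sketched with a much simpler test than the ones developed in the thesis. The code below is a stand-in chosen here for illustration: a two-sample Wilcoxon rank-sum test with the normal approximation and no tie correction, which is rough for very small samples. It is not the PPCLUST procedure, and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def rank_sum_p_value(a, b):
    """Two-sided p-value of the rank-sum test (normal approximation).

    A large p-value means the two samples look alike, so it can serve
    as a similarity score between two variables.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])
    mean = n1 * (n1 + n2 + 1) / 2.0
    var = n1 * n2 * (n1 + n2 + 1) / 12.0
    z = (w - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Because the statistic depends on the data only through ranks, applying a strictly increasing transformation such as `math.exp` to every observation leaves the p-value unchanged, which is the invariance property the abstract refers to.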
|
8 |
Jump estimation for noisy blurred step functions / Sprungschätzung für verrauschte Beobachtungen von verschmierten Treppenfunktionen
Boysen, Leif 09 May 2006
No description available.
|
9 |
Inference of nonparametric hypothesis testing on high dimensional longitudinal data and its application in DNA copy number variation and micro array data analysis
Zhang, Ke January 1900
Doctor of Philosophy / Department of Statistics / Haiyan Wang / High throughput screening technologies have generated a huge amount of biological data in the last ten years. With the easy availability of array technology, researchers started to investigate biological mechanisms using experiments with more sophisticated designs that pose novel challenges to statistical analysis. We provide theory for robust statistical tests in three flexible models. In the first model, we consider the hypothesis testing problems when there are a large number of variables observed repeatedly over time. A potential application is in tumor genomics where an array comparative genome hybridization (aCGH) study will be used to detect progressive DNA copy number changes in tumor development. In the second model, we consider hypothesis testing theory in a longitudinal microarray study when there are multiple treatments or experimental conditions. The tests developed can be used to detect treatment effects for a large group of genes and discover genes that respond to treatment over time. In the third model, we address a hypothesis testing problem that could arise when array data from different sources are to be integrated. We perform statistical tests by assuming a nested design. In all models, robust test statistics were constructed based on moment methods allowing unbalanced design and arbitrary heteroscedasticity. The limiting distributions were derived under the nonclassical setting when the number of probes is large. The test statistics are not targeted at a single probe. Instead, we are interested in testing for a selected set of probes simultaneously. Simulation studies were carried out to compare the proposed methods with some traditional tests using linear mixed-effects models and generalized estimating equations. Interesting results obtained with the proposed theory in two cancer genomic studies suggest that the new methods are promising for a wide range of biological applications with longitudinal arrays.
|
10 |
Modélisation stochastique de processus pharmaco-cinétiques, application à la reconstruction tomographique par émission de positrons (TEP) spatio-temporelle / Stochastic modeling of pharmaco-kinetic processes, applied to PET space-time reconstruction
Fall, Mame Diarra 09 March 2012
The aim of this work is to develop new statistical methods for spatial (3D) and space-time (3D+t) image reconstruction in Positron Emission Tomography (PET). The objective is to propose efficient methods able to reconstruct images at low injected doses while preserving the quality of the interpretation. We treat reconstruction as a spatial or space-time inverse problem with point observations in a Bayesian nonparametric framework. Bayesian modeling regularizes the ill-posed inverse problem through the introduction of prior information. Furthermore, by characterizing the unknowns through their posterior distributions, it gives access to the uncertainty associated with the reconstruction. The nonparametric approach, in turn, makes the modeling robust and flexible. In the proposed methodology, we view the image to be reconstructed as a probability density on ℝ^k (for reconstruction in k dimensions) and seek the solution among all probability densities on ℝ^k. However, the high dimensionality of the data leads to estimators with no explicit form, so approximation techniques are needed for inference. Most of these techniques are based on Markov chain Monte Carlo (MCMC) methods. In the Bayesian nonparametric approach, a major difficulty arises in randomly generating infinite-dimensional objects on a computer. We have therefore developed a new sampling method that combines good mixing properties with the possibility of parallel implementation, in order to handle large data sets.
With this approach, we obtain 3D spatial reconstructions without any voxelization of space and 4D space-time reconstructions without any prior discretization in either space or time. Furthermore, the error associated with the statistical estimation can be quantified using credible intervals.
|