1 |
Padrões de diversidade de aves e rede de interação mutualística ave-planta em mosaico floresta-campo [Bird diversity patterns and bird-plant mutualistic interaction network in a forest-grassland mosaic]
Casas, Grasiela. January 2015
Classic studies on taxonomic diversity, though essential, do not consider the functional differences between species in a community. Studies using functional traits and functional diversity are filling this gap. Understanding the structure and dynamics of mutualistic interactions is also essential for biodiversity studies and allows the investigation of ecological and evolutionary mechanisms. However, most published networks are small in the number of species and interactions, and they are likely to be under-sampled. In addition, studies have demonstrated that many network metrics are sensitive to both sampling effort and network size. The aims of this thesis were: 1) to investigate bird taxonomic diversity (TD), functional diversity (FD), and patterns of trait convergence (TCAP: Trait Convergence Assembly Patterns) across forest-grassland transitions; 2) to analyse the structure of seed-dispersal networks between plants and birds using the metrics of nestedness, modularity, connectance and degree distribution; 3) to develop a statistical framework to assess sampling sufficiency for some of the most widely used metrics in network ecology, based on bootstrap resampling with replacement.
Bird species composition indicated species turnover between forest, forest edge and grassland. Regarding TD, only forest and edges differed. FD was significantly different between grassland and forest, and between grassland and edges. TD and FD responded differently to environmental change from forest to grassland, since they may capture different processes of community assembly along such transitions. Trait-convergence assembly patterns indicated niche mechanisms underlying the assembly of bird communities, linked to changes in habitat structure across forest-edge-grassland transitions acting as ecological filters. Seed-dispersal mutualistic networks apparently show a common assembly process regardless of differences in sampling methodology or the continents where the 19 networks were sampled. Using bootstrap resampling, we found that sampling sufficiency can be reached at different sample sizes (number of interaction events) for the same dataset, depending on the metric of interest.
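The bootstrap idea in the last sentence can be sketched roughly as follows. This is an illustration only, not the thesis's procedure: the function names, the choice of connectance as the metric, and the percentile interval are assumptions made here, and the thesis's actual sufficiency criterion is not given in the abstract.

```python
import random


def connectance(events):
    """Realized links divided by possible links among the species present."""
    birds = {bird for bird, _ in events}
    plants = {plant for _, plant in events}
    links = set(events)  # unique bird-plant pairs
    return len(links) / (len(birds) * len(plants))


def bootstrap_curve(events, sizes, n_boot=200, seed=0):
    """For each sample size, resample interaction events with replacement
    and summarize the metric (mean and a rough 95% percentile interval)."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        values = sorted(
            connectance(rng.choices(events, k=n)) for _ in range(n_boot)
        )
        lo = values[int(0.025 * n_boot)]
        hi = values[int(0.975 * n_boot) - 1]
        curve[n] = (sum(values) / n_boot, lo, hi)
    return curve
```

Plotting the mean and interval against sample size, once per metric, shows where each metric stabilizes; different metrics computed on the same events may level off at different sizes.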
|
2 |
Evaluating Variance of the Model Credibility Index
Xiao, Yan. 30 November 2007
The model credibility index is defined as the sample size at which the power of rejection equals 0.5. It applies goodness-of-fit testing thinking and uses a one-number summary statistic as an assessment tool in a false model world. Estimating the model credibility index involves a bootstrap resampling technique. To assess the consistency of the estimator of the model credibility index, we instead study the variance of the power achieved at a fixed sample size. An improved subsampling method is proposed to obtain an unbiased estimator of the variance of power. We present two examples to illustrate the mechanics of building the model credibility index and to estimate its error in model selection. The first is a two-way independence model tested with the Pearson chi-square test, and the second is a multi-dimensional logistic regression model tested with the likelihood ratio test.
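A minimal sketch of the definition, under assumptions that are not from the thesis: the false model here is a uniform distribution over four categories (rather than the two-way or logistic examples), the test is a Pearson chi-square test at the 5% level, and the index is found by a crude grid search over candidate sample sizes. All function names are hypothetical.

```python
import random

# 0.95 quantile of the chi-square distribution with 3 degrees of freedom.
CHI2_CRIT_DF3 = 7.815


def rejects_uniform(sample, categories):
    """Pearson chi-square test of the (false) uniform model; four categories assumed."""
    expected = len(sample) / len(categories)
    stat = sum((sample.count(c) - expected) ** 2 / expected for c in categories)
    return stat > CHI2_CRIT_DF3


def bootstrap_power(data, n, categories, n_boot, rng):
    """Share of bootstrap samples of size n in which the model is rejected."""
    hits = sum(
        rejects_uniform(rng.choices(data, k=n), categories) for _ in range(n_boot)
    )
    return hits / n_boot


def credibility_index(data, categories, candidate_sizes, n_boot=500, seed=0):
    """Smallest candidate sample size whose estimated power reaches 0.5."""
    rng = random.Random(seed)
    for n in candidate_sizes:
        if bootstrap_power(data, n, categories, n_boot, rng) >= 0.5:
            return n
    return None
```

Because the power at each size is itself a bootstrap estimate, the returned index varies with the seed and with `n_boot`, which is the kind of variability the abstract proposes to quantify.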
|
5 |
The Single Imputation Technique in the Gaussian Mixture Model Framework
Aisyah, Binti M.J. January 2018
Missing data is a common issue in data analysis, and numerous techniques have
been proposed to deal with it. Imputation, the process of replacing missing
values with plausible values, is the most popular strategy. The two imputation
techniques most frequently cited in the literature are single imputation and
multiple imputation.
Multiple imputation, also known as the golden imputation technique, was
proposed by Rubin in 1987 to address missing data. However, inconsistency is
the major problem of the multiple imputation technique. Single imputation is
less popular in missing data research because of bias and reduced variability.
One way to improve single imputation in the basic regression model is to add a
residual term to each predicted value, which reduces the bias and restores
variability. The residual is drawn from a normal distribution with a mean of 0
and a variance equal to the residual variance. Although newer single
imputation methods, such as the stochastic regression model and hot deck
imputation, might improve the variability and bias issues, single imputation
techniques still do not account for imputation uncertainty, which may lead to
an underestimated R-square or standard error in the analysis results.
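The residual-adding step described above can be sketched as follows. This is an illustration under simplifying assumptions that are not from the thesis: a single predictor, an ordinary least-squares line, and `None` marking a missing value. The function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def stochastic_regression_impute(x, y, seed=0):
    """Replace each missing y (None) by its prediction plus a normal residual
    with mean 0 and variance equal to the residual variance."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    xs = [xi for xi, _ in observed]
    ys = [yi for _, yi in observed]
    slope, intercept = fit_line(xs, ys)
    residuals = [yi - (intercept + slope * xi) for xi, yi in observed]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = (sum(r * r for r in residuals) / (len(residuals) - 2)) ** 0.5
    return [
        yi if yi is not None else intercept + slope * xi + rng.gauss(0.0, sigma)
        for xi, yi in zip(x, y)
    ]
```

Without the `rng.gauss` term this reduces to deterministic regression imputation, in which every imputed value lies exactly on the fitted line and the variability of the completed data is understated.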
The research reported in this thesis provides two imputation solutions for the
single imputation technique. In the first, the wild bootstrap is proposed to
better reflect the uncertainty of the residual variance in the regression
model. In the second, predictive mean matching (PMM) is enhanced: the
regression model generates the recipient values, while the donors are taken
from the observed values. The missing values are then imputed by randomly
drawing one of the observations in the donor pool. The size of the donor pool
strongly affects the quality of the imputed values. Many existing works on PMM
imputation use a fixed donor pool size, which might not be appropriate in
certain circumstances, such as when the data distribution has a high-density
region. Instead of a fixed donor pool size, the proposed method applies a
radius-based solution to determine the size of the donor pool. Both proposed
imputation procedures are combined with the Gaussian mixture model framework
to preserve the original data distribution.
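One possible reading of the radius-based donor pool is sketched below. The details are assumptions made for illustration, not the thesis's algorithm: a single predictor, a least-squares line for the predictive means, a user-supplied `radius`, and a fallback to the nearest donor when no donor lies within the radius. The Gaussian mixture model step is omitted.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def pmm_radius_impute(x, y, radius, seed=0):
    """Predictive mean matching with a radius-based donor pool.

    Donors are observed cases whose predicted value lies within `radius`
    of the recipient's predicted value; one donor's observed y is drawn.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    slope, intercept = fit_line([a for a, _ in observed], [b for _, b in observed])

    def predict(v):
        return intercept + slope * v

    donors = [(predict(xi), yi) for xi, yi in observed]
    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = predict(xi)
        pool = [dy for dp, dy in donors if abs(dp - target) <= radius]
        if not pool:
            # Assumed fallback: use the single nearest donor.
            pool = [min(donors, key=lambda d: abs(d[0] - target))[1]]
        completed.append(rng.choice(pool))
    return completed
```

With a radius, the pool grows automatically in dense regions of the predicted values and shrinks in sparse ones, whereas a fixed pool size takes the same number of donors everywhere.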
The results reported in the thesis, from experiments on benchmark and
artificial data sets, confirm an improvement for further data analysis. The
proposed approaches are therefore worth considering for further investigation
and experiments.
|
6 |
Integração de redes neurais artificiais ao nariz eletrônico: avaliação aromática de café solúvel [Integration of artificial neural networks with the electronic nose: aromatic evaluation of soluble coffee]
Bona, Evandro. January 2008
No description available.
|
8 |
Essays on asset allocation strategies for defined contribution plans
Basu, Anup K. January 2008
Asset allocation is the most influential factor driving investment performance. While researchers have made substantial progress in the field of asset allocation since the introduction of the mean-variance framework by Markowitz, there is little agreement about the appropriate portfolio choice for multi-period, long-horizon investors. Nowhere is this more evident than among trustees of retirement plans, who choose different asset allocation strategies as default investment options for their members. This doctoral dissertation consists of four essays, each of which explores either a novel or an unresolved issue in the area of asset allocation for individual retirement plan participants. The goal of the thesis is to provide greater insight into the subject of portfolio choice in retirement plans and to advance scholarship in this field. The first study evaluates different constant mix or fixed weight asset allocation strategies and comments on their relative appeal as default investment options. In contrast to past research, which deals mostly with theoretical or hypothetical models of asset allocation, we investigate asset allocation strategies that are actually used as default investment options by superannuation funds in Australia. We find that strategies with a moderate allocation to stocks are consistently outperformed, in terms of both the upside potential of exceeding the participant’s wealth accumulation target and the downside risk of falling below that target, by very aggressive strategies whose allocation to stocks approaches 100%. The risk of extremely adverse wealth outcomes for plan participants does not appear to be very sensitive to asset allocation. Drawing on the evidence of the previous study, the second essay explores possible solutions to the well-known problem of gender inequality in retirement investment outcomes. 
Using non-parametric stochastic simulation, we simulate and compare the retirement wealth outcomes for a hypothetical female and male worker under different assumptions about breaks in employment, superannuation contribution rates, and asset allocation strategies. We argue that modest changes in contribution and asset allocation strategy for the female plan participant are necessary to ensure an equitable wealth outcome in retirement. The findings provide strong evidence against the gender-neutral default contribution and asset allocation policy currently institutionalized in Australia and other countries. In the third study we examine the efficacy of lifecycle asset allocation models, which allocate aggressively to risky asset classes when the employee participants are young and gradually switch to more conservative asset classes as they approach retirement. We show that conventional lifecycle strategies make a costly mistake by ignoring the change in portfolio size over time as a critical input in the asset allocation decision. Due to this portfolio size effect, which has hitherto remained unexplored in the literature, the terminal value of accumulation in the retirement account is critically dependent on the asset allocation strategy adopted by the participant in later years relative to early years. The final essay extends the findings of the previous chapter by proposing an alternative approach to lifecycle asset allocation which incorporates performance feedback. We demonstrate that strategies that dynamically alter allocation between growth and conservative asset classes at different points on the investment horizon, based on cumulative portfolio performance relative to a set target, generally result in superior wealth outcomes compared to those of conventional lifecycle strategies. 
The dynamic allocation strategy exhibits clear second-degree stochastic dominance over conventional strategies which switch assets in a deterministic manner as well as balanced diversified strategies.
|
9 |
Expeditious Causal Inference for Big Observational Data
Yumin Zhang (13163253). 28 July 2022
<p>This dissertation addresses two significant challenges in the causal inference workflow for Big Observational Data. The first is designing Big Observational Data with high-dimensional and heterogeneous covariates. The second is performing uncertainty quantification for estimates of causal estimands that are obtained from the application of black box machine learning algorithms on the designed Big Observational Data. The methodologies developed by addressing these challenges are applied for the design and analysis of Big Observational Data from a large public university in the United States. </p>
<h4>Distributed Design</h4>
<p>A fundamental issue in causal inference for Big Observational Data is confounding due to covariate imbalances between treatment groups. This can be addressed by designing the study prior to analysis. The design ensures that subjects in the different treatment groups that have comparable covariates are subclassified or matched together. Analyzing such a designed study helps to reduce biases arising from the confounding of covariates with treatment. Existing design methods, developed for traditional observational studies consisting of a single designer, can yield unsatisfactory designs with sub-optimum covariate balance for Big Observational Data due to their inability to accommodate the massive dimensionality, heterogeneity, and volume of the Big Data. We propose a new framework for the distributed design of Big Observational Data amongst collaborative designers. Our framework first assigns subsets of the high-dimensional and heterogeneous covariates to multiple designers. The designers then summarize their covariates into lower-dimensional quantities, share their summaries with the others, and design the study in parallel based on their assigned covariates and the summaries they receive. The final design is selected by comparing balance measures for all covariates across the candidates and identifying the best amongst the candidates. We perform simulation studies and analyze datasets from the 2016 Atlantic Causal Inference Conference Data Challenge to demonstrate the flexibility and power of our framework for constructing designs with good covariate balance from Big Observational Data.</p>
<h4>Designed Bootstrap</h4>
<p>The combination of modern machine learning algorithms with the nonparametric bootstrap can enable effective predictions and inferences on Big Observational Data. An increasingly prominent and critical objective in such analyses is to draw causal inferences from the Big Observational Data. A fundamental step in addressing this objective is to design the observational study prior to the application of machine learning algorithms. However, the application of the traditional nonparametric bootstrap on Big Observational Data requires excessive computational efforts. This is because every bootstrap sample would need to be re-designed under the traditional approach, which can be prohibitive in practice. We propose a design-based bootstrap for deriving causal inferences with reduced bias from the application of machine learning algorithms on Big Observational Data. Our bootstrap procedure operates by resampling from the original designed observational study. It eliminates the need for additional, costly design steps on each bootstrap sample that are performed under the standard nonparametric bootstrap. We demonstrate the computational efficiency of this procedure compared to the traditional nonparametric bootstrap, and its equivalency in terms of confidence interval coverage rates for the average treatment effects, by means of simulation studies and a real-life case study.</p>
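<p>The core idea of resampling from a study that has been designed once can be sketched as follows. This is a simplified illustration, not the dissertation's procedure: it assumes the design step has already produced matched treated/control pairs, it uses a plain mean of pair differences in place of a machine learning estimator, and the function name is hypothetical.</p>

```python
import random


def designed_bootstrap_ci(pairs, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval for the average treatment effect.

    `pairs` holds (treated_outcome, control_outcome) tuples from a study
    that was matched ONCE; bootstrap samples reuse that design instead of
    re-matching every resample.
    """
    rng = random.Random(seed)
    diffs = [treated - control for treated, control in pairs]
    n = len(diffs)
    estimates = sorted(
        sum(rng.choices(diffs, k=n)) / n for _ in range(n_boot)
    )
    lo = estimates[int(alpha / 2 * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return sum(diffs) / n, lo, hi
```

<p>Under the traditional nonparametric bootstrap, the matching step would be repeated inside the loop for every resample of the raw data; here its cost is paid once, before resampling begins.</p>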
<h4>Case Study</h4>
<p>We apply the distributed design and designed bootstrap methodologies in a case study involving institutional data from a large public university in the United States. The institutional data contains comprehensive information about the undergraduate students in the university, ranging from their academic records to on-campus activities. We study the causal effects of undergraduate students’ attempted course load on their academic performance based on a selection of covariates from these data. Ultimately, our real-life case study demonstrates how our methodologies enable researchers to effectively use straightforward design procedures to obtain valid causal inferences with reduced computational efforts from the application of machine learning algorithms on Big Observational Data.</p>
|
10 |
Newsvendor Models With Monte Carlo Sampling
Ekwegh, Ijeoma W. 01 August 2016
The newsvendor model is used in solving inventory problems in which demand is random. In this thesis, we focus on a method that uses Monte Carlo sampling to estimate the order quantity that either maximizes revenue or minimizes cost, given that demand is uncertain. Given data, the Monte Carlo approach is used to sample data over scenarios and to estimate the probability density function. A bootstrapping process yields an empirical distribution for the order quantity that maximizes the expected profit. Finally, the method is applied to a newsvendor example to show that it works in maximizing profit.
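The bootstrapping step can be illustrated with a short sketch. The profit function below is the basic textbook form (unit selling price, unit cost, no salvage value or shortage penalty), which is an assumption made here; the thesis's exact model and the function names are not taken from the source.

```python
import random


def expected_profit(q, demands, price, cost):
    """Average profit of ordering q units across the sampled demands."""
    return sum(price * min(q, d) - cost * q for d in demands) / len(demands)


def best_order(demands, price, cost):
    """Order quantity maximizing average profit. The average profit is
    piecewise linear in q, so searching the observed demands suffices."""
    candidates = sorted(set(demands))
    return max(candidates, key=lambda q: expected_profit(q, demands, price, cost))


def bootstrap_orders(demands, price, cost, n_boot=500, seed=0):
    """Empirical distribution of the best order quantity, obtained by
    resampling the demand data with replacement."""
    rng = random.Random(seed)
    n = len(demands)
    return [
        best_order(rng.choices(demands, k=n), price, cost) for _ in range(n_boot)
    ]
```

The spread of the values returned by `bootstrap_orders` gives a rough sense of how sensitive the recommended order quantity is to the particular demand sample at hand.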
|