241 |
What Men Want, What They Get and How to Find Out. Wolf, Alexander, 12 July 2017 (has links) (PDF)
This thesis is concerned with a fundamental unit of the economy: households. Even in advanced economies, upwards of 70% of the population live in households composed of multiple people. A large number of decisions are taken at the level of the household, that is to say, they are taken jointly by household members: how to raise children, how much and when to work, how many cartons of milk to purchase. How these decisions are made is therefore of great importance for the people who live in these households and for their well-being.

But precisely because household members make decisions jointly, it is hard to know how those decisions come about and to what extent they benefit individual members. This is why households are often viewed as single decision makers in economics: even if they contain multiple people, they are treated as though they were a single person with a single set of preferences. This unitary approach is often sufficient and can be a helpful simplification, but in many situations it does not deliver an adequate description of household behavior. For instance, the unitary model does not permit the study of individual well-being and inequality inside the household. In addition, implications of the unitary model have been rejected repeatedly in the demand literature.

Bargaining models offer an alternative in which household members have individual preferences and come to joint decisions in various ways. There are by now a great number of such models, all of which allow for the study of bargaining power, a measure of the influence a member has in decision making. This concept is important because it has implications for the welfare of individuals: if one household member's bargaining power increases, the household's choices will be more closely aligned with that member's preferences, ceteris paribus.

The three chapters below can be divided into two parts. The first part consists of Chapter 1, which looks to detect the influence of intra-household bargaining in a specific set of consumption choices: consumption of the arts. The research in this chapter is designed to measure aspects of the effect of bargaining power in this domain, but does not seek to quantify bargaining power itself or to infer the economic well-being of household members.

Precisely this last point, however, is the focus of the second part of the thesis, consisting of Chapters 2 and 3. These focus specifically on the recovery of one measure of bargaining power, the resource share. Resource shares have the advantage of being interpretable in terms of economic well-being, which is not true of all such measures. They are estimated as part of structural models of household demand. These models are versions of the collective model of household decision making.

Pioneered by Chiappori (1988) and Apps and Rees (1988), the collective model has become the go-to alternative to unitary approaches, in which the household is seen as a single decision-making unit with a single well-behaved utility function. Instead, the collective model allows for individual utility functions for each member of the household. The model owes much of its success to the simplicity of its most fundamental assumption: that whatever the structure of the intra-household bargaining process, outcomes are Pareto-efficient. This means that no member can be made better off without making another worse off. Though the model nests unitary models as special cases, it does have testable implications.

The first chapter of the thesis is entitled "Household Decisions on Arts Consumption" and is joint work with Caterina Mauri, who has also collaborated with me on many other projects in her capacity as my girlfriend. In it, we explore the role of intra-household bargaining in arts consumption. We do this by estimating demand for various arts and cultural events, such as the opera or dance performances, using a large number of explanatory variables. One of these variables plays a special role: it is a distribution factor, meaning that it can reasonably be assumed to affect consumption only through the bargaining process, and not by modifying preferences. Such variables play an important role in the household bargaining literature. Here, three such variables are used; among them is the share of household income contributed by the husband, the canonical distribution factor.

The chapter fits into a literature on drivers of arts consumption, which has shown that in addition to such factors as age, income and education, spousal preferences and characteristics are important in determining how much and which cultural goods are consumed. Gender differences in preferences for arts consumption have also been shown to be important and to persist after accounting for class, education and other socio-economic factors (Bihagen and Katz-Gerro, 2000). We explore to what extent this difference in preferences can be used to shed light on the decision process in couples' households. Using three different distribution factors, we infer whether changes in the relative bargaining power of spouses induce changes in arts consumption.

Using a large sample from the US Current Population Survey, which includes data on the frequency of visits to various categories of cultural activities, we regress attendance rates on a range of socio-economic variables using a suitable count data model. We find that attendance by men at events such as the opera, ballet and other dance performances, which are more frequently attended by women than by men, shows a significant influence of the distribution factors. This significant effect persists irrespective of which distribution factor is used. We conclude that more influential men tend to participate in these activities less frequently than less influential men, conditional on a host of controls, notably including hours worked.

The second chapter centers on the recovery of resource shares. This chapter is joint work with Denni Tommasi, a fellow PhD student at ECARES. It relies on the collective model of the household, which assumes simply that household decisions are Pareto-efficient. From this assumption, a relatively simple household problem can be formulated: households can be seen as maximizers of weighted sums of their members' utility functions. Importantly, the weights, known as bargaining weights (or bargaining power), may depend on many factors, including prices. The household problem in turn implies structure for household demand, which is observed in survey data.

Collective demand systems do not necessarily identify measures of bargaining power, however. In fact, the ability to recover such a measure, and especially one that is useful for welfare analysis, was an important milestone in the literature. It was reached by Browning et al. (2013) (henceforth BCL), with a collective model capable of identifying resource shares (also known as a sharing rule). These shares provide a measure of how resources are allocated in the household and so can be used to study intra-household consumption inequality. They also take into account that households generate economies of scale for their members, a phenomenon known as a consumption technology: by sharing goods such as housing, members of households can generate savings that can be used elsewhere.

Estimation of these resource shares involves expressing household budget shares as functions of preferences, a consumption technology and a sharing rule, each of which is a function of observables, and letting the resulting system loose on the data. But obtaining such a demand system is not free. In addition to the usual empirical specifications of the various parts of the system, an identifying assumption has to be made to ensure that resource shares can be recovered in estimation. In BCL, this is the assumption that singles and adult members of households share the same preferences. In Chapter 2, however, an alternative assumption is used.

In a recent paper, Dunbar et al. (2013) (hereafter DLP) develop a collective model based on BCL that makes it possible to identify resource shares using assumptions on the similarity of preferences within and between households. The model uses demand only for assignable goods, a favorite of household economists. These are goods, such as men's clothing and women's clothing, for which it is known who in a household consumes them. In this chapter, we show why, especially when the data exhibit relatively flat Engel curves, the model is weakly identified and induces high variability and an implausible pattern in least squares estimates.

We propose an estimation strategy nested in their framework that greatly reduces this practical impediment to the recovery of individual resource shares. To achieve this, we follow an empirical Bayes method that incorporates additional (out-of-sample) information on singles and relies on mild assumptions on preferences. We show the practical usefulness of this strategy through a series of Monte Carlo simulations and by applying it to Mexican data. The results show that our approach is robust, gives a plausible picture of the household decision process, and is particularly beneficial for the practitioner who wishes to apply the DLP framework. Our welfare analysis of the PROGRESA program in Mexico is the first to include separate poverty rates for men and women in a conditional cash transfer (CCT) program.

The third chapter addresses a problem similar to the one discussed in Chapter 2. The goal, again, is to estimate resource shares and to remedy issues of imprecision and instability in the demand systems that can deliver them. Here, the collective model used is based on Lewbel and Pendakur (2008) and uses data on the entire basket of goods that households consume. The identifying assumption is similar to that used by BCL, although I allow for some differences in preferences between singles and married individuals.

I set out to improve the precision and stability of the resulting estimates, and so to make the model more useful for welfare analysis. In order to do so, this chapter approaches, for the first time, the estimation of a collective household demand system from a Bayesian perspective. Using prior information on equivalence scales, as well as restrictions implied by theory, tight credible intervals are found for resource shares, a measure of the distribution of economic well-being in a household. A modern MCMC sampling method provides a complete picture of the high-dimensional parameter vector's posterior distribution and allows for reliable inference.

The share of household earnings generated by a household member is estimated to have a positive effect on her share of household resources in a sample of couples from the US Consumer Expenditure Survey. An increase in the earnings share of one percentage point is estimated to result in a shift of between 0.05% and 0.14% of household resources in the same direction, meaning that spouses partially insure one another against such shifts. The estimates imply an expected shift of 0.71% of household resources from the average man to the average woman in the same sample between 2008 and 2012, when men lost jobs at a greater rate than women.

Both Chapters 2 and 3 explore unconventional ways to achieve gains in estimator precision and reliability at relatively little cost. This represents a valuable contribution to a literature that, for all its merits in complexity and ingenious modeling, has not yet seriously endeavored to make itself empirically useful. / Doctorate in Economics and Management Sciences / info:eu-repo/semantics/nonPublished
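Chapter 3's estimates come from MCMC sampling of a posterior over demand-system parameters. The sketch below is a deliberately tiny stand-in for that idea: a random-walk Metropolis sampler for a toy linear relation between a spouse's earnings share and her resource share, fitted to simulated data. The model, priors and data are assumptions of this illustration, not the collective demand system estimated in the thesis.

```python
# Illustrative only: random-walk Metropolis for a toy resource-share regression.
import numpy as np

rng = np.random.default_rng(0)

# Simulated "data": earnings share s in [0, 1], resource share eta.
n = 200
s = rng.uniform(0.0, 1.0, n)
eta = 0.45 + 0.10 * s + rng.normal(0.0, 0.05, n)   # true slope 0.10

def log_posterior(theta):
    a, b, log_sigma = theta
    sigma = np.exp(log_sigma)
    resid = eta - (a + b * s)
    log_lik = -0.5 * np.sum(resid**2) / sigma**2 - n * np.log(sigma)
    log_prior = -0.5 * (a**2 + b**2) / 10.0**2      # weak Gaussian priors on a, b
    return log_lik + log_prior

theta = np.array([0.5, 0.0, np.log(0.1)])
draws = []
for it in range(20000):
    prop = theta + rng.normal(0.0, 0.01, 3)          # random-walk proposal
    if np.log(rng.uniform()) < log_posterior(prop) - log_posterior(theta):
        theta = prop
    if it >= 5000:                                   # discard burn-in
        draws.append(theta.copy())

b_draws = np.array(draws)[:, 1]
lo, hi = np.percentile(b_draws, [2.5, 97.5])
print(f"95% credible interval for the earnings-share effect: [{lo:.3f}, {hi:.3f}]")
```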
|
242 |
Fault detection in dynamic systems with Bayesian networks learned from state estimation. Jackson Paul Matsuura, 07 March 2006 (has links)
Prompt detection of faults in dynamic systems is essential to prevent dangerous operating conditions and even physical damage to the system, which would put valuable resources, vital equipment and human lives at risk. Conventional fault detection methods, however, run into limitations such as physical space constraints, the need for an accurate mathematical model of the system, and the need for data on the system's behavior while operating under faults, among others. This work proposes and evaluates a new fault detection method for dynamic systems that offers both qualitative and quantitative advantages over the methods already reported in the literature. The proposed method is easy to understand at a high level, closely resembles human supervision, requires no additional equipment, no accurate model of the system and no information whatsoever about previous faults, and can therefore be applied to systems where other methods would hardly produce satisfactory results. In it, a Bayesian network is learned from measurements of the system operating normally, without faults, and this network is then used for fault detection, inferring that deviations from the probabilistic behavior learned as normal are caused by faults in the system. The results obtained with the new method are extremely encouraging and are compared with those of a method based on analytical redundancy, proving clearly superior to it. Additional results on fault isolation and on fault detection in a nonlinear system corroborate these excellent results, pointing to great potential for the proposed method.
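The detection idea described above (learn a model of fault-free behavior, then flag observations the model finds too unlikely) can be sketched in a few lines. In this illustration a multivariate Gaussian stands in for the learned Bayesian network, and the fault threshold is a quantile of the training log-likelihoods; both are assumptions of the sketch, not the thesis's actual method.

```python
# Minimal sketch of likelihood-based fault detection from fault-free data only.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

# Sensor measurements recorded under normal (fault-free) operation.
normal_data = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], 5000)

# "Learn" the normal-behavior model (here simply a mean and a covariance).
model = multivariate_normal(normal_data.mean(axis=0), np.cov(normal_data.T))

# Threshold: the 0.1% quantile of training log-likelihoods,
# i.e. roughly a 0.1% false-alarm rate on fault-free data.
threshold = np.quantile(model.logpdf(normal_data), 0.001)

def is_faulty(x):
    """Flag a measurement whose likelihood under normal behavior is too low."""
    return model.logpdf(x) < threshold

print(is_faulty([0.1, 0.2]))   # typical reading  -> False
print(is_faulty([4.0, -4.0]))  # strong deviation -> True
```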
|
243 |
Classification of images from multiple information sources using influence controllers for the images and their classes. Orlando Alves Máximo, 19 December 2008 (has links)
This work addresses supervised image classification techniques using influence controllers. The performance of influence controllers applied to the images, and also to the classes present in the images, was evaluated. To determine the values of the influence controllers, methods were proposed for estimating the influence controllers of the images and of their classes. Among the proposed methods, the indicators of separability between image classes stand out, as well as those derived from the kappa coefficient and from the overall accuracy of the classification. A new classifier was also proposed that incorporates the concept of influence controllers through the conditional occurrence probabilities of the classes present in the images. For the performance evaluation tests of the influence controllers, six sets of two SAR images were used (the originals and versions filtered with mean filters using 3×3, 5×5, 7×7, 9×9 and 11×11 windows). The proposed classifiers outperformed the Cascade, Euclidean Distance and Mahalanobis Distance classifiers, which do not incorporate the concept of influence controllers in their structure. For the performance tests of the classifier based on the conditional occurrence probabilities of the classes, four sets of simulated SAR images were used. The analysis of the results shows that the proposed classifier outperformed the Cascade Classifier.
|
244 |
A study of techniques for the data analysis of gravitational wave detectors. Helmo Alan Batista de Araújo, 08 July 2008 (has links)
This work first investigates the possibility of using an innovative time-frequency transform, known as the S transform, for analyzing data from the ALLEGRO gravitational wave detector. Its usefulness for this detector turns out to be limited because of the detector's narrow bandwidth; however, it is argued that it may be useful for interferometric detectors. Next, a robust data analysis method is presented, based on a hypothesis test known as the Neyman-Pearson criterion, for determining candidate impulsive (burst) signal events. The method consists of constructing probability distribution functions for the weighted mean energy of the data blocks recorded by the detector, both in the absence of a signal and in the case of a signal mixed with the noise. Based on these distributions it is possible to find the probability that the data block in which a candidate event is located is not merely a noise block. This way of searching for candidate signals buried in noise agrees with another method used for the same purpose. It is concluded that this is a promising method, since no more refined search process for candidate events is needed, thus reducing computational processing time.
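As a rough illustration of the Neyman-Pearson block-energy test described above, the following toy simulation builds an empirical noise-only distribution of block mean energies, fixes a false-alarm probability to obtain a threshold, and then measures how often blocks containing a simulated burst exceed it. The block length, noise level and burst shape are illustrative assumptions, not the ALLEGRO analysis itself.

```python
# Toy Neyman-Pearson style burst search on block energies.
import numpy as np

rng = np.random.default_rng(2)
block_len, n_blocks = 256, 20000

# Mean energy of noise-only blocks (empirical null distribution).
noise_blocks = rng.normal(0.0, 1.0, (n_blocks, block_len))
noise_energy = np.mean(noise_blocks**2, axis=1)

# Neyman-Pearson: fix the false-alarm probability, derive the threshold.
false_alarm = 1e-3
threshold = np.quantile(noise_energy, 1.0 - false_alarm)

# Mean energy of blocks containing a toy damped burst added to the noise.
signal = 3.0 * np.exp(-np.arange(block_len) / 20.0)
signal_energy = np.mean((noise_blocks + signal)**2, axis=1)

detection_prob = np.mean(signal_energy > threshold)
print(f"threshold = {threshold:.3f}, detection probability = {detection_prob:.2f}")
```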
|
245 |
Bayesian Nonparametric Models for Multi-Stage Sample Surveys. Yin, Jiani, 27 April 2016 (has links)
It is a standard practice in small area estimation (SAE) to use a model-based approach to borrow information from neighboring areas or from areas with similar characteristics. However, survey data tend to have gaps, ties and outliers, and parametric models may be problematic because statistical inference is sensitive to parametric assumptions. We propose nonparametric hierarchical Bayesian models for multi-stage finite population sampling to robustify the inference and allow for heterogeneity, outliers, skewness, etc. Bayesian predictive inference for SAE is studied by embedding a parametric model in a nonparametric model. The Dirichlet process (DP) has attractive properties such as clustering that permits borrowing information. We exemplify by considering in detail two-stage and three-stage hierarchical Bayesian models with DPs at various stages. The computational difficulties of the predictive inference when the population size is much larger than the sample size can be overcome by the stick-breaking algorithm and approximate methods. Moreover, the model comparison is conducted by computing log pseudo marginal likelihood and Bayes factors. We illustrate the methodology using body mass index (BMI) data from the National Health and Nutrition Examination Survey and simulated data. We conclude that a nonparametric model should be used unless there is a strong belief in the specific parametric form of a model.
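For readers unfamiliar with the stick-breaking algorithm mentioned above, here is a minimal sketch of the truncated stick-breaking construction of a draw from a Dirichlet process. The truncation level, concentration parameter and base measure are illustrative choices only, not the models fitted in the thesis.

```python
# Truncated stick-breaking construction of a Dirichlet process draw.
import numpy as np

rng = np.random.default_rng(3)

def stick_breaking(alpha, base_draw, truncation=100):
    """Draw one (truncated) random distribution G ~ DP(alpha, G0)."""
    betas = rng.beta(1.0, alpha, truncation)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    weights = betas * remaining          # stick-breaking weights (sum < 1 due to truncation)
    atoms = base_draw(truncation)        # atoms drawn i.i.d. from the base measure G0
    return weights, atoms

# Example: base measure G0 = Normal(25, 4), loosely evoking BMI-like values.
weights, atoms = stick_breaking(alpha=2.0, base_draw=lambda k: rng.normal(25.0, 4.0, k))
sample = rng.choice(atoms, size=10, p=weights / weights.sum())
print(sample)
```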
|
246 |
Medical data mining using Bayesian network and DNA sequence analysis. January 2004 (has links)
Lee Kit Ying. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 115-117). / Abstracts in English and Chinese.
Contents: Abstract (p.i); Acknowledgement (p.iv)
Chapter 1: Introduction (p.1) - 1.1 Project Background (p.1); 1.2 Problem Specifications (p.3); 1.3 Contributions (p.5); 1.4 Thesis Organization (p.6)
Chapter 2: Background (p.8) - 2.1 Medical Data Mining (p.8): 2.1.1 General Information (p.9), 2.1.2 Related Research (p.10), 2.1.3 Characteristics and Difficulties Encountered (p.11); 2.2 DNA Sequence Analysis (p.13); 2.3 Hepatitis B Virus (p.14): 2.3.1 Virus Characteristics (p.15), 2.3.2 Important Findings on the Virus (p.17); 2.4 Bayesian Network and its Classifiers (p.17): 2.4.1 Formal Definition (p.18), 2.4.2 Existing Learning Algorithms (p.19), 2.4.3 Evolutionary Algorithms and Hybrid EP (HEP) (p.22), 2.4.4 Bayesian Network Classifiers (p.25), 2.4.5 Learning Algorithms for BN Classifiers (p.32)
Chapter 3: Bayesian Network Classifier for Clinical Data (p.35) - 3.1 Related Work (p.36); 3.2 Proposed BN-augmented Naive Bayes Classifier (BAN) (p.38): 3.2.1 Definition (p.38), 3.2.2 Learning Algorithm with HEP (p.39), 3.2.3 Modifications on HEP (p.39); 3.3 Proposed General Bayesian Network with Markov Blanket (GBN) (p.40): 3.3.1 Definition (p.41), 3.3.2 Learning Algorithm with HEP (p.41); 3.4 Findings on Bayesian Network Parameters Calculation (p.43): 3.4.1 Situation and Errors (p.43), 3.4.2 Proposed Solution (p.46); 3.5 Performance Analysis on Proposed BN Classifier Learning Algorithms (p.47): 3.5.1 Experimental Methodology (p.47), 3.5.2 Benchmark Data (p.48), 3.5.3 Clinical Data (p.50), 3.5.4 Discussion (p.55); 3.6 Summary (p.56)
Chapter 4: Classification in DNA Analysis (p.57) - 4.1 Related Work (p.58); 4.2 Problem Definition (p.59); 4.3 Proposed Methodology Architecture (p.60): 4.3.1 Overall Design (p.60), 4.3.2 Important Components (p.62); 4.4 Clustering (p.63); 4.5 Feature Selection Algorithms (p.65): 4.5.1 Information Gain (p.66), 4.5.2 Other Approaches (p.67); 4.6 Classification Algorithms (p.67): 4.6.1 Naive Bayes Classifier (p.68), 4.6.2 Decision Tree (p.68), 4.6.3 Neural Networks (p.68), 4.6.4 Other Approaches (p.69); 4.7 Important Points on Evaluation (p.69): 4.7.1 Errors (p.70), 4.7.2 Independent Test (p.70); 4.8 Performance Analysis on Classification of DNA Data (p.71): 4.8.1 Experimental Methodology (p.71), 4.8.2 Using Naive-Bayes Classifier (p.73), 4.8.3 Using Decision Tree (p.73), 4.8.4 Using Neural Network (p.74), 4.8.5 Discussion (p.76); 4.9 Summary (p.77)
Chapter 5: Adaptive HEP for Learning Bayesian Network Structure (p.78) - 5.1 Background (p.79): 5.1.1 Objective (p.79), 5.1.2 Related Work - AEGA (p.79); 5.2 Feasibility Study (p.80); 5.3 Proposed A-HEP Algorithm (p.82): 5.3.1 Structural Dissimilarity Comparison (p.82), 5.3.2 Dynamic Population Size (p.83); 5.4 Evaluation on Proposed Algorithm (p.88): 5.4.1 Experimental Methodology (p.89), 5.4.2 Comparison on Running Time (p.93), 5.4.3 Comparison on Fitness of Final Network (p.94), 5.4.4 Comparison on Similarity to the Original Network (p.95), 5.4.5 Parameter Study (p.96); 5.5 Applications on Medical Domain (p.100): 5.5.1 Discussion (p.100), 5.5.2 An Example (p.101); 5.6 Summary (p.105)
Chapter 6: Conclusion (p.107) - 6.1 Summary (p.107); 6.2 Future Work (p.109)
Bibliography (p.117)
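As background to the BN-augmented naive Bayes (BAN) classifier listed in the contents above, the sketch below implements the plain naive Bayes baseline that such models extend, for discrete features with Laplace smoothing. The toy data and smoothing constant are assumptions of this illustration, not the thesis's clinical or DNA datasets.

```python
# A small categorical naive Bayes classifier (labels assumed to be 0..K-1).
import numpy as np

class NaiveBayes:
    def fit(self, X, y, alpha=1.0):
        self.classes = np.unique(y)
        n_values = X.max(axis=0) + 1                       # values per feature
        self.log_prior = np.log(np.bincount(y) / len(y))
        # log P(feature j = v | class c) with Laplace smoothing alpha
        self.log_lik = []
        for j, k in enumerate(n_values):
            counts = np.array([np.bincount(X[y == c, j], minlength=k) + alpha
                               for c in self.classes])
            self.log_lik.append(np.log(counts / counts.sum(axis=1, keepdims=True)))
        return self

    def predict(self, X):
        scores = np.tile(self.log_prior, (len(X), 1))
        for j, table in enumerate(self.log_lik):
            scores += table[:, X[:, j]].T                  # add per-feature log-likelihoods
        return self.classes[np.argmax(scores, axis=1)]

# Toy binary features (e.g. symptoms) and a binary outcome driven by two of them.
rng = np.random.default_rng(4)
X = rng.integers(0, 2, (300, 5))
y = (X[:, 0] | X[:, 1]).astype(int)
print(NaiveBayes().fit(X, y).predict(X[:5]), y[:5])
```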
|
247 |
Hierarchical text categorization in a news aggregator portal. Borges, Hugo Lima, January 2009 (has links)
Advisor: Ana Carolina Lorena / Master's dissertation - Universidade Federal do ABC, Graduate Program in Information Engineering, 2009
|
248 |
A linguistic approach for eliciting the parameters of fuzzy Bayesian networks in human error probability estimation. SALES FILHO, Romero Luiz Mendonça, 31 January 2008 (has links)
Previous issue date: 2008 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / A great scarcity of data becomes apparent when working on a probabilistic risk analysis (PRA). Several methods have been proposed in the literature to work around this major problem; they are known as expert opinion elicitation methods. In these methods the analyst turns to experts who have deep knowledge of the problem under analysis; the experts, in turn, provide opinions about the parameter under investigation, from which the analyst obtains an estimate of the unknown value. This work proposes an elicitation method capable of working with linguistic variables, so that at the end of the process a fuzzy estimate of the parameter of interest is obtained. Specifically, the idea is to obtain fuzzy estimates of conditional probabilities, which are then used in a fuzzy Bayesian network to estimate the probability of human error. An application example involving an assistant electrician taking part in the replacement of insulator strings on transmission lines is discussed at the end of the work.
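A minimal sketch of the general idea of linguistic elicitation follows: each expert answers with a linguistic term, the terms are mapped to triangular fuzzy numbers on the probability scale, and the opinions are aggregated into one fuzzy estimate that can also be defuzzified. The term-to-triangle mapping, the aggregation by averaging and the centroid defuzzification are assumptions of this sketch, not the protocol proposed in the dissertation.

```python
# Toy linguistic elicitation of a conditional probability as a fuzzy number.
import numpy as np

# Triangular fuzzy numbers (a, b, c) on the probability scale for each term.
TERMS = {
    "very low":  (0.00, 0.05, 0.10),
    "low":       (0.05, 0.15, 0.25),
    "medium":    (0.20, 0.40, 0.60),
    "high":      (0.55, 0.75, 0.90),
    "very high": (0.85, 0.95, 1.00),
}

def aggregate(opinions):
    """Average the experts' triangles into one triangular fuzzy number."""
    return tuple(np.mean([TERMS[o] for o in opinions], axis=0))

def centroid(tri):
    """Defuzzify a triangular fuzzy number by its centroid."""
    a, b, c = tri
    return (a + b + c) / 3.0

# Three experts judge P(error | high workload, poor lighting) linguistically.
opinions = ["low", "medium", "low"]
fuzzy_p = aggregate(opinions)
print(fuzzy_p, "-> crisp value", round(centroid(fuzzy_p), 3))
```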
|
249 |
A Bayesian approach for mapping QTLs in experimental populations. Andréia da Silva Meyer, 03 April 2009 (has links)
Many traits in plants and animals are quantitative in nature, influenced by multiple genes. With new molecular techniques it has become possible to map the loci that control quantitative traits, called QTLs (Quantitative Trait Loci). Mapping a QTL means identifying its position in the genome as well as estimating its genetic effects. The main difficulty in mapping QTLs is that their number is unknown. Bayesian approaches combined with Markov chain Monte Carlo (MCMC) methods have been applied to infer the number of QTLs, their positions in the genome and their genetic effects. The challenge is to sample from the joint posterior distribution of these parameters, since the number of QTLs is unknown and the dimension of the parameter space therefore changes with the number of QTLs in the model. In this study a Bayesian approach was implemented in the statistical program R to map QTLs, allowing for multiple QTLs and epistatic effects in the model. Models with an increasing number of QTLs were fitted, and the Bayes factor was used to select the most suitable model and, consequently, to estimate the number of QTLs controlling the phenotypes of interest. To evaluate the efficiency of the methodology, a simulation study was carried out for two different experimental populations, backcross and F2, considering models with and without epistasis for each. The approach proved very effective: in every situation considered, the selected model was the one containing the true number of QTLs used in the data simulation. Moreover, QTL mapping was performed for three phenotypes of tropical maize (plant height, ear height and grain yield) using the implemented methodology, and the results were compared with those obtained by the CIM method.
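To make the Bayes-factor step concrete, the toy below compares a "no QTL" model against a "one QTL at a known marker" model by estimating each model's marginal likelihood with brute-force Monte Carlo over the prior. The priors, the simulated backcross-style data and the crude marginal-likelihood estimator are assumptions of this illustration (and only workable for tiny examples); the thesis fits full multiple-QTL models, possibly with epistasis.

```python
# Toy Bayes-factor comparison: M0 (intercept only) vs M1 (intercept + QTL effect).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

# Simulated backcross-like data: marker genotype in {0, 1}, phenotype with a real effect.
n = 100
genotype = rng.integers(0, 2, n)
phenotype = 1.0 + 0.8 * genotype + rng.normal(0.0, 1.0, n)

def log_marginal_likelihood(loglik_fn, prior_draws):
    """Monte Carlo estimate of log p(y | model) = log E_prior[ p(y | theta) ]."""
    logliks = np.array([loglik_fn(theta) for theta in prior_draws])
    m = logliks.max()
    return m + np.log(np.mean(np.exp(logliks - m)))       # log-sum-exp for stability

draws = 20000
# Model 0: phenotype ~ N(mu, 1), prior mu ~ N(0, 5^2).
ml0 = log_marginal_likelihood(
    lambda th: norm.logpdf(phenotype, th[0], 1.0).sum(),
    rng.normal(0.0, 5.0, (draws, 1)))
# Model 1: phenotype ~ N(mu + a*genotype, 1), priors mu, a ~ N(0, 5^2).
ml1 = log_marginal_likelihood(
    lambda th: norm.logpdf(phenotype, th[0] + th[1] * genotype, 1.0).sum(),
    rng.normal(0.0, 5.0, (draws, 2)))

print("log Bayes factor (M1 vs M0):", ml1 - ml0)
```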
|
250 |
Sparse Gaussian process approximations and applications. van der Wilk, Mark, January 2019 (has links)
Many tasks in machine learning require learning some kind of input-output relation (function), for example, recognising handwritten digits (from image to number) or learning the motion behaviour of a dynamical system like a pendulum (from positions and velocities now to future positions and velocities). We consider this problem using the Bayesian framework, where we use probability distributions to represent the state of uncertainty that a learning agent is in. In particular, we will investigate methods which use Gaussian processes to represent distributions over functions. Gaussian process models require approximations in order to be practically useful. This thesis focuses on understanding existing approximations and investigating new ones tailored to specific applications. We advance the understanding of existing techniques first through a thorough review. We propose desiderata for non-parametric basis function model approximations, which we use to assess the existing approximations. Following this, we perform an in-depth empirical investigation of two popular approximations (VFE and FITC). Based on the insights gained, we propose a new inter-domain Gaussian process approximation, which can be used to increase the sparsity of the approximation, in comparison to regular inducing point approximations. This allows GP models to be stored and communicated more compactly. Next, we show that inter-domain approximations can also allow the use of models which would otherwise be impractical, as opposed to improving existing approximations. We introduce an inter-domain approximation for the Convolutional Gaussian process - a model that makes Gaussian processes suitable for image inputs, and which has strong relations to convolutional neural networks. This same technique is valuable for approximating Gaussian processes with more general invariance properties. Finally, we revisit the derivation of the Gaussian process State Space Model, and discuss some subtleties relating to their approximation. We hope that this thesis illustrates some benefits of non-parametric models and their approximation in a non-parametric fashion, and that it provides models and approximations that prove to be useful for the development of more complex and performant models in the future.
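To give a flavour of the inducing-point approximations discussed above, here is a minimal subset-of-regressors style sketch in plain NumPy: m inducing inputs summarize n training points, so the predictive mean costs O(nm^2) rather than O(n^3). The kernel, inducing-point locations and noise level are illustrative assumptions; this is not the VFE/FITC machinery or the inter-domain constructions studied in the thesis.

```python
# Minimal inducing-point (subset-of-regressors style) GP regression sketch.
import numpy as np

rng = np.random.default_rng(6)

def rbf(a, b, lengthscale=0.5, variance=1.0):
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

# Training data: noisy samples of a smooth function.
X = rng.uniform(-3.0, 3.0, 500)
y = np.sin(X) + 0.1 * rng.normal(size=X.shape)
Z = np.linspace(-3.0, 3.0, 15)          # m = 15 inducing inputs, m << n
noise = 0.1 ** 2

Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))  # jitter for numerical stability
Kuf = rbf(Z, X)

# Predictive mean at test inputs:
#   m(x*) = K_{*u} (Kuu + Kuf Kfu / noise)^{-1} Kuf y / noise
Xs = np.linspace(-3.0, 3.0, 7)
A = Kuu + Kuf @ Kuf.T / noise
mean = rbf(Xs, Z) @ np.linalg.solve(A, Kuf @ y) / noise
print(np.round(mean, 2))
print(np.round(np.sin(Xs), 2))          # compare with the true function
```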
|