411

Bayesian Hierarchical Model for Combining Two-resolution Metrology Data

Xia, Haifeng 14 January 2010 (has links)
This dissertation presents a Bayesian hierarchical model to combine two-resolution metrology data for inspecting the geometric quality of manufactured parts. The high-resolution data points are scarce and thus sparsely scattered over the surface being measured, while the low-resolution data are pervasive but less accurate or less precise. Combining the two datasets should yield a better prediction of the geometric surface of a manufactured part than using either dataset alone. One challenge in combining the metrology datasets is the misalignment between the low- and high-resolution data points. This dissertation provides a Bayesian hierarchical model that can handle such misaligned datasets, with the following components: (a) a Gaussian process for modeling metrology data at the low-resolution level; (b) a heuristic matching and alignment method that produces a pool of candidate matches and transformations between the two datasets; (c) a linkage model, conditioned on a given match and its associated transformation, that connects a high-resolution data point to a set of low-resolution data points in its neighborhood and makes a combined prediction; and finally (d) Bayesian model averaging of the predictive models in (c) over the pool of candidate matches found in (b). This Bayesian model averaging procedure assigns weights to different matches according to how strongly they are supported by the observed data, and then produces the final combined prediction of the surface based on the data of both resolutions. The proposed method improves upon methods that use a single dataset as well as combined predictions that do not address the misalignment problem. The improvements over alternative methods are demonstrated using both simulated data and datasets from a milled sine-wave part measured by two coordinate measuring machines of different resolutions.
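The model-averaging step in (d) can be illustrated with a small sketch: each candidate match and transformation from (b) is scored by how well it lets the low-resolution model explain the high-resolution points, and predictions are then weighted by those scores. The code below is a simplified, hypothetical illustration of that idea (the grids, offsets and noise levels are invented; it is not the dissertation's implementation).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Low-resolution profile y_low on a dense grid x; a few high-resolution points
# whose true offset (misalignment) relative to the low-res grid is unknown.
x = np.linspace(0.0, 1.0, 50)
y_low = np.sin(2 * np.pi * x)
x_high = np.array([0.22, 0.51, 0.78])
y_high = np.sin(2 * np.pi * (x_high + 0.03)) + 0.01 * rng.standard_normal(3)

candidate_offsets = [0.0, 0.01, 0.03, 0.05]   # pool of candidate transformations (step b)

def log_score(offset, noise_sd=0.02):
    """Score a candidate match: likelihood of the high-res points under the
    low-res model shifted by the candidate offset (stand-in for the linkage model, step c)."""
    y_pred = np.interp(x_high + offset, x, y_low)
    return norm.logpdf(y_high, loc=y_pred, scale=noise_sd).sum()

log_scores = np.array([log_score(o) for o in candidate_offsets])
weights = np.exp(log_scores - log_scores.max())
weights /= weights.sum()                      # posterior weights over candidate matches (step d)

# Model-averaged prediction at a new location combines all candidate alignments.
x_new = 0.40
pred = sum(w * np.interp(x_new + o, x, y_low) for w, o in zip(weights, candidate_offsets))
print(dict(zip(candidate_offsets, np.round(weights, 3))), round(float(pred), 3))
```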
412

Bayesian methods in determining health burdens

Metcalfe, Leanne N. 20 August 2008 (has links)
There has been an almost 60 percent increase in health care expenditures in the US over the past seven years. Employer-sponsored health coverage premiums have increased significantly (87 percent) over the same period. Besides the cost of care for chronic conditions such as migraine, arthritis and diabetes, absenteeism linked to these diseases also adds financial strain. Current health financial models focus on past spending instead of modeling based on current health burdens and future trends. This approach leads to suboptimal health maintenance and cost management. Identifying the diseases that affect the most employees and are also the most costly (in terms of productivity, work-loss days, treatment, etc.) is necessary, since this allows the employer to identify which combination of policies may best address the health burdens. Current predictive health models limit the number of diseases they model because they ignore incomplete data sets. This research investigated whether Bayesian methodology makes it possible to create a comprehensive predictive model of the health burdens faced by corporations, giving health decision makers comprehensive information when choosing policies. The first specific aim was to identify which diseases were the most costly to employers, both directly and indirectly, and the pathogenesis of these diseases. Co-morbidity of diseases was also taken into account, as in many cases these diseases are not treated independently. This information was incorporated when designing the models, as the inference was disease-specific. One of the contributions of this thesis is the coherent incorporation of prior information into the proposed expert model. The Bayesian models were able to estimate the predicted disease burdens for corporations, including the percentage of individuals with multiple diseases. The model was also comparable to, or better than, current estimators on the market while requiring limited input. The outputs of the model also gave further insight into disease interactions, which creates an avenue for further research in disease management.
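As a hedged illustration of the kind of prior-plus-data update described above (not the thesis's actual model), the sketch below combines expert prior information about a single disease's prevalence with an incomplete claims sample using a conjugate Beta-Binomial update; the prior counts and claims figures are invented for the example.

```python
from scipy.stats import beta

# Hypothetical expert prior: prevalence of a chronic condition believed to be
# around 8%, expressed as Beta(8, 92) -- i.e. worth about 100 "prior employees".
prior_a, prior_b = 8, 92

# Incomplete claims sample from one employer: 1,200 employees observed, 110 with the condition.
n_observed, n_cases = 1200, 110

posterior = beta(prior_a + n_cases, prior_b + (n_observed - n_cases))

mean_prev = posterior.mean()
lo, hi = posterior.ppf([0.025, 0.975])          # 95% credible interval
print(f"prevalence: {mean_prev:.3f} (95% credible interval {lo:.3f}-{hi:.3f})")
print(f"expected cases in a 10,000-employee workforce: {10000 * mean_prev:.0f}")
```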
413

Bayesian framework for improved R&D decisions

Anand, Farminder Singh 25 March 2010 (has links)
This thesis describes the formulation of a Bayesian approach, along with new tools, to systematically reduce uncertainty in Research & Development (R&D) alternatives. During the initial stages of R&D, many alternatives are considered and high uncertainty exists for all of them. The ideal approach to addressing the many R&D alternatives is to find the one that is stochastically dominant, i.e., the alternative that is better in all possible scenarios of uncertainty. Often a stochastically dominant alternative does not exist. This leaves the R&D manager with two options: either make a selection based on a user-defined utility function, or gather more information in order to reduce the uncertainty in the various alternatives. From the decision maker's perspective the second option has more intrinsic value, since reducing uncertainty improves confidence in the selection and reduces the high downside risk involved in decisions made under high uncertainty. The motivation for this work is derived from our preliminary work on the evaluation of biorefinery alternatives, which brought to light the key challenges and opportunities in the evaluation of R&D alternatives. The primary challenge in the evaluation of many R&D alternatives was the presence of uncertainty in the many unit operations within each alternative. Additionally, limited or non-existent experimental data made it infeasible to quantify the uncertainty and led to an inability to develop even a simple systematic strategy to reduce it. Moreover, even if the uncertainty could be quantified, the traditional approaches (scenario analysis or stochastic analysis) lacked the ability to evaluate the key group of uncertainty contributors. Lastly, traditional design-of-experiment approaches focus on reducing uncertainty in the parameter estimates of the model, whereas what is required is a design-of-experiment approach that focuses on the decision (the selection of the key alternative). To tackle these challenges, a Bayesian framework along with two new tools is proposed. The Bayesian framework consists of three main steps: (a) quantification of uncertainty; (b) evaluation of key uncertainty contributors; and (c) design-of-experiment strategies focused on decision making rather than traditional parameter-uncertainty reduction. To quantify technical uncertainty using expert knowledge, existing elicitation methods from the literature (outside the chemical engineering domain) are used. To illustrate the importance of quantifying technical uncertainty, a biorefinery case study is considered. The case study is an alternative for producing ethanol as a value-added product in a Kraft mill producing pulp from softwood. To produce ethanol, a hot-water pre-extraction of hemicellulose is considered prior to the pulping stage. Using this case study, the methodology for quantifying technical uncertainty using experts' knowledge is demonstrated. To limit the cost of R&D investment for selection or rejection of an R&D alternative, it is essential to evaluate the key uncertainty contributors. Global sensitivity analysis (GSA) is a tool that can be used to evaluate the key uncertainties, but quite often it fails to differentiate between uncertainties and assigns them equal global sensitivity indices. To counter this failing of GSA, a new method, conditional global sensitivity analysis (c-GSA), is presented, which is able to differentiate between uncertainties even when GSA fails to do so. Several small examples demonstrate the value of c-GSA. The third and last key method in the Bayesian framework is decision-oriented design of experiments. Traditional design-of-experiment (DOE) approaches focus on minimizing parameter error variance. In this work, a new "decision-oriented" DOE approach is proposed that takes into account how the generated data, and subsequently the model developed from them, will be used in decision making. By doing so, the parameter variances are distributed such that their adverse impact on the targeted decision making is minimal. Results show that the new decision-oriented DOE approach significantly outperforms the standard D-optimal design approach. The new design method should be a valuable tool when experiments are conducted for the purpose of making R&D decisions. Finally, to demonstrate the importance of the overall Bayesian framework, a biorefinery case study is considered. The case study consists of the alternative of introducing a hemicellulose pre-extraction stage prior to pulping in a thermo-mechanical pulp mill. Applying the Bayesian framework to this alternative results in significant improvement in the prediction of the true potential value of the alternative.
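The uncertainty-contributor step can be pictured with a small Monte Carlo sketch. The code below estimates first-order, variance-based sensitivity indices with a standard pick-freeze estimator for an invented three-input cost model; it illustrates ordinary GSA only, not the conditional GSA (c-GSA) developed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def rd_cost_model(x):
    """Invented R&D cost model with three uncertain inputs on [0, 1]
    (e.g. sugar yield, enzyme price, steam demand) -- purely illustrative."""
    yld, enzyme, steam = x[..., 0], x[..., 1], x[..., 2]
    return 100 * yld - 20 * enzyme - 5 * steam + 10 * yld * enzyme

def first_order_sobol(model, n=100_000, d=3):
    """Monte Carlo 'pick-freeze' estimates of first-order (variance-based) sensitivity indices."""
    a = rng.uniform(size=(n, d))
    b = rng.uniform(size=(n, d))
    ya = model(a)
    indices = []
    for i in range(d):
        ab = b.copy()
        ab[:, i] = a[:, i]                    # keep input i, resample everything else
        indices.append(np.cov(ya, model(ab))[0, 1] / ya.var())
    return np.array(indices)

print(np.round(first_order_sobol(rd_cost_model), 3))   # largest index = key uncertainty contributor
```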
414

A Bayesian approach to fault isolation with application to diesel engine diagnosis

Pernestål, Anna January 2007 (has links)
Users of heavy trucks, as well as legislation, put increasing demands on heavy trucks. The vehicles should be more comfortable, reliable and safe. Furthermore, they should consume less fuel and be more environmentally friendly. For example, this means that faults that cause the emissions to increase must be detected early. To meet these requirements on comfort and performance, advanced sensor-based computer control systems are used. However, the increased complexity makes the vehicles more difficult for the workshop mechanic to maintain and repair. A diagnosis system that detects and localizes faults is thus needed, both as an aid in the repair process and for detecting and isolating (localizing) faults on board, to guarantee that safety and environmental goals are satisfied.

Reliable fault isolation is often a challenging task. Noise, disturbances and model errors can cause problems. Also, two different faults may lead to the same observed behavior of the system under diagnosis. This means that there are several faults which could possibly explain the observed behavior of the vehicle.

In this thesis, a Bayesian approach to fault isolation is proposed. The idea is to compute the probabilities, given "all information at hand", that certain faults are present in the system under diagnosis. By "all information at hand" we mean qualitative and quantitative information about how probable different faults are, and possibly also data collected during test drives with the vehicle when faults are present. The information may also include knowledge about which observed behavior is to be expected when certain faults are present.

The advantage of the Bayesian approach is the possibility of combining information of different characteristics, and also of facilitating isolation of previously unknown faults as well as faults about which only vague information is available. Furthermore, Bayesian probability theory combined with decision theory provides methods for determining the best action to perform to reduce the effects of faults.

Using the Bayesian approach to fault isolation to diagnose large and complex systems may lead to computational and complexity problems. In this thesis, these problems are solved in three different ways. First, equivalence classes are introduced for different faults with equal probability distributions. Second, by using the structure of the computations, efficient storage methods can be used. Finally, if the previous two simplifications are not sufficient, it is shown how the problem can be approximated by partitioning it into a set of subproblems, each of which can be efficiently solved using the presented methods.

The Bayesian approach to fault isolation is applied to the diagnosis of the gas flow of an automotive diesel engine. Data collected from real driving situations with implemented faults is used in the evaluation of the methods. Furthermore, the influences of important design parameters are investigated.

The experiments show that the proposed Bayesian approach has promising potential for vehicle diagnosis, and performs well on this real problem. Compared with more classical methods, e.g. structured residuals, the Bayesian approach used here gives a higher probability of detection and isolation of the true underlying fault.
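The core computation described above, the posterior probability of each fault given the observations, is Bayes' rule applied over the fault hypotheses. The following sketch uses an invented three-fault example with a discretized observation; the prior and likelihood tables are illustrative and not taken from the thesis.

```python
import numpy as np

faults = ["no_fault", "sensor_bias", "intake_leak"]
prior = np.array([0.95, 0.03, 0.02])          # how probable each fault is a priori

# Likelihood of an observed residual pattern (low / medium / high) under each fault,
# e.g. learned from test drives with implemented faults. Rows: faults; columns: patterns.
likelihood = np.array([
    [0.90, 0.08, 0.02],   # no_fault
    [0.20, 0.60, 0.20],   # sensor_bias
    [0.05, 0.30, 0.65],   # intake_leak
])

def fault_posterior(pattern):
    """P(fault | observation) via Bayes' rule."""
    unnorm = prior * likelihood[:, pattern]
    return unnorm / unnorm.sum()

for fault, p in zip(faults, fault_posterior(pattern=2)):   # a 'high residual' observation
    print(f"{fault:12s} {p:.3f}")
```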
415

Function-on-Function Regression with Public Health Applications

Meyer, Mark John 06 June 2014 (has links)
Medical research currently involves the collection of large and complex data. One such type is functional data, where the unit of measurement is a curve measured over a grid. Functional data comes in a variety of forms depending on the nature of the research. Novel methodologies are required to accommodate this growing volume of functional data, alongside new testing procedures that provide valid inferences. In this dissertation, I propose three novel methods to accommodate a variety of questions involving functional data of multiple forms: (1) a function-on-function regression for Gaussian data; (2) a historical functional linear model for repeated measures; and (3) a generalized functional outcome regression for ordinal data. For each method, I discuss the existing shortcomings of the literature and demonstrate how my method fills those gaps. The abilities of each method are demonstrated via simulation and data application.
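The first of these models, function-on-function regression, relates a whole predictor curve X(s) to a whole response curve Y(t) through a bivariate coefficient surface, Y_i(t) = ∫ X_i(s) β(s,t) ds + ε_i(t). The sketch below fits a crude grid-discretized, ridge-penalized version of this model to simulated curves; it illustrates the model form only, not the Bayesian methodology of the dissertation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated functional data: n predictor curves on grid s, response curves on grid t.
n, ns, nt = 200, 40, 30
s = np.linspace(0, 1, ns)
t = np.linspace(0, 1, nt)

basis = np.array([np.sin((k + 1) * np.pi * s) for k in range(6)])    # small Fourier basis, 6 x ns
X = rng.standard_normal((n, 6)) @ basis                              # smooth random predictor curves

true_beta = np.outer(np.cos(np.pi * s), np.sin(np.pi * t))           # coefficient surface beta(s, t)
ds = s[1] - s[0]
Y = (X * ds) @ true_beta + 0.05 * rng.standard_normal((n, nt))       # Y_i(t) = integral of X_i(s) beta(s,t) ds + noise

# Ridge-penalized grid estimate of beta(s, t): a crude frequentist stand-in
# for the basis-expansion machinery developed in the dissertation.
train, test = slice(0, 150), slice(150, None)
Z = X[train] * ds
beta_hat = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(ns), Z.T @ Y[train])

Y_pred = (X[test] * ds) @ beta_hat
print("held-out RMSE:", round(float(np.sqrt(np.mean((Y_pred - Y[test]) ** 2))), 3))
```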
416

Bayesian Hierarchical Models for Model Choice

Li, Yingbo January 2013 (has links)
With the development of modern data collection approaches, researchers may collect hundreds to millions of variables, yet may not need to utilize all explanatory variables available in predictive models. Hence, choosing models that consist of a subset of variables often becomes a crucial step. In linear regression, variable selection not only reduces model complexity, but also prevents over-fitting. From a Bayesian perspective, prior specification of model parameters plays an important role in model selection as well as parameter estimation, and often prevents over-fitting through shrinkage and model averaging.

We develop two novel hierarchical priors for selection and model averaging, for Generalized Linear Models (GLMs) and normal linear regression, respectively. They can be considered "spike-and-slab" prior distributions or, more appropriately, "spike-and-bell" distributions. Under these priors we achieve dimension reduction, since their point masses at zero allow predictors to be excluded with positive posterior probability. In addition, these hierarchical priors have heavy tails to provide robustness when MLEs are far from zero.

Zellner's g-prior is widely used in linear models. It preserves the correlation structure among predictors in its prior covariance, and yields closed-form marginal likelihoods, which lead to huge computational savings by avoiding sampling in the parameter space. Mixtures of g-priors avoid fixing g in advance, and can resolve consistency problems that arise with a fixed g. For GLMs, we show that the mixture of g-priors using a Compound Confluent Hypergeometric distribution unifies existing choices in the literature and maintains their good properties, such as tractable (approximate) marginal likelihoods and asymptotic consistency for model selection and parameter estimation under specific values of the hyperparameters.

While the g-prior is invariant under rotation within a model, a potential problem is that it inherits the instability of ordinary least squares (OLS) estimates when predictors are highly correlated. We build a hierarchical prior based on scale mixtures of independent normals, which incorporates invariance under rotations within models like ridge regression and the g-prior, but has heavy tails like the Zellner-Siow Cauchy prior. We find this method outperforms the gold-standard mixture of g-priors and other methods in the case of highly correlated predictors in Gaussian linear models. We incorporate a nonparametric structure, the Dirichlet process (DP), as a hyperprior, to allow more flexibility and adaptivity to the data.
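A hedged sketch of why closed-form marginal likelihoods matter here: under Zellner's g-prior with fixed g, the Bayes factor of a linear model against the intercept-only model depends only on its R², its size, n and g, so posterior model probabilities over a set of candidate subsets can be computed without any sampling. The data and candidate models below are invented, and the fixed-g formula shown is the standard one, not the new hierarchical priors developed in the thesis.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# Invented data: three candidate predictors, only the first two actually matter.
n = 100
X = rng.standard_normal((n, 3))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.standard_normal(n)

def r_squared(X_sub):
    Z = np.column_stack([np.ones(n), X_sub])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def log_bayes_factor(cols, g):
    """log Bayes factor of the model with predictors `cols` vs. the intercept-only model,
    under Zellner's g-prior with fixed g (closed form -- no sampling needed)."""
    p = len(cols)
    r2 = r_squared(X[:, cols])
    return 0.5 * (n - p - 1) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))

g = float(n)   # unit-information choice; mixtures of g-priors place a prior on g instead
models = [list(m) for k in (1, 2, 3) for m in combinations(range(3), k)]
log_bf = np.array([log_bayes_factor(m, g) for m in models])
probs = np.exp(log_bf - log_bf.max())
probs /= probs.sum()                 # posterior model probabilities (uniform prior over these models)
for m, p in sorted(zip(models, probs), key=lambda pair: -pair[1]):
    print(m, round(float(p), 3))
```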
417

Some Recent Advances in Non- and Semiparametric Bayesian Modeling with Copulas, Mixtures, and Latent Variables

Murray, Jared January 2013 (has links)
This thesis develops flexible non- and semiparametric Bayesian models for mixed continuous, ordered and unordered categorical data. These methods have a range of possible applications; the applications considered in this thesis are drawn primarily from the social sciences, where multivariate, heterogeneous datasets with complex dependence and missing observations are the norm.

The first contribution is an extension of the Gaussian factor model to Gaussian copula factor models, which accommodate continuous and ordinal data with unspecified marginal distributions. I describe how this model is the most natural extension of the Gaussian factor model, preserving its essential dependence structure and the interpretability of factor loadings and the latent variables. I adopt an approximate likelihood for posterior inference and prove that, if the Gaussian copula model is true, the approximate posterior distribution of the copula correlation matrix asymptotically converges to the correct parameter under nearly any marginal distributions. I demonstrate with simulations that this method is both robust and efficient, and illustrate its use in an application from political science.

The second contribution is a novel nonparametric hierarchical mixture model for continuous, ordered and unordered categorical data. The model includes a hierarchical prior used to couple component indices of two separate models, which are also linked by local multivariate regressions. This structure effectively overcomes the limitations of existing mixture models for mixed data, namely the overly strong local independence assumptions. In the proposed model, local independence is replaced by local conditional independence, so that the induced model is able to more readily adapt to structure in the data. I demonstrate the utility of this model as a default engine for multiple imputation of mixed data in a large repeated-sampling study using data from the Survey of Income and Program Participation. I show that it improves substantially on its most popular competitor, multiple imputation by chained equations (MICE), while enjoying certain theoretical properties that MICE lacks.

The third contribution is a latent variable model for density regression. Most existing density regression models are quite flexible but somewhat cumbersome to specify and fit, particularly when the regressors are a combination of continuous and categorical variables. The majority of these methods rely on extensions of infinite discrete mixture models to incorporate covariate dependence in mixture weights, atoms or both. I take a fundamentally different approach, introducing a continuous latent variable which depends on covariates through a parametric regression. In turn, the observed response depends on the latent variable through an unknown function. I demonstrate that a spline prior for the unknown function is quite effective relative to Dirichlet process mixture models in density estimation settings (i.e., without covariates), even though those Dirichlet process mixtures have better asymptotic theoretical properties. The spline formulation enjoys a number of computational advantages over more flexible priors on functions. Finally, I demonstrate the utility of this model in regression applications using a dataset on U.S. wages from the Census Bureau, where I estimate the return to schooling as a smooth function of the quantile index.
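The Gaussian copula idea behind the first contribution can be sketched quickly: map each margin to normal scores through its empirical CDF and estimate the dependence on that latent Gaussian scale, leaving the marginal distributions unspecified. The sketch below does this for a simulated continuous variable and an ordinal one; it is a rank-based illustration of the copula step only, not the factor model or the inference procedure used in the thesis.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(3)

# Simulated mixed data: a skewed continuous variable and a 4-level ordinal variable,
# both driven by correlated latent Gaussians (latent correlation 0.6 is the target).
n = 2000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
x_cont = np.exp(z[:, 0])                          # arbitrary monotone, non-Gaussian margin
x_ord = np.digitize(z[:, 1], [-1.0, 0.0, 1.0])    # ordinal with four categories

def normal_scores(x):
    """Map observations to latent normal scores via the empirical CDF (ties get average ranks)."""
    return norm.ppf(rankdata(x) / (len(x) + 1))

scores = np.column_stack([normal_scores(x_cont), normal_scores(x_ord)])
est = np.corrcoef(scores.T)[0, 1]
print(round(float(est), 2))   # close to the latent 0.6, slightly attenuated by the ordinal coarsening
```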
418

Inner Ensembles: Using Ensemble Methods in Learning Step

Abbasian, Houman 16 May 2014 (has links)
A pivotal moment in machine learning research was the creation of an important new research area known as Ensemble Learning. In this work, we argue that ensembles are a very general concept and, though they have been widely used, can be applied in more situations than they have been to date. Rather than using them only to combine the output of an algorithm, we can apply them to decisions made inside the algorithm itself, during the learning step. We call this approach Inner Ensembles. The motivation for developing Inner Ensembles was the opportunity to produce models with advantages similar to those of regular ensembles, such as accuracy and stability, plus additional advantages such as comprehensibility, simplicity, rapid classification and a small memory footprint. The main contribution of this work is to demonstrate how broadly this idea can be applied, and to highlight its potential impact on all types of algorithms. To support our claim, we first provide a general guideline for applying Inner Ensembles to different algorithms. Then, using this framework, we apply them to two categories of learning methods: supervised and unsupervised. For the former we chose Bayesian networks, and for the latter K-Means clustering. Our results show that (1) the overall performance of Inner Ensembles is significantly better than that of the original methods, and (2) Inner Ensembles provide similar performance improvements as regular ensembles.
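As a hedged illustration of the idea (not the thesis's exact procedure), the sketch below applies an "inner ensemble" to one internal decision of K-Means: at each iteration, each centroid update is computed on several bootstrap resamples of its cluster and the estimates are averaged, rather than ensembling the final clusterings.

```python
import numpy as np

rng = np.random.default_rng(4)

def inner_ensemble_kmeans(X, k=3, n_estimators=10, n_iter=20):
    """K-Means in which the centroid-update step itself is an ensemble: each centroid
    is the average of estimates computed on bootstrap resamples of its cluster."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue
            boots = [members[rng.integers(0, len(members), size=len(members))].mean(axis=0)
                     for _ in range(n_estimators)]       # inner ensemble of centroid estimates
            centroids[j] = np.mean(boots, axis=0)
    return centroids, labels

X = np.vstack([rng.normal(loc, 0.3, size=(100, 2)) for loc in ([0, 0], [3, 0], [0, 3])])
centroids, _ = inner_ensemble_kmeans(X)
print(np.round(centroids, 2))   # roughly the three cluster centres
```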
419

Tumor Gene Expression Purification Using Infinite Mixture Topic Models

Deshwar, Amit Gulab 11 July 2013 (has links)
There is significant interest in using gene expression measurements to aid in the personalization of medical treatment. The presence of significant normal tissue contamination in tumor samples makes it difficult to use tumor expression measurements to predict clinical variables and treatment response. I present a probabilistic method, TMMpure, to infer the expression profile of the cancerous tissue using a modified topic model that contains a hierarchical Dirichlet process prior on the cancer profiles. I demonstrate that TMMpure is able to infer the expression profile of cancerous tissue and improves the power of predictive models for clinical variables using expression profiles.
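The underlying deconvolution problem can be sketched simply: each bulk tumor sample is modeled as a purity-weighted mixture of a known normal-tissue profile and an unknown cancer profile, and the cancer profile is inferred across many samples. The code below is a least-squares stand-in on simulated data with purities assumed known; TMMpure itself is a topic model with a hierarchical Dirichlet process prior, which this sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(5)

n_genes, n_samples = 500, 40
normal_profile = rng.gamma(2.0, 1.0, n_genes)      # known normal-tissue expression profile
cancer_profile = rng.gamma(2.0, 1.0, n_genes)      # unknown -- the quantity to recover
purity = rng.uniform(0.3, 0.9, n_samples)          # tumor-cell fraction of each bulk sample

# Observed bulk expression: purity-weighted mixture of cancer and normal tissue, plus noise.
Y = (np.outer(cancer_profile, purity)
     + np.outer(normal_profile, 1 - purity)
     + 0.1 * rng.standard_normal((n_genes, n_samples)))

# Deconvolution with purities assumed known (e.g. pathologist estimates):
# per-gene least-squares estimate of the cancer component, truncated at zero.
residual = Y - np.outer(normal_profile, 1 - purity)
cancer_hat = np.clip(residual @ purity / (purity @ purity), 0, None)

print(f"correlation with true cancer profile: {np.corrcoef(cancer_hat, cancer_profile)[0, 1]:.3f}")
```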