881

Mapping and modeling British Columbia's food self-sufficiency

Morrison, Kathryn 10 June 2011 (has links)
Interest in local food security has increased in the last decade, stemming from concerns surrounding environmental sustainability, agriculture, and community food security. Promotion of locally produced foods has come from activists and non-governmental organizations, as well as from academic and government research and policy. The goal of this thesis is to develop, map, and model an index of food self-sufficiency for the province of British Columbia. To meet this goal, I first develop estimates of food production at the local scale by integrating federally gathered agricultural land use and yield data from the Agricultural Census and various surveys. Second, I construct population-level food consumption estimates based on a provincial nutrition survey and regional demographics. Third, I construct a self-sufficiency index for each Local Health Area in the province and develop a predictive model in a Bayesian autoregressive framework. I find that comparable local-scale estimates of food production and consumption can be constructed through data integration, and that both datasets exhibit considerable spatial variability across the province. The predictive model allows for estimation of regional-scale self-sufficiency without reliance on land use or nutrition data, and stabilizes mapping of the raw index through neighborhood-based spatial smoothing. The methods developed will be a useful tool for researchers and government officials interested in agriculture, nutrition, and food security, as well as a first step towards more advanced modeling of current local food self-sufficiency and future potential. / Graduate
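As a hedged illustration of the kind of index and smoothing described above (the notation, the log scale, and the CAR parameterization below are assumptions, not the thesis's exact specification):

```latex
% Hypothetical sketch: a ratio self-sufficiency index with CAR smoothing.
% P_i, C_i and the CAR parameterization are illustrative assumptions.
\[
  \mathrm{SSI}_i = \frac{P_i}{C_i},
\]
% P_i: estimated food production, C_i: estimated food consumption, both
% for Local Health Area i. Neighborhood-based spatial smoothing can be
% expressed through a conditional autoregressive (CAR) prior:
\[
  \log \mathrm{SSI}_i \mid \{\log \mathrm{SSI}_j : j \ne i\}
  \sim \mathcal{N}\!\left(
    \frac{\rho}{|\partial i|} \sum_{j \in \partial i} \log \mathrm{SSI}_j,\;
    \frac{\sigma^2}{|\partial i|}
  \right),
\]
% where \partial i denotes the set of areas adjacent to area i.
```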
882

A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genomics

Greenlaw, Keelin 26 August 2015 (has links)
Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. In this setting, high-dimensional regression for multi-SNP association analysis is challenging, as the brain imaging phenotypes are multivariate and there is a desire to incorporate a biological group structure among SNPs based on the genes to which they belong. Wang et al. (Bioinformatics, 2012) have recently developed an approach for simultaneous estimation and SNP selection based on penalized regression with a novel group l_{2,1}-norm penalty, which encourages sparsity at the gene level. A limitation of that approach is that it provides only a point estimate. We address this by developing a corresponding Bayesian formulation based on a three-level hierarchical model that allows for full posterior inference using Gibbs sampling. For the selection of tuning parameters, we consider techniques based on: (i) a fully Bayes approach with hyperpriors, (ii) empirical Bayes with implementation based on a Monte Carlo EM algorithm, and (iii) cross-validation (CV). When the number of SNPs is greater than the number of observations, we find that both the fully Bayes and empirical Bayes approaches overestimate the tuning parameters, leading to overshrinkage of the regression coefficients. To understand this problem we derive an approximation to the marginal likelihood and investigate its shape under different settings. Our investigation sheds some light on the problem and suggests the use of cross-validation, or its approximation with WAIC (Watanabe, 2010), when the number of SNPs is relatively large. Properties of our Gibbs-WAIC approach are investigated in a simulation study, and we apply the methodology to a large dataset collected as part of the Alzheimer's Disease Neuroimaging Initiative. / Graduate
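For orientation, the penalty at issue has the schematic form below; the exact weighting in Wang et al. (2012) and the thesis's three-level hierarchy may differ in details, so treat this as a hedged sketch.

```latex
% Schematic group l_{2,1} penalty (gene-level sparsity); the exact
% Bayesian hierarchy here is an assumption for illustration.
\[
  \min_{W}\; \|Y - XW\|_F^2
  + \lambda \sum_{g=1}^{G} \sqrt{\sum_{j \in g} \|w_j\|_2^2},
\]
% where row w_j of W holds the coefficients of SNP j across the imaging
% phenotypes and group g collects the SNPs in one gene, so the penalty
% zeroes out whole genes at once. A Bayesian analogue places a
% scale-mixture-of-normals prior on each group,
\[
  w_j \mid \tau_g^2 \sim \mathcal{N}(0, \tau_g^2 I)
  \quad \text{for } j \in g,
\]
% with a gamma-type hyperprior on \tau_g^2, which yields closed-form
% full conditionals suitable for Gibbs sampling.
```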
883

Acoustic emission-based diagnostics and prognostics of slow rotating bearings using Bayesian techniques

Aye, S.A. (Sylvester Aondolumun) January 2014 (has links)
Diagnostics and prognostics in rotating machinery are the subject of much ongoing research. There are three broad approaches: experience-based, data-driven, and model-based techniques. Bayesian data-driven techniques are gaining widespread application in the diagnostics and prognostics of mechanical and allied systems, including slow rotating bearings, because they handle the stochastic nature of the measured data well. The aim of this study is to detect incipient damage in slow rotating bearings and to develop diagnostics that are robust under changing operating conditions. A further aim is to explore and develop an optimal prognostic model for predicting the remaining useful life (RUL) of slow rotating bearings. This research develops a novel integrated nonlinear method for effective feature extraction from acoustic emission (AE) signals and the construction of a degradation assessment index (DAI), which is subsequently used for fault diagnostics of slow rotating bearings. A slow rotating bearing test rig was developed to measure AE data under variable operating conditions. The proposed DAI, obtained by integrating polynomial kernel principal component analysis (PKPCA), a Gaussian mixture model (GMM), and an exponentially weighted moving average (EWMA), is shown to be effective and suitable for monitoring the degradation of slow rotating bearings and robust under variable operating conditions. Furthermore, this study integrates the DAI into alternative Bayesian methods for RUL prediction. The DAI is used as input to several Bayesian regression models: the multi-layer perceptron (MLP), radial basis function (RBF), Bayesian linear regression (BLR), Gaussian mixture regression (GMR), and Gaussian process regression (GPR). The combination of the DAI with the GPR model, otherwise known as DAI-GPR, gives the best prediction. The findings show that the GPR model is suitable and effective for predicting the RUL of slow rotating bearings and robust to varying operating conditions. The models are also robust when the training and test sets are obtained from dependent or from independent samples. Finally, an optimal GPR for predicting the RUL of slow rotating bearings based on the DAI is developed. Model performance is evaluated both when the training and test samples from the cross-validation approach are dependent and when they are independent. The optimal GPR is obtained by combining existing simple mean and covariance functions in order to capture the observed trend of the bearing degradation as well as the irregularities in the data. The resulting integrated GPR model provides an excellent fit to the data and improves on the simple GPR models based on simple mean and covariance functions. In addition, it achieves near-zero percentage error in predicting the RUL of slow rotating bearings when the training and test sets come from dependent samples, and slightly different values when the estimation is based on independent samples. These findings are robust under varying operating conditions such as loading and speed. The proposed methodology can be applied to nonlinear and non-stationary machine response signals and is useful for preventive machine maintenance purposes.
Keywords: acoustic emission, Bayesian linear regression, Bayesian techniques, covariance function, data-driven, degradation assessment index, diagnostics, experience-based, exponentially weighted moving average, Gaussian mixture model, Gaussian mixture regression, Gaussian process regression, integration, mean function, model-based, multi-layer perceptron, polynomial kernel principal component analysis, prognostics, radial basis function, remaining useful life. / Thesis (PhD)--University of Pretoria, 2014. / lk2014 / Mechanical and Aeronautical Engineering / PhD / unrestricted
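A minimal sketch of how such a DAI pipeline could be assembled from standard tools (scikit-learn, pandas); the synthetic AE features, component counts, kernel degree, and EWMA weight are all assumptions for illustration, not values from the thesis.

```python
# Hypothetical DAI sketch: polynomial-kernel PCA features -> GMM health
# model -> EWMA smoothing. All settings below are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.decomposition import KernelPCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
ae_features = rng.normal(size=(500, 12))   # stand-in for AE signal features
healthy = ae_features[:100]                # assume early life = healthy baseline

# 1. Nonlinear feature extraction with polynomial-kernel PCA (PKPCA).
kpca = KernelPCA(n_components=3, kernel="poly", degree=2).fit(healthy)
z_all = kpca.transform(ae_features)

# 2. GMM fitted to the healthy baseline: the negative log-likelihood of
#    new samples rises as the bearing drifts away from the healthy state.
gmm = GaussianMixture(n_components=2, random_state=0).fit(kpca.transform(healthy))
raw_index = -gmm.score_samples(z_all)

# 3. EWMA smoothing suppresses noise in the raw index.
dai = pd.Series(raw_index).ewm(alpha=0.1).mean()
print(dai.tail())
```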
884

Méthodes de modélisation bayésienne et applications en recherche clinique / Methods of Bayesian modelling and applications in clinical research

Denis, Marie 01 July 2010 (has links)
Depuis quelques années un engouement pour les méthodes de modélisation bayésienne a été observé dans divers domaines comme l'environnement, la médecine. Les études en recherche clinique s'appuient sur la modélisation mathématique et l'inférence statistique. Le but de cette thèse est d'étudier les applications possibles d'une telle modélisation dans le cadre de la recherche clinique. En effet, les jugements, les connaissances des experts (médecins, biologistes..) sont nombreux et importants. Il semble donc naturel de vouloir prendre en compte toutes ces connaissances a priori dans le modèle statistique. Après un rappel sur les fondamentaux des statistiques bayésiennes, des travaux préliminaires dans le cadre de la théorie de la décision sont présentés ainsi qu'un état de l'art des méthodes d'approximation. Une méthode de Monte Carlo par chaînes de Markov à sauts réversibles a été mise en place dans le contexte de modèles bien connus en recherche clinique : le modèle de Cox et le modèle logistique. Une approche de sélection de modèle est proposée comme une alternative aux critères classiques dans le cadre de la régression spline. Enfin différentes applications des méthodes bayésiennes non paramétriques sont développées. Des algorithmes sont adaptés et implémentés afin de pouvoir appliquer de telles méthodes. Cette thèse permet de mettre en avant les méthodes bayésiennes de différentes façons dans le cadre de la recherche clinique au travers de plusieurs jeux de données. / In recent years, Bayesian modelling methods have attracted growing interest in fields as diverse as environmental science and medicine. Studies in clinical research rely on mathematical modelling and statistical inference, and the purpose of this thesis is to study the possible applications of such modelling in clinical research. The judgments and knowledge of experts (physicians, biologists, ...) are abundant and important, so it is natural to take this prior knowledge into account in the statistical model. After a review of the fundamentals of Bayesian statistics, preliminary work in the framework of decision theory is presented, together with a state of the art of approximation methods. A reversible-jump Markov chain Monte Carlo method is developed in the context of two models well known in clinical research: the Cox model and the logistic model. A model selection approach is proposed as an alternative to the classical criteria in the spline regression setting. Finally, various applications of nonparametric Bayesian methods are developed, with algorithms adapted and implemented so that such methods can be applied in practice. This thesis thus showcases Bayesian methods in clinical research in several ways, through a number of real data sets.
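For reference, reversible-jump MCMC (Green, 1995) moves between models of different dimension; in the standard formulation the acceptance probability for a proposed jump is:

```latex
% Standard reversible-jump acceptance probability (Green, 1995).
\[
  \alpha = \min\left\{ 1,\;
    \frac{\pi(m', \theta' \mid y)\, j(m \mid m')\, q_{m' \to m}(u')}
         {\pi(m, \theta \mid y)\, j(m' \mid m)\, q_{m \to m'}(u)}
    \left| \frac{\partial(\theta', u')}{\partial(\theta, u)} \right|
  \right\},
\]
% where j(. | .) selects the move type, u and u' are the auxiliary
% variables that match dimensions across models, and the last factor is
% the Jacobian of the deterministic mapping (theta, u) -> (theta', u').
```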
885

Improving the Computational Efficiency in Bayesian Fitting of Cormack-Jolly-Seber Models with Individual, Continuous, Time-Varying Covariates

Burchett, Woodrow 01 January 2017 (has links)
The extension of the CJS model to include individual, continuous, time-varying covariates relies on the estimation of covariate values on occasions on which individuals were not captured. Fitting this model in a Bayesian framework typically involves the implementation of a Markov chain Monte Carlo (MCMC) algorithm, such as a Gibbs sampler, to sample from the posterior distribution. For large data sets with many missing covariate values that must be estimated, this creates a computational issue, as each iteration of the MCMC algorithm requires sampling from the full conditional distributions of each missing covariate value. This dissertation examines two solutions to address this problem. First, I explore variational Bayesian algorithms, which derive inference from an approximation to the posterior distribution that can be fit quickly in many complex problems. Second, I consider an alternative approximation to the posterior distribution derived by truncating the individual capture histories in order to reduce the number of missing covariates that must be updated during the MCMC sampling algorithm. In both cases, the increased computational efficiency comes at the cost of producing approximate inferences. The variational Bayesian algorithms generally do not estimate the posterior variance very accurately and do not directly address the issues with estimating many missing covariate values. Meanwhile, the truncated CJS model provides a more significant improvement in computational efficiency while inflating the posterior variance as a result of discarding some of the data. Both approaches are evaluated via simulation studies and a large mark-recapture data set consisting of cliff swallow weights and capture histories.
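A common formulation of this model class (for example, a random-walk covariate with logistic survival and capture; the dissertation's exact parameterization may differ) makes the computational bottleneck concrete:

```latex
% Sketch of a CJS model with an individual, continuous, time-varying
% covariate x_{i,t}; the random-walk form is a common choice, assumed here.
\[
  x_{i,t+1} \mid x_{i,t} \sim \mathcal{N}(x_{i,t} + \Delta_t,\, \sigma^2),
  \qquad
  \operatorname{logit}(\phi_{i,t}) = \beta_0 + \beta_1 x_{i,t},
  \qquad
  \operatorname{logit}(p_{i,t}) = \gamma_0 + \gamma_1 x_{i,t},
\]
% phi_{i,t}: apparent survival; p_{i,t}: capture probability. Every
% x_{i,t} on an occasion with no capture is a latent variable with its
% own full conditional in the Gibbs sampler, which is exactly the
% bottleneck that the variational and truncation approximations target.
```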
886

Probabilistic Models of Topics and Social Events

Wei, Wei 01 December 2016 (has links)
Structured probabilistic inference has been shown to be useful in modeling complex latent structures of data. One successful application of this technique is the discovery of latent topical structure in text data, usually referred to as topic modeling. With the recent popularity of mobile devices and social networking, we can now easily acquire text data attached to meta information, such as geo-spatial coordinates and time stamps. This metadata can provide rich and accurate information that is helpful in answering many research questions related to spatial and temporal reasoning. However, such data must be treated differently from text data: spatial data is usually organized in terms of a two-dimensional region, while temporal information can exhibit periodicities. While some work in the topic modeling community utilizes this meta information, those models largely focus on incorporating metadata into text analysis rather than making full use of the joint distribution of meta information and text. In this thesis, I propose the event detection problem, a multidimensional latent clustering problem on spatial, temporal, and topical data. I start with a simple parametric model to discover independent events using geo-tagged Twitter data. The model is then improved in two directions. First, I augment the model using the Recurrent Chinese Restaurant Process (RCRP) to discover events that are dynamic in nature. Second, I study a model that can detect events using data from multiple media sources, examining the characteristics of different media in terms of reported event times and linguistic patterns. The approaches studied in this thesis are largely based on Bayesian nonparametric methods in order to deal with streaming data and an unpredictable number of clusters. The research serves not only the event detection problem itself but also sheds light on the more general structured clustering problem in spatial, temporal, and textual data.
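Schematically, the kind of generative process involved can be sketched as follows; the specific distributional choices here are assumptions for illustration, not the thesis's models.

```latex
% Hypothetical generative sketch for spatio-temporal-topical event
% detection; distributional choices are illustrative assumptions.
\[
  e_d \sim \mathrm{CRP}(\alpha), \qquad
  \ell_d \mid e_d \sim \mathcal{N}(\mu_{e_d}, \Sigma_{e_d}), \qquad
  t_d \mid e_d \sim \mathcal{N}(\tau_{e_d}, \sigma_{e_d}^2), \qquad
  w_{dn} \mid e_d \sim \mathrm{Multinomial}(\theta_{e_d}),
\]
% where document d is assigned to event e_d, with location l_d,
% timestamp t_d, and words w_{dn}. Replacing the CRP with the Recurrent
% Chinese Restaurant Process lets the set of events evolve over time.
```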
887

Interpretable and Scalable Bayesian Models for Advertising and Text

Bischof, Jonathan Michael 04 June 2016 (has links)
In the era of "big data", scalable statistical inference is necessary to learn from new and growing sources of quantitative information. However, many commercial and scientific applications also require models to be interpretable to end users in order to generate actionable insights about quantities of interest. We present three case studies of Bayesian hierarchical models that improve the interpretability of existing models while maintaining or improving the efficiency of inference. The first paper is an application to online advertising that presents an augmented regression model interpretable in terms of the amount of revenue a customer is expected to generate over his or her entire relationship with the company, even if complete histories are never observed. The resulting Poisson Process Regression employs a marginal inference strategy that avoids specifying the customer-level latent variables used in previous work, which complicate inference and interpretability. The second and third papers are applications to the analysis of text data that propose improved summaries of the topic components discovered by these mixture models. While current practice is to summarize topics in terms of their most frequent words, we show significantly greater interpretability in online experiments with human evaluators by using words that are also relatively exclusive to the topic of interest. In the process we develop a new class of topic models that directly regularize the differential usage of words across topics in order to produce stable estimates of the combined frequency-exclusivity metric, and we propose efficient and parallelizable MCMC inference strategies. / Statistics
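As a numerical sketch, one published formulation of a combined frequency-exclusivity score (FREX) is a weighted harmonic mean of within-topic rank scores; the weight and the toy topic-word matrix below are assumptions for illustration.

```python
# Sketch of a FREX-style score: weighted harmonic mean of a word's
# within-topic frequency rank and exclusivity rank. The weight w and
# the toy phi matrix are assumptions for illustration.
import numpy as np
from scipy.stats import rankdata

phi = np.array([[0.50, 0.30, 0.15, 0.05],   # topic-word probabilities (K x V)
                [0.40, 0.10, 0.10, 0.40]])
w = 0.7                                      # weight on exclusivity (assumed)

# Exclusivity: a word's probability in this topic relative to all topics.
exclusivity = phi / phi.sum(axis=0, keepdims=True)

# Within-topic empirical CDFs of frequency and exclusivity.
freq_ecdf = rankdata(phi, axis=1) / phi.shape[1]
excl_ecdf = rankdata(exclusivity, axis=1) / phi.shape[1]

frex = 1.0 / (w / excl_ecdf + (1 - w) / freq_ecdf)   # weighted harmonic mean
print(frex.round(3))
```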
888

Prediction of protein secondary structure using binary classification trees, naive Bayes classifiers and the logistic regression classifier

Eldud Omer, Ahmed Abdelkarim January 2016 (has links)
The secondary structure of proteins is predicted using various binary classifiers. The data are taken from the RS126 database and consist of protein primary and secondary structure sequences originally encoded as alphabetic letters; these are re-encoded as unary (one-hot) vectors comprising ones and zeros only. Different binary classifiers, namely naive Bayes, logistic regression, and classification trees, are trained on the encoded data using hold-out validation and 5-fold cross-validation. For each classifier, three classification tasks are considered: helix against not helix (H/∼H), sheet against not sheet (S/∼S), and coil against not coil (C/∼C). The performance of these binary classifiers is compared using overall accuracy in predicting the protein secondary structure for various window sizes. Our results indicate that hold-out validation achieved higher accuracy than 5-fold cross-validation. The naive Bayes classifier, using 5-fold cross-validation, achieved the lowest accuracy for predicting helix against not helix. The classification tree classifiers, using 5-fold cross-validation, achieved the lowest accuracies for both coil against not coil and sheet against not sheet. The accuracy of the logistic regression classifier depends on the window size, with a positive relationship between accuracy and window size. The logistic regression approach achieved the highest accuracy of the three classifiers on every task: 77.74 percent for helix against not helix, 81.22 percent for sheet against not sheet, and 73.39 percent for coil against not coil. It is noted that comparing classifiers would be easier if the entire classification process could be carried out in R; alternatively, assessing the logistic regression classifiers would be easier if SPSS had a function for computing their accuracy.
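A minimal sketch of the window-based one-hot encoding and one of the binary tasks (helix against not helix) using scikit-learn; the toy sequences and the window size are assumptions, and the RS126 data itself is not used here.

```python
# Sketch: slide a window over a protein sequence, one-hot encode it, and
# fit a logistic regression for helix vs. not helix. Toy data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

AMINO = "ACDEFGHIKLMNPQRSTVWY"
IDX = {a: i for i, a in enumerate(AMINO)}

def encode_windows(seq, labels, win=5):
    """One-hot encode each length-`win` window centered on a residue."""
    half = win // 2
    X, y = [], []
    for i in range(half, len(seq) - half):
        vec = np.zeros(win * len(AMINO))
        for j, a in enumerate(seq[i - half:i + half + 1]):
            vec[j * len(AMINO) + IDX[a]] = 1.0
        X.append(vec)
        y.append(1 if labels[i] == "H" else 0)   # helix vs. not helix
    return np.array(X), np.array(y)

seq    = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQRSTVWY"   # toy primary sequence
labels = "HHHHHCCCCCEEEEECCCCCHHHHHCCCCCEEEEECCCCC"   # toy secondary structure

X, y = encode_windows(seq, labels)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"hold-out accuracy: {clf.score(X_te, y_te):.3f}")
```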
889

Discriminative and Bayesian techniques for hidden Markov model speech recognition systems

Purnell, Darryl William 31 October 2005 (has links)
The collection of large speech databases is not a trivial task (if done properly). It is not always possible to collect, segment, and annotate large databases for every task or language. It is also often the case that there are imbalances in the databases, as a result of little data being available for a specific subset of individuals. One such imbalance is that there are often more male speakers than female speakers (or vice versa); if there are far fewer female speakers, recognizers will tend to work poorly for female speakers compared with their performance for male speakers. This thesis focuses on using Bayesian and discriminative training algorithms to improve continuous speech recognition systems in scenarios where a limited amount of training data is available. The research reported in this thesis can be divided into three categories:
• Overspecialization, which is characterized by good recognition performance on the data used during training but poor performance on independent test data. This is a problem when too little data is available for training, so methods of reducing overspecialization in the minimum classification error (MCE) algorithm are investigated (a schematic of the MCE criterion is sketched below).
• Development of new Bayesian and discriminative adaptation/training techniques that can be used when only a small amount of data is available. One example is the gender-imbalance situation above, where these techniques can improve recognition performance for female speakers without degrading performance for male speakers.
• Bayesian learning, where Bayesian training is used to improve recognition performance using only the limited training data available. These methods are extremely computationally expensive, but are justified by the improved recognition rates for certain tasks. This is, to the author's knowledge, the first time that Bayesian learning using Markov chain Monte Carlo methods has been used in hidden Markov model speech recognition.
The proposed and reviewed algorithms are tested on three datasets (TIMIT, TIDIGITS and SUNSpeech), with the tasks being connected digit recognition and continuous speech recognition. Results indicate that the proposed algorithms improve recognition performance significantly in situations where little training data is available. / Thesis (PhD (Electronic Engineering))--University of Pretoria, 2006. / Electrical, Electronic and Computer Engineering / unrestricted
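For context, the minimum classification error criterion is usually built from a smoothed misclassification measure of the following standard form (the thesis's specific variants are not reproduced here):

```latex
% Standard MCE misclassification measure and smoothed loss
% (Juang and Katagiri); eta and gamma are smoothing constants.
\[
  d_k(x) = -g_k(x; \Lambda)
  + \log\!\left[ \frac{1}{M-1} \sum_{j \ne k} e^{\eta\, g_j(x; \Lambda)}
    \right]^{1/\eta},
  \qquad
  \ell(d_k) = \frac{1}{1 + e^{-\gamma d_k}},
\]
% g_j: discriminant (e.g., HMM log-likelihood) of class j among M
% classes; k: the correct class. Training minimizes the average of ell
% by gradient descent; overspecialization shows up as near-zero training
% loss paired with poor held-out recognition performance.
```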
890

Identification de la zone regardée sur un écran d'ordinateur à partir du flou / Identifying the region of a computer screen being looked at from image blur

Néron, Eric January 2017 (has links)
When it comes to understanding a person's behaviour, gaze is an important source of information. Analyzing the behaviour of consumers or criminals, or inferring certain cognitive states, involves interpreting gaze within a scene over time. There is a real need to identify the region of a screen, or of any other medium, that a user is looking at. To do this, human vision composes multiple images in order to understand the three-dimensional relationships between objects in a scene; the 3D perception of a real scene is therefore built up from several images. But what happens when only a single image is available?
