1 |
Bayesian locally weighted online learning
Edakunni, Narayanan U. January 2010
Locally weighted regression is a non-parametric technique of regression that is capable of coping with non-stationarity of the input distribution. Online algorithms like Receptive Field Weighted Regression and Locally Weighted Projection Regression use a sparse representation of the locally weighted model to approximate a target function, resulting in an efficient learning algorithm. However, these algorithms are fairly sensitive to parameter initializations and have multiple open learning parameters that are usually set using some insight into the problem and local heuristics. In this thesis, we attempt to alleviate these problems by using a probabilistic formulation of locally weighted regression followed by a principled Bayesian inference of the parameters. In the Randomly Varying Coefficient (RVC) model developed in this thesis, locally weighted regression is set up as an ensemble of regression experts that provide a local linear approximation to the target function. We train the individual experts independently and then combine their predictions using a Product of Experts formalism. Independent training of experts allows us to adapt the complexity of the regression model dynamically while learning in an online fashion. The local experts themselves are modeled using a hierarchical Bayesian probability distribution with Variational Bayesian Expectation Maximization steps to learn the posterior distributions over the parameters. The Bayesian modeling of the local experts leads to an inference procedure that is fairly insensitive to parameter initializations and avoids problems like overfitting. We further exploit the Bayesian inference procedure to derive efficient online update rules for the parameters. Learning in the regression setting is also extended to handle a classification task by making use of a logistic regression to model discrete class labels.
The main contribution of the thesis is a spatially localised online learning algorithm, set up in a probabilistic framework with a principled Bayesian inference rule for the model parameters, that learns local models completely independently of each other, uses only local information and adapts the local model complexity in a data-driven fashion. This thesis, for the first time, brings together the computational efficiency and the adaptability of ‘non-competitive’ locally weighted learning schemes and the modelling guarantees of the Bayesian formulation.
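The Product of Experts combination mentioned in this abstract can be illustrated with a minimal sketch, assuming each local expert returns a Gaussian prediction (a mean and a variance) at the query point. Under that assumption the normalized product of the experts' densities is again Gaussian, with a precision equal to the sum of the individual precisions and a precision-weighted mean. The function name and the example numbers are hypothetical and are not taken from the thesis.

```python
import numpy as np


def combine_gaussian_experts(means, variances):
    """Combine Gaussian expert predictions with a product of experts.

    Assumption (not from the thesis): every expert reports a Gaussian
    predictive distribution N(mean_k, variance_k) at the same query point.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Precision (inverse variance) of each expert's prediction.
    precisions = 1.0 / variances

    # A product of Gaussians has summed precision ...
    combined_variance = 1.0 / precisions.sum()
    # ... and a precision-weighted average as its mean.
    combined_mean = combined_variance * (precisions * means).sum()
    return combined_mean, combined_variance


# Hypothetical example: a confident expert (variance 1) and an
# uncertain one (variance 9) disagree about the target value.
mean, variance = combine_gaussian_experts([0.0, 10.0], [1.0, 9.0])
print(mean, variance)  # mean close to 1.0, variance close to 0.9
```

In this sketch an expert with a large predictive variance contributes little to the combined prediction, which is one way to see why experts can be trained independently and still be merged at prediction time.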
|
2 |
Bayesian analysis of some pricing and discounting models
Zantedeschi, Daniel 13 July 2012
The dissertation comprises an introductory Chapter, four papers and a summary Chapter.
First, a new class of Bayesian dynamic partition models for the Nelson-Siegel family of non-linear state-space Bayesian statistical models is developed. This class is applied to studying the term structure of government yields. A sequential time series of Bayes factors, which is developed from this approach, shows that term structure could act as a leading indicator of economic activity.
Second, we develop a class of non-MCMC algorithms called “Direct Sampling”. This Chapter extends the basic algorithm with applications to Generalized Method of Moments and Affine Term Structure Models.
Third, financial economics is characterized by long-standing problems such as the equity premium and risk free rate puzzles. In the chapter titled “Bayesian Learning, Distributional Uncertainty and Asset-Return Puzzles” solutions for equilibrium prices under a set of subjective beliefs generated by Dirichlet Process priors are developed. It is shown that the “puzzles” could disappear if a “tail thickening” effect is induced by the representative agent. A novel Bayesian methodology for retrospective calibration of the model from historical data is developed. This approach shows how predictive functionals have important welfare implications towards long-term growth.
Fourth, in “Social Discounting Using a Bayesian Nonparametric model” the problem of how to better quantify the uncertainty in long-term investments is considered from a Bayesian perspective. By incorporating distribution uncertainty, we are able to provide confidence measures that are less “pessimistic” when compared to previous studies. These measures shed a new and different light on important cost-benefit analyses such as the valuation of environmental policies towards the resolution of global warming.
Finally, the last Chapter discusses directions for future research and concludes the dissertation. / text
|
3 |
Bayesian learning with catastrophe risk : information externalities in a large economy
Zantedeschi, Daniel 30 September 2011
Based on a previous study by Amador and Weill (2009), I study the diffusion of dispersed private information in a large economy subject to a “catastrophe risk” state. I assume that agents learn from the actions of others through two channels: a public channel, that represents learning from prices, and a bi-dimensional private channel that represents learning from local interactions via information concerning the good state and the catastrophe probability. I show an equilibrium solution based on conditional Bayes rule, which weakens the usual condition of “slow learning” as presented in Amador and Weill and first introduced by Vives (1993). I study asymptotic convergence “to the truth” deriving that “catastrophe risk” can lead to “non-linear” adjustments that could in principle explain fluctuations of price aggregates. I finally discuss robustness issues and potential applications of this work to models of “reaching consensus”, “investments under uncertainty”, “market efficiency” and “prediction markets”. / text
|
4 |
Weakly Supervised Learning Algorithms and an Application to Electromyography
Hesham, Tameem January 2014
In the standard machine learning framework, training data is assumed to be fully supervised. However, collecting fully labelled data is not always easy. Due to cost, time, effort or other types of constraints, requiring the whole data to be labelled can be difficult in many applications, whereas collecting unlabelled data can be relatively easy. Therefore, paradigms that enable learning from unlabelled and/or partially labelled data have been growing recently in machine learning. The focus of this thesis is to provide algorithms that enable weakly annotating unlabelled parts of data not provided in the standard supervised setting consisting of an instance-label pair for each sample, then learning from weakly as well as strongly labelled data. More specifically, the bulk of the thesis aims at finding solutions for data that come in the form of bags or groups of instances where available information about the labels is at the bag level only. This is the form of the electromyographic (EMG) data, which represent the main application of the thesis. EMG data can be used to diagnose muscles as either normal or suffering from a neuromuscular disease. Muscles can be classified into one of three labels: normal, myopathic or neurogenic. Each muscle consists of motor units (MUs). Equivalently, an EMG signal detected from a muscle consists of motor unit potential trains (MUPTs). These data are an example of partially labelled data where instances (MUs) are grouped in bags (muscles) and labels are provided for bags but not for instances.
First, we introduce and investigate a weakly supervised learning paradigm that aims at improving classification performance by using a spectral graph-theoretic approach to weakly annotate unlabelled instances before classification. The spectral graph-theoretic phase of this paradigm groups unlabelled data instances using similarity graph models. Two new similarity graph models are introduced as well in this paradigm. This paradigm improves overall bag accuracy for EMG datasets.
Second, generative modelling approaches for multiple-instance learning (MIL) are presented. We introduce and analyse a variety of model structures and components of these generative models and believe they can serve as a methodological guide to other MIL tasks of similar form. This approach improves overall bag accuracy, especially for low-dimensional bags-of-instances datasets like EMG datasets.
MIL generative models provide an example of models where probability distributions need to be represented compactly and efficiently, especially when the number of variables of a certain model is large. Sum-product networks (SPNs) represent a relatively new class of deep probabilistic models that aims at providing a compact and tractable representation of a probability distribution. SPNs are used to model the joint distribution of instance features in the MIL generative models. An SPN whose structure is learnt by a structure learning algorithm introduced in this thesis leads to improved bag accuracy for higher-dimensional datasets.
|
5 |
An Integrated Two-stage Innovation Planning Model with Market Segmented Learning and Network Dynamics
Ferreira, Kevin D. 28 February 2013
Innovation diffusion models have been studied extensively to forecast and explain the adoption process for new products or services. These models are often formulated using one of two approaches. The first, and most common, is a macro-level approach that aggregates much of the market behaviour. An advantage of this method is that forecasts and other analyses may be performed while estimating only a few parameters. The second is a micro-level approach that aims to utilize microeconomic information pertaining to the potential market and the innovation. The advantage of this methodology is that analyses allow for a direct understanding of how potential customers view the innovation. Nevertheless, when individuals are making adoption decisions, the process in reality consists of at least two stages: first, a potential adopter must become aware of the innovation; and second, the aware individual must decide to adopt. Researchers have studied multi-stage diffusion processes in the past; however, a majority of these works employ a macro-level approach to model market flows. As a result, a direct understanding of how individuals value the innovation is lacking, making it impossible to utilize this information to model realistic word-of-mouth behaviour and other network dynamics. Thus, we propose a two-stage integrated model that utilizes the benefits of both the macro- and micro-level approaches. In the first stage, potential customers become aware of the innovation, which requires no decision making by the individual. As a result, we employ a macro-level diffusion process to describe the first stage. However, in the second stage potential customers decide whether to adopt the innovation or not, and we utilize a micro-level methodology to model this.
We further extend the application to include forward-looking behaviour, heterogeneous adopters and segmented Bayesian learning, and utilize the adopter's satisfaction levels to describe biasing and word-of-mouth behaviour. We apply the proposed model to Canadian colour-TV data, and cross-validation results suggest that the new model has excellent predictive capabilities. We also apply the two-stage model to early U.S. hybrid-electric vehicle data, and the results provide insightful managerial observations.
|
7 |
Comparison of Bayesian learning and conjugate gradient descent training of neural networks
Nortje, W D 09 November 2004
Neural networks are used in various fields to make predictions about the future value of a time series, or about the class membership of a given object. For the network to be effective, it needs to be trained on a set of training data combined with the expected results. Two aspects to keep in mind when considering a neural network as a solution are the required training time and the prediction accuracy. This research compares the classification accuracy of conjugate gradient descent neural networks and Bayesian learning neural networks. Conjugate gradient descent networks are known for their short training times, but are not very consistent and results are heavily dependent on initial training conditions. Bayesian networks are slower, but much more consistent. The two types of neural networks are compared, and some attempts are made to combine their strong points in order to achieve shorter training times while maintaining a high classification accuracy. Bayesian learning outperforms the gradient descent methods by almost 1%, while the hybrid method achieves results between those of Bayesian learning and gradient descent. The drawback of the hybrid method is that there is no speed improvement over that of Bayesian learning. / Dissertation (MEng (Electronics))--University of Pretoria, 2005. / Electrical, Electronic and Computer Engineering / unrestricted
|
8 |
Impact of Attention on Perception in Cognitive Dynamic Systems
Amiri, Ashkan 30 September 2014
The proposed aim of this thesis, inspired by the human brain, is to improve on the performance of a perceptual processing algorithm, referred to as a “perceptor”. This is done by trying to bridge the gap between neuroscience and engineering. To this end, we build on the localized perception-action cycle in cognitive neuroscience by categorizing it under the umbrella of perceptual attention, which lends itself to gradually increasing the contrast between relevant information and irrelevant information. Stated in another way, irrelevant information is filtered away while relevant information about the environment is enhanced from one cycle to the next. Accordingly, we propose to improve on the performance of a perceptor by modifying it to operate under the influence of perceptual attention. For this purpose, we first start with a single-layered perceptor and investigate the impact of perceptual attention on its performance through two computer experiments. The first experiment uses simulated (real-valued) data that are generated to purposely make the problem challenging. The second experiment uses real-life radar data that are complex-valued, hence the proposal to introduce Wirtinger calculus into the derivation of our proposed method. We then take one step further and extend our proposed method to the case where a perceptor is hierarchical. In this context, every constitutive component of a hierarchical perceptor is modified to operate under the influence of perceptual attention. Then, another experiment is carried out to demonstrate the positive impact of perceptual attention on the performance of the hierarchical perceptor just described. / Dissertation / Doctor of Philosophy (PhD)
|
9 |
Orbital Level Understanding of Adsorbate-Surface Interactions in Metal Nanocatalysis
Wang, Siwen 15 June 2020
We develop a theoretical framework for a priori estimation of catalytic activity of metal nanoparticles using geometry-based reactivity descriptors of surface atoms and kinetic analysis of reaction pathways at various types of active sites. We show that orbitalwise coordination numbers 𝐶𝑁^α (α = 𝑠 or 𝑑) can be used to predict chemical reactivity of a metal site (e.g., adsorption energies of critical reaction intermediates) by accounting for the neighboring chemical environment, outperforming their regular (𝐶𝑁) and generalized (𝐶̅𝑁̅) counterparts with little added computational cost. Here we include two examples to illustrate this method: CO oxidation on Au (5𝑑¹⁰6𝑠¹) and O₂ reduction on Pt (5𝑑⁹6𝑠¹). We also employ Bayesian learning and the Newns-Anderson model to advance the fundamental understanding of adsorbate-surface interactions on metal nanocatalysts, paving the path toward adsorbate-specific tuning of catalysis. / Doctor of Philosophy / The interactions between reaction intermediates and catalysts should be neither too strong nor too weak for catalytic optimization. This principle, known as Sabatier's principle and arising from the scaling relations among the energetics of reacting species at geometrically similar sites, provides the conceptual basis for designing improved catalysts, but imposes volcano-type limitations on the attainable catalytic activity and selectivity. One of the greatest challenges faced by the catalysis community today is how to develop design strategies and ultimately predictive models of catalytic systems that could circumvent energy scaling relations. This work brings quantum-chemical modeling and machine learning techniques together and develops a novel stochastic modeling approach to rationally design catalysts with desired properties, bridging the knowledge gap between the empirical kinetics and atomistic mechanisms of catalytic reactions.
|
10 |
Essays in Empirical Operations Management: Bayesian Learning of Service Quality and Structural Estimation of Complementary Product Pricing and Inventory Management
Shang, Yan January 2016
This dissertation contributes to the rapidly growing empirical research area in the field of operations management. It contains two essays, tackling two different sets of operations management questions which are motivated by and built on field data sets from two very different industries: air cargo logistics and retailing.
The first essay, based on the data set obtained from a world leading third-party logistics company, develops a novel and general Bayesian hierarchical learning framework for estimating customers' spillover learning, that is, customers' learning about the quality of a service (or product) from their previous experiences with similar yet not identical services. We then apply our model to the data set to study how customers' experiences from shipping on a particular route affect their future decisions about shipping not only on that route, but also on other routes serviced by the same logistics company. We find that customers indeed borrow experiences from similar but different services to update their quality beliefs that determine future purchase decisions. Also, service quality beliefs have a significant impact on their future purchasing decisions. Moreover, customers are risk averse; they are averse to not only experience variability but also belief uncertainty (i.e., customers' uncertainty about their beliefs). Finally, belief uncertainty affects customers' utilities more than experience variability does.
The second essay is based on a data set obtained from a large Chinese supermarket chain, which contains sales as well as both wholesale and retail prices of un-packaged perishable vegetables. Recognizing the special characteristics of this particular product category, we develop a structural estimation model in a discrete-continuous choice model framework. Building on this framework, we then study an optimization model for joint pricing and inventory management strategies of multiple products, which aims at improving the company's profit from direct sales and at the same time reducing food waste and thus improving social welfare.
Collectively, the studies in this dissertation provide useful modeling ideas, decision tools, insights, and guidance for firms to utilize vast sales and operations data to devise more effective business strategies. / Dissertation
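The idea of spillover learning described in the first essay can be illustrated with a deliberately simplified sketch. It is not the dissertation's hierarchical model: it assumes a customer holds a joint Gaussian belief about the quality of two routes, and that one noisy experience on route A is observed. Because the prior beliefs are correlated, the standard Gaussian conditioning formula also shifts the belief about route B. All names and numbers below are hypothetical.

```python
import numpy as np


def update_beliefs(prior_mean, prior_cov, observed_quality, noise_var):
    """Update joint Gaussian quality beliefs after one experience on route A.

    Assumption (illustrative only): index 0 is route A, index 1 is route B,
    and the experience equals route A's true quality plus Gaussian noise.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    prior_cov = np.asarray(prior_cov, dtype=float)

    # Gain: how strongly each belief reacts to a surprise on route A.
    gain = prior_cov[:, 0] / (prior_cov[0, 0] + noise_var)

    # Shift both means in proportion to the surprise on route A.
    surprise = observed_quality - prior_mean[0]
    post_mean = prior_mean + gain * surprise

    # Belief uncertainty shrinks for both routes.
    post_cov = prior_cov - np.outer(gain, prior_cov[0, :])
    return post_mean, post_cov


# Hypothetical prior: neutral beliefs with correlation 0.5 between routes.
mean, cov = update_beliefs([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]],
                           observed_quality=2.0, noise_var=1.0)
print(mean)  # the route B belief moves even though only route A was used
print(cov)
```

With zero prior correlation the gain for route B is zero and no spillover occurs, so in this sketch the covariance term is the only thing that carries experience across routes.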
|