31 |
Essays on Aggregation and Cointegration of Econometric Models
Silvestrini, Andrea, 02 June 2009
This dissertation can be broadly divided into two independent parts. The first three chapters analyse issues related to temporal and contemporaneous aggregation of econometric models. The fourth chapter contains an application of Bayesian techniques to investigate whether the post-transition fiscal policy of Poland is sustainable in the long run and consistent with an intertemporal budget constraint.
Chapter 1 surveys the econometric methodology of temporal aggregation for a wide range of univariate and multivariate time series models.
A unified overview of temporal aggregation techniques for this broad class of processes is presented in the first part of the chapter and the main results are summarized. In each case, assuming that the underlying process at the disaggregate frequency is known, the aim is to find the appropriate model for the aggregated data. Additional topics concerning temporal aggregation of ARIMA-GARCH models (see Drost and Nijman, 1993) are discussed and several examples are presented. Systematic sampling schemes are also reviewed.
Multivariate models, which show interesting features under temporal aggregation (Breitung and Swanson, 2002; Marcellino, 1999; Hafner, 2008), are examined in the second part of the chapter. In particular, the focus is on temporal aggregation of VARMA models and on the related concept of spurious instantaneous causality, which is not a time series property invariant to temporal aggregation. On the other hand, as pointed out by Marcellino (1999), other important time series features, such as cointegration and the presence of unit roots, are invariant to temporal aggregation and are not induced by it.
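As a minimal worked illustration of the question posed in this chapter (it is not taken from the dissertation), consider an AR(1) process that is observed only every k-th period, i.e. under systematic sampling. The symbols \phi, \varepsilon_t, \sigma^2, k and \tau are assumptions introduced here for the example.

```latex
\begin{align*}
y_t &= \phi\, y_{t-1} + \varepsilon_t,
      \qquad \varepsilon_t \sim \text{i.i.d.}(0, \sigma^2), \quad |\phi| < 1, \\
y_t &= \phi^{k} y_{t-k} + \sum_{j=0}^{k-1} \phi^{j} \varepsilon_{t-j}
      \qquad \text{(substituting $k$ times)}, \\
y_{k\tau} &= \phi^{k} y_{k(\tau-1)} + u_\tau,
      \qquad u_\tau = \sum_{j=0}^{k-1} \phi^{j} \varepsilon_{k\tau-j}, \\
\operatorname{Var}(u_\tau) &= \sigma^2 \sum_{j=0}^{k-1} \phi^{2j}
      = \sigma^2\, \frac{1-\phi^{2k}}{1-\phi^{2}} .
\end{align*}
```

Because successive u_\tau are built from non-overlapping blocks of innovations, they are serially uncorrelated, so the sampled series is again AR(1), with coefficient \phi^k and the innovation variance given in the last line. Aggregating flows by summation generally changes the model class as well, which is where the techniques surveyed in the chapter are needed.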
Some empirical applications based on macroeconomic and financial data illustrate all the techniques surveyed and the main results.
Chapter 2 is an attempt to monitor fiscal variables in the Euro area, building an early warning signal indicator for assessing the development of public finances in the short run and exploiting the existence of monthly budgetary statistics from France, taken as an "example country".
The application is conducted focusing on the cash State deficit, looking at components from the revenue and expenditure sides. For each component, monthly ARIMA models are estimated and then temporally aggregated to the annual frequency, as the policy makers are interested in yearly predictions.
The short-run forecasting exercises carried out for years 2002, 2003 and 2004 highlight the fact that the one-step-ahead predictions based on the temporally aggregated models generally outperform those delivered by standard monthly ARIMA modeling, as well as the official forecasts made available by the French government, for each of the eleven components and thus for the whole State deficit. More importantly, by the middle of the year, very accurate predictions for the current year are made available.
The proposed method could be extremely useful, providing policy makers with a valuable indicator when assessing the development of public finances in the short run (a horizon of one year or less).
Chapter 3 deals with the issue of forecasting contemporaneous time series aggregates. The performance of "aggregate" and "disaggregate" predictors in forecasting contemporaneously aggregated vector ARMA (VARMA) processes is compared. An aggregate predictor is built by directly forecasting the aggregate process, as it results from contemporaneous aggregation of the data generating vector process. A disaggregate predictor is obtained by aggregating univariate forecasts for the individual components of the data generating vector process.
The econometric framework is broadly based on Lütkepohl (1987). The necessary and sufficient condition for the equality of mean squared errors associated with the two competing methods in the bivariate VMA(1) case is provided. It is argued that the condition of equality of predictors as stated in Lütkepohl (1987), although necessary and sufficient for the equality of the predictors, is sufficient (but not necessary) for the equality of mean squared errors.
Furthermore, it is shown that the same forecasting accuracy for the two predictors can be achieved using specific assumptions on the parameters of the VMA(1) structure.
Finally, an empirical application that involves the problem of forecasting the Italian monetary aggregate M1 on the basis of annual time series ranging from 1948 until 1998, prior to the creation of the European Economic and Monetary Union (EMU), is presented to show the relevance of the topic. In the empirical application, the framework is further generalized to deal with heteroskedastic and cross-correlated innovations.
Chapter 4 deals with a cointegration analysis applied to the empirical investigation of fiscal sustainability. The focus is on a particular country: Poland. The choice of Poland is not random. First, the motivation stems from the fact that fiscal sustainability is a central topic for most of the economies of Eastern Europe. Second, Poland was one of the first countries to start the transition process to a market economy (in 1989), providing a relatively favorable institutional setting within which to study fiscal sustainability (see Green, Holmes and Kowalski, 2001). The emphasis is on the feasibility of a permanent deficit in the long run, that is, on whether a government can continue to operate under its current fiscal policy indefinitely.
The empirical analysis of debt stabilization consists of two steps.
First, a Bayesian methodology is applied to conduct inference about the cointegrating relationship between budget revenues and expenditures (inclusive of interest) and to select the cointegrating rank. This task is complicated by the conceptual difficulty linked to the choice of the prior distributions for the parameters relevant to the economic problem under study (Villani, 2005).
Second, Bayesian inference is applied to the estimation of the normalized cointegrating vector between budget revenues and expenditures. With a single cointegrating equation, some known results concerning the posterior density of the cointegrating vector may be used (see Bauwens, Lubrano and Richard, 1999).
The priors used in the paper lead to straightforward posterior calculations which can be easily performed.
Moreover, the posterior analysis leads to a careful assessment of the magnitude of the cointegrating vector. Finally, it is shown to what extent the likelihood of the data is important in revising the available prior information, relying on numerical integration techniques based on deterministic methods.
|
32 |
Bayesian Inference for Stochastic Volatility Models
Men, Zhongxian, January 2012
Stochastic volatility (SV) models provide a natural framework for a
representation of time series for financial asset returns. As a
result, they have become increasingly popular in the finance
literature, although they have also been applied in other fields
such as signal processing, telecommunications, engineering, biology,
and other areas.
In working with SV models, an important issue is how to
estimate their parameters efficiently and to assess how well they
fit real data. In the literature, commonly used estimation methods
for SV models include the generalized method of moments, simulated
maximum likelihood, quasi-maximum likelihood, and
Markov chain Monte Carlo (MCMC) methods. Among these approaches,
MCMC methods are the most flexible in dealing with complicated model
structures. However, due to the difficulty in the selection of
the proposal distribution for Metropolis-Hastings methods, in
general they are not easy to implement and in some cases we may also
encounter convergence problems in the implementation stage. In the
light of these concerns, we propose in this thesis new estimation
methods for univariate and multivariate SV models. In the simulation
of latent states of the heavy-tailed SV models, we recommend the
slice sampler algorithm as the main tool to sample the proposal
distribution when the Metropolis-Hastings method is applied. For the
SV models without heavy tails, a simple Metropolis-Hastings method
is developed for simulating the latent states. Since the slice
sampler can adapt to the analytical structure of the underlying
density, it is more efficient. A sample point can be obtained from
the target distribution with a few iterations of the sampler,
whereas in the original Metropolis-Hastings method many sampled
values often need to be discarded.
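The slice-sampling idea described above can be sketched as follows. This is a generic univariate slice sampler with stepping-out and shrinkage, written for illustration and not the thesis's implementation. The function names, the `width` parameter, and the form of `sv_state_log_density` (a textbook single-state conditional for a basic SV model with a normal prior on the log volatility) are assumptions introduced here.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, rng=None):
    """Univariate slice sampler (stepping-out, then shrinkage)."""
    rng = rng or random.Random()
    x = x0
    samples = []
    for _ in range(n_samples):
        # Auxiliary height under the density, drawn on the log scale.
        log_y = log_density(x) + math.log(1.0 - rng.random())
        # Stepping out: randomly place an interval around x, then
        # widen it until both ends fall outside the slice.
        left = x - width * rng.random()
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Shrinkage: draw uniformly from the interval, shrinking it
        # towards x after every rejected draw.
        while True:
            proposal = left + (right - left) * rng.random()
            if log_density(proposal) > log_y:
                x = proposal
                break
            if proposal < x:
                left = proposal
            else:
                right = proposal
        samples.append(x)
    return samples


def sv_state_log_density(h, ret, prior_mean, prior_sd):
    """Hypothetical target: log p(h | return) up to a constant, with
    return ~ N(0, exp(h)) and h ~ N(prior_mean, prior_sd**2)."""
    return (-0.5 * h
            - 0.5 * ret * ret * math.exp(-h)
            - 0.5 * ((h - prior_mean) / prior_sd) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = slice_sample(
        lambda h: sv_state_log_density(h, ret=0.02, prior_mean=-8.0, prior_sd=1.0),
        x0=-8.0, n_samples=2000, rng=rng)
    print(sum(draws) / len(draws))
```

Every iteration returns a new draw, which is the practical contrast with a Metropolis-Hastings step where a rejected proposal leaves the state unchanged.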
In the analysis of multivariate time series, multivariate SV models
with more general specifications have been proposed to capture the
correlations between the innovations of the asset returns and those
of the latent volatility processes. Due to some restrictions on the
variance-covariance matrix of the innovation vectors, the estimation
of the multivariate SV (MSV) model is challenging. To tackle this
issue, for a very general setting of an MSV model we propose a
straightforward MCMC method in which a Metropolis-Hastings method is
employed to sample the constrained variance-covariance matrix, where
the proposal distribution is an inverse Wishart distribution. Again,
the log volatilities of the asset returns can then be simulated via
a single-move slice sampler.
Recently, factor SV models have been proposed to extract hidden
market changes. Geweke and Zhou (1996) propose a factor SV model
based on factor analysis to measure pricing errors in the context of
the arbitrage pricing theory by letting the factors follow the
univariate standard normal distribution. Some modifications of this
model have been proposed, among others, by Pitt and Shephard (1999a)
and Jacquier et al. (1999). The main feature of the factor SV
models is that the factors follow a univariate SV process, where the
loading matrix is a lower triangular matrix with unit entries on the
main diagonal. Although the factor SV models have been successful in
practice, it has been recognized that the order of the components may
affect the sample likelihood and the selection of the factors.
Therefore, in applications, the component order has to be considered
carefully. For instance, the factor SV model should be fitted to
several permutations of the data to check whether the ordering affects
the estimation results. In the thesis, a new factor SV model is
proposed. Instead of setting the loading matrix to be lower
triangular, we set it to be column-orthogonal and assume that each
column has unit length. Our method removes the permutation problem,
since the model does not need to be refitted when the order is
changed. Since a strong assumption is imposed on the loading
matrix, the estimation seems even harder than for the previous factor
models. For example, we have to sample columns of the loading matrix
while keeping them orthonormal. To tackle this issue, we use
the Metropolis-Hastings method to sample the loading matrix one
column at a time, while the orthonormality between the columns is
maintained using the technique proposed by Hoff (2007). A von
Mises-Fisher distribution is sampled and the generated vector is
accepted through the Metropolis-Hastings algorithm.
Simulation studies and applications to real data are conducted to
examine our inference methods and test the fit of our model.
Empirical evidence illustrates that our slice sampler within MCMC
methods works well in terms of parameter estimation and volatility
forecast. Examples using financial asset return data are provided to
demonstrate that the proposed factor SV model is able to
characterize the hidden market factors that mainly govern the
financial time series. The Kolmogorov-Smirnov tests conducted on
the estimated models indicate that the models do a reasonable job in
terms of describing real data.
|
33 |
Bayesian inference for source determination in the atmospheric environment
Keats, William Andrew, January 2009
In the event of a hazardous release (chemical, biological, or radiological) in an urban environment, monitoring agencies must have the tools to locate and characterize the source of the emission in order to respond and minimize damage. Given a finite and noisy set of concentration measurements, determining the source location, strength and time of release is an ill-posed inverse problem. We treat this problem using Bayesian inference, a framework under which uncertainties in modelled and measured concentrations can be propagated, in a consistent, rigorous manner, toward a final probabilistic estimate for the source.
The Bayesian methodology operates independently of the chosen dispersion model, meaning it can be applied equally well to problems in urban environments, at regional scales, or at global scales. Both Lagrangian stochastic (particle-tracking) and Eulerian (fixed-grid, finite-volume) dispersion models have been used successfully. Calculations are accomplished efficiently by using adjoint (backward) dispersion models, which reduces the computational effort required from calculating one [forward] plume per possible source configuration to calculating one [backward] plume per detector. Markov chain Monte Carlo (MCMC) is used to efficiently sample from the posterior distribution for the source parameters; both the Metropolis-Hastings and hybrid Hamiltonian algorithms are used.
In this thesis, four applications falling under the rubric of source determination are addressed: dispersion in highly disturbed flow fields characteristic of built-up (urban) environments; dispersion of a nonconservative scalar over flat terrain in a statistically stationary and horizontally homogeneous (turbulent) wind field; optimal placement of an auxiliary detector using a decision-theoretic approach; and source apportionment of particulate matter (PM) using a chemical mass balance (CMB) receptor model. For the first application, the data sets used to validate the proposed methodology include a water-channel simulation of the near-field dispersion of contaminant plumes in a large array of building-like obstacles (Mock Urban Setting Trial) and a full-scale field experiment (Joint Urban 2003) in Oklahoma City. For the second and third applications, the background wind and terrain conditions are based on those encountered during the Project Prairie Grass field experiment; mean concentration and turbulent scalar flux data are synthesized using a Lagrangian stochastic model where necessary. In the fourth and final application, Bayesian source apportionment results are compared to the US Environmental Protection Agency's standard CMB model using a test case involving PM data from Fresno, California. For each of the applications addressed in this thesis, combining Bayesian inference with appropriate computational techniques results in a computationally efficient methodology for performing source determination.
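A toy version of the source-determination problem can be sketched with a random-walk Metropolis-Hastings sampler. This is a simplified stand-in, not the thesis's method: the one-dimensional Gaussian-shaped `forward_model`, the flat priors, the detector layout and all parameter values are assumptions chosen for illustration, and the model is evaluated forward for each proposal, without the adjoint formulation described above.

```python
import math
import random


def forward_model(source_x, strength, detector_x, spread=1.5):
    """Hypothetical dispersion model: concentration at a detector."""
    return strength * math.exp(-0.5 * ((detector_x - source_x) / spread) ** 2)


def log_posterior(params, detectors, observed, noise_sd):
    """Gaussian measurement noise with flat priors on a bounded box."""
    source_x, strength = params
    if not (0.0 <= source_x <= 10.0 and 0.0 < strength <= 10.0):
        return -math.inf
    sq_err = sum((obs - forward_model(source_x, strength, d)) ** 2
                 for d, obs in zip(detectors, observed))
    return -0.5 * sq_err / noise_sd ** 2


def metropolis_hastings(log_post, start, n_iter, step, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    current = list(start)
    current_lp = log_post(current)
    chain = []
    for _ in range(n_iter):
        proposal = [v + rng.gauss(0.0, step) for v in current]
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain


def run_demo(seed=0):
    """Synthesize noisy data, then return posterior means (location, strength)."""
    rng = random.Random(seed)
    detectors = [float(i) for i in range(11)]
    true_x, true_q, noise_sd = 4.3, 2.0, 0.05
    observed = [forward_model(true_x, true_q, d) + rng.gauss(0.0, noise_sd)
                for d in detectors]
    chain = metropolis_hastings(
        lambda p: log_posterior(p, detectors, observed, noise_sd),
        start=(5.0, 1.0), n_iter=20000, step=0.05, rng=rng)
    kept = chain[5000:]  # discard burn-in
    mean_x = sum(c[0] for c in kept) / len(kept)
    mean_q = sum(c[1] for c in kept) / len(kept)
    return mean_x, mean_q


if __name__ == "__main__":
    print(run_demo())
```

The retained chain approximates the posterior over source location and strength, so the spread of the draws, not only their mean, is the probabilistic estimate the abstract refers to.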
|
35 |
An Evaluation of Clustering and Classification Algorithms in Life-Logging Devices
Amlinger, Anton, January 2015
Using life-logging devices and wearables is a growing trend in today’s society. These yield vast amounts of information, data that cannot be overseen or grasped at a glance due to its size. Gathering a qualitative, comprehensible overview of this quantitative information is essential for life-logging services to serve their purpose. This thesis provides an overview comparison of CLARANS, DBSCAN and SLINK, representing different branches of clustering algorithm types, as tools for activity detection in geo-spatial data sets. These activities are then classified using a simple model with model parameters learned via Bayesian inference, as a demonstration of a different branch of clustering. Results are provided using Silhouettes as evaluation for geo-spatial clustering and a user study for the end classification. The results are promising as an outline for a framework of classification and activity detection, and shed light on various pitfalls that might be encountered during implementation of such a service.
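The Silhouette evaluation mentioned above can be sketched in a few lines. This is a plain-Python illustration using the standard silhouette definition with Euclidean distance. The function name and the toy coordinates are assumptions, and the thesis may treat details such as DBSCAN noise points or geographic distances differently.

```python
import math


def silhouette_scores(points, labels):
    """Silhouette value s = (b - a) / max(a, b) for every point.

    a: mean distance to the other members of the point's own cluster.
    b: smallest mean distance to the members of any other cluster.
    Requires at least two distinct labels; singleton clusters score 0.
    """
    clusters = sorted(set(labels))
    scores = []
    for i, (point, own) in enumerate(zip(points, labels)):
        mean_dist = {}
        for cluster in clusters:
            dists = [math.dist(point, other)
                     for j, (other, lab) in enumerate(zip(points, labels))
                     if lab == cluster and j != i]
            if dists:
                mean_dist[cluster] = sum(dists) / len(dists)
        if own not in mean_dist:  # the point is alone in its cluster
            scores.append(0.0)
            continue
        a = mean_dist[own]
        b = min(v for c, v in mean_dist.items() if c != own)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


if __name__ == "__main__":
    # Two compact, well-separated groups of positions (toy coordinates).
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    good = silhouette_scores(points, [0, 0, 1, 1])
    bad = silhouette_scores(points, [0, 1, 0, 1])
    print(sum(good) / len(good), sum(bad) / len(bad))
```

Averaging the per-point values gives one score per clustering, so the output of different algorithms on the same data set can be compared directly; values near 1 indicate compact, well-separated clusters and negative values indicate likely misassignment.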
|
36 |
Forward and inverse modeling of fire physics towards fire scene reconstructions
Overholt, Kristopher James, 06 November 2013
Fire models are routinely used to evaluate life safety aspects of building design projects and are being used more often in fire and arson investigations as well as reconstructions of firefighter line-of-duty deaths and injuries. A fire within a compartment effectively leaves behind a record of fire activity and history (i.e., fire signatures). Fire and arson investigators can utilize these fire signatures in the determination of cause and origin during fire reconstruction exercises. Researchers conducting fire experiments can utilize this record of fire activity to better understand the underlying physics. In all of these applications, the fire heat release rate (HRR), location of a fire, and smoke production are important parameters that govern the evolution of thermal conditions within a fire compartment. These input parameters can be a large source of uncertainty in fire models, especially in scenarios in which experimental data or detailed information on fire behavior are not available. To better understand fire behavior indicators related to soot, the deposition of soot onto surfaces was considered. Improvements to a soot deposition submodel were implemented in a computational fluid dynamics (CFD) fire model. To better understand fire behavior indicators related to fire size, an inverse HRR methodology was developed that calculates a transient HRR in a compartment based on measured temperatures resulting from a fire source. To address issues related to the uncertainty of input parameters, an inversion framework was developed that has applications towards fire scene reconstructions. Rather than using point estimates of input parameters, a statistical inversion framework based on the Bayesian inference approach was used to determine probability distributions of input parameters. 
These probability distributions contain uncertainty information about the input parameters and can be propagated through fire models to obtain uncertainty information about predicted quantities of interest. The Bayesian inference approach was applied to various fire problems and coupled with zone and CFD fire models to extend the physical capability and accuracy of the inversion framework. Example applications include the estimation of both steady-state and transient fire sizes in a compartment, material properties related to pyrolysis, and the location of a fire in a compartment.
|
37 |
Value of information and the accuracy of discrete approximations
Ramakrishnan, Arjun, 03 January 2011
Value of information (VOI) is one of the key features of decision analysis. This work provides a consistent and functional methodology for determining the VOI of proposed well tests in the presence of uncertainties. It aims to show that VOI analysis using discretized versions of continuous probability distributions in conventional decision trees can be very accurate if the optimal method of discrete approximation is chosen, without resorting to methods such as Monte Carlo simulation. Simplifying the probability calculations in this way need not entail a loss of accuracy. Both the prior and posterior probability distributions are assumed to be continuous and are discretized to find the VOI. This results in two stages of discretization in the decision tree. Another interesting feature is that a level of decision making lies between the two discrete approximations in the decision tree. This sets the problem apart from conventional discretized models: because of the intervening decision, accuracy does not follow the rules and conventions that apply to ordinary discrete models.
The initial part of the work varies the number of points chosen in the discrete model and tests the resulting accuracy for different correlation coefficients between the information and the actual values. The latter part compares existing discretization methods and establishes conditions under which each is optimal. The problem is treated comprehensively for both a risk-neutral and a risk-averse decision maker.
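To make the accuracy question concrete, the sketch below compares a three-point discretization with the exact answer in a deliberately simplified setting: a risk-neutral decision maker, a normally distributed prospect value, a drill/do-not-drill decision, and perfect information. All of these are assumptions made for the example, as are the function names and the numbers. The three-point rule is the extended Pearson-Tukey approximation (5th, 50th and 95th percentiles with weights 0.185, 0.630 and 0.185). The setting studied in the work, with imperfect information and two discretization stages, is richer than this.

```python
from statistics import NormalDist


def three_point(dist):
    """Extended Pearson-Tukey discretization: (value, probability) pairs."""
    return [(dist.inv_cdf(0.05), 0.185),
            (dist.inv_cdf(0.50), 0.630),
            (dist.inv_cdf(0.95), 0.185)]


def evpi_discrete(outcomes):
    """Expected value of perfect information on a discretized prospect.

    Without information: drill only if the expected value is positive.
    With perfect information: drill only in outcomes with positive value.
    """
    value_without = max(sum(p * v for v, p in outcomes), 0.0)
    value_with = sum(p * max(v, 0.0) for v, p in outcomes)
    return value_with - value_without


def evpi_exact_normal(mean, sd):
    """Closed form for a normal prospect: E[max(V, 0)] - max(E[V], 0)."""
    std = NormalDist()
    z = mean / sd
    value_with = mean * std.cdf(z) + sd * std.pdf(z)
    return value_with - max(mean, 0.0)


if __name__ == "__main__":
    prospect = NormalDist(mu=10.0, sigma=50.0)  # hypothetical monetary units
    print(evpi_discrete(three_point(prospect)))  # about 13.4
    print(evpi_exact_normal(10.0, 50.0))         # about 15.3
```

In this example the three-point tree understates the value of perfect information by roughly 13 percent, which illustrates why the choice of discretization matters when a decision sits downstream of the approximation.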
|
38 |
Valid estimation and prediction inference in analysis of a computer model
Nagy, Béla, 11 1900
Computer models or simulators are becoming increasingly common in many fields in science and engineering, powered by the phenomenal growth in computer hardware over the past decades. Many of these simulators implement a particular mathematical model as a deterministic computer code, meaning that running the simulator again with the same input gives the same output.
Often running the code involves some computationally expensive tasks, such as solving complex systems of partial differential equations numerically. When simulator runs become too long, it may limit their usefulness. In order to overcome time or budget constraints by making the most out of limited computational resources, a statistical methodology has been proposed, known as the "Design and Analysis of Computer Experiments".
The main idea is to run the expensive simulator only at a relatively few, carefully chosen design points in the input space, and based on the outputs construct an emulator (statistical model) that can emulate (predict) the output at new, untried locations at a fraction of the cost. This approach is useful provided that we can measure how much the predictions of the cheap emulator deviate from the real response surface of the original computer model.
One way to quantify emulator error is to construct pointwise prediction bands designed to envelope the response surface and make assertions that the true response (simulator output) is enclosed by these envelopes with a certain probability. Of course, to be able to make such probabilistic statements, one needs to introduce some kind of randomness. A common strategy that we use here is to model the computer code as a random function, also known as a Gaussian stochastic process. We concern ourselves with smooth response surfaces and use the Gaussian covariance function that is ideal in cases when the response function is infinitely differentiable.
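The emulator idea can be sketched for a one-dimensional input as below. This is a bare-bones zero-mean Gaussian process predictor with a Gaussian covariance function and fixed, hand-picked hyperparameters. It is not the FBI method of the thesis; the function names, the `nugget` jitter, the `length_scale` and `variance` values and the sine test function are assumptions introduced for illustration.

```python
import math


def gaussian_cov(x1, x2, variance=1.0, length_scale=1.0):
    """Gaussian (squared-exponential) covariance function."""
    return variance * math.exp(-((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_emulator(xs, ys, nugget=1e-10):
    """Factor the design covariance matrix and precompute K^{-1} y."""
    k = [[gaussian_cov(a, b) + (nugget if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    low = cholesky(k)
    return low, solve_cholesky(low, ys)


def predict(x_new, xs, low, weights, z=1.96):
    """Predictive mean and pointwise band (lower, upper) at x_new."""
    k_star = [gaussian_cov(x_new, a) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, weights))
    v = solve_cholesky(low, k_star)
    var = gaussian_cov(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    half = z * math.sqrt(max(var, 0.0))  # clamp tiny negative rounding error
    return mean, mean - half, mean + half


if __name__ == "__main__":
    design = [0.0, 1.0, 2.0, 3.0, 4.0]
    outputs = [math.sin(x) for x in design]  # stand-in for simulator runs
    low, weights = fit_emulator(design, outputs)
    print(predict(1.5, design, low, weights))
```

At a design point the band collapses to almost zero width because the simulator is deterministic, and it widens between design points. Whether such bands achieve their nominal coverage depends on how the covariance parameters are handled, which is the question the thesis addresses.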
In this thesis, we propose Fast Bayesian Inference (FBI), a method that is both computationally efficient and implementable as a black box. Simulation results show that it can achieve remarkably accurate prediction uncertainty assessments in terms of matching coverage probabilities of the prediction bands, and the associated reparameterizations can also help parameter uncertainty assessments.
|
39 |
Bayesian Phylogenetic Inference: Estimating Diversification Rates from Reconstructed Phylogenies
Höhna, Sebastian, January 2013
Phylogenetics is the study of the evolutionary relationship between species. Inference of phylogeny relies heavily on statistical models that have been extended and refined tremendously over the past years into very complex hierarchical models. Paper I introduces probabilistic graphical models to statistical phylogenetics and elaborates on the potential advantages a unified graphical model representation could have for the community, e.g., by facilitating communication and improving reproducibility of statistical analyses of phylogeny and evolution. Once the phylogeny is reconstructed it is possible to infer the rates of diversification (speciation and extinction). In this thesis I extend the birth-death process model, so that it can be applied to incompletely sampled phylogenies, that is, phylogenies of only a subsample of the presently living species from one group. Previous work only considered the case when every species had the same probability to be included and here I examine two alternative sampling schemes: diversified taxon sampling and cluster sampling. Paper II introduces these sampling schemes under a constant rate birth-death process and gives the probability density for reconstructed phylogenies. These models are extended in Paper IV to time-dependent diversification rates, again, under different sampling schemes and applied to empirical phylogenies. Paper III focuses on fast and unbiased simulations of reconstructed phylogenies. The efficiency is achieved by deriving the analytical distribution and density function of the speciation times in the reconstructed phylogeny.
At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 1: Manuscript. Paper 4: Accepted.
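A forward simulation of the constant-rate birth-death process, together with the uniform sampling scheme that the abstract attributes to previous work, can be sketched as follows. It tracks only the number of lineages, not the tree, and is a naive event-by-event simulation, not the fast method of Paper III. The function names and the parameters `birth`, `death`, `t_end` and `rho` are assumptions introduced here.

```python
import math
import random


def simulate_lineage_count(birth, death, t_end, rng, n_start=1):
    """Number of lineages alive at t_end under a constant-rate
    birth-death process (event-by-event simulation).

    Assumes birth + death > 0.
    """
    n, t = n_start, 0.0
    while n > 0:
        # Waiting time to the next event among n independent lineages.
        t += rng.expovariate(n * (birth + death))
        if t > t_end:
            break
        if rng.random() < birth / (birth + death):
            n += 1  # speciation
        else:
            n -= 1  # extinction
    return n


def sample_uniformly(n_species, rho, rng):
    """Uniform taxon sampling: each living species is included
    independently with probability rho."""
    return sum(1 for _ in range(n_species) if rng.random() < rho)


if __name__ == "__main__":
    rng = random.Random(7)
    counts = [simulate_lineage_count(1.0, 0.5, 2.0, rng) for _ in range(5000)]
    sampled = [sample_uniformly(n, 0.5, rng) for n in counts]
    # Expected lineage count is n_start * exp((birth - death) * t_end).
    print(sum(counts) / len(counts), math.exp((1.0 - 0.5) * 2.0))
    print(sum(sampled) / len(sampled))
```

Diversified and cluster sampling, the two schemes examined in the thesis, choose species in a way that depends on the tree, so they cannot be reproduced from lineage counts alone.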
|
40 |
Parallel algorithms for generalized N-body problem in high dimensions and their applications for bayesian inference and image analysis
Xiao, Bo, 12 January 2015
In this dissertation, we explore parallel algorithms for general N-Body problems in high dimensions, and their applications in machine learning and image analysis on distributed infrastructures.
In the first part of this work, we proposed and developed a set of basic tools built on top of Message Passing Interface and OpenMP for massively parallel nearest neighbors search. In particular, we present a distributed tree structure to index data in an arbitrary number of dimensions, and a novel algorithm that eliminates the need for collective coordinate exchanges during tree construction. To the best of our knowledge, our nearest neighbors package is the first attempt that scales to millions of cores in up to a thousand dimensions.
Based on our nearest neighbors search algorithms, we present "ASKIT", a parallel fast kernel summation tree code with a new near-far field decomposition and a new compact representation for the far field. In particular, our algorithm is kernel independent. The efficiency of the new near-far decomposition depends only on the intrinsic dimensionality of the data, and the new far field representation relies only on the rank of sub-blocks of the kernel matrix.
In the second part, we developed a Bayesian inference framework and a variational formulation for a MAP estimation of the label field for medical image segmentation. In particular, we propose new representations for both likelihood probability and prior probability functions, as well as their fast calculation. Then a parallel matrix-free optimization algorithm is given to solve the MAP estimation. Our new prior function is suitable for many spatial inverse problems.
Experimental results show our framework is robust to noise, variations of shapes and artifacts.
|