41 |
Machine Learning Techniques for Large-Scale System Modeling
Lv, Jiaqing, 31 August 2011
This thesis addresses three issues in system modeling. The first is a parsimonious representation of the MISO Hammerstein system, obtained by projecting the multivariate linear function into a univariate input function space. This leads to the so-called semiparametric Hammerstein model, which overcomes the well-known "curse of dimensionality" in nonparametric estimation for MISO systems. The second issue is orthogonal expansion analysis of a univariate Hammerstein model and hypothesis testing for the structure of the nonlinear subsystem. A generalization of this technique can be used to test the validity of parametric assumptions about the nonlinear function in Hammerstein models. It can also be applied to approximate a general nonlinear function by a certain class of parametric functions in Hammerstein models. With slight modification, these techniques extend to other block-oriented systems, e.g., Wiener systems. The third issue is the application of machine learning and system modeling techniques to transient stability studies in power engineering. Simultaneous variable selection and estimation lead to substantially reduced complexity while yielding stronger predictive power than techniques known so far in the power engineering literature.
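For readers unfamiliar with the block structure, the following is a minimal sketch of a single-input Hammerstein system: a static nonlinearity followed by a linear dynamic block. The FIR form of the linear block, the function name `simulate_hammerstein`, and the example nonlinearity are assumptions chosen for illustration; they are not taken from the thesis, which treats the MISO case.

```python
def simulate_hammerstein(u, nonlinearity, impulse_response):
    """Simulate a SISO Hammerstein system.

    The input u passes through a static nonlinearity, and the result is
    filtered by a linear FIR block with the given impulse response.
    """
    # Static nonlinear block: v_t = f(u_t)
    v = [nonlinearity(x) for x in u]

    # Linear dynamic block: y_t = sum_k h_k * v_{t-k}
    y = []
    for t in range(len(v)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if t - k >= 0:
                acc += h * v[t - k]
        y.append(acc)
    return y


# Hypothetical example: quadratic nonlinearity, two-tap linear filter.
output = simulate_hammerstein([1.0, 2.0, 3.0], lambda x: x * x, [1.0, 0.5])
```

A Wiener system reverses the order of the two blocks, applying the linear filter first and the static nonlinearity second.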
|
42 |
Second-order least squares estimation in dynamic regression models
AbdelAziz Salamh, Mustafa, 16 April 2014
In this dissertation we propose two generalizations of the Second-Order Least Squares (SLS) approach for two popular dynamic econometric models. The first is the regression model with a time-varying nonlinear mean function and autoregressive conditionally heteroskedastic (ARCH) disturbances. The second is a linear dynamic panel data model.
We use a semiparametric framework in both models, where the SLS approach is based only on the first two conditional moments of the response variable given the explanatory variables. There is no need to specify the distribution of the error components in either model. For the ARCH model, under the assumption of a strong-mixing process with finite moments of some order, we establish the strong consistency and asymptotic normality of the SLS estimator.
It is shown that the optimal SLS estimator (SLSE), which makes use of the additional information inherent in the conditional skewness and kurtosis of the process, is superior to the commonly used quasi-MLE, and the efficiency gain is significant when the underlying distribution is asymmetric. Moreover, our large-scale simulation studies show that the optimal SLSE behaves better than the corresponding estimating function estimator in finite samples. The practical usefulness of the optimal SLSE is tested in an empirical example on U.K. inflation. For the linear dynamic panel data model, we show that the SLS estimator is consistent and asymptotically normal for large N and finite T under fairly general regularity conditions. Moreover, we show that the optimal SLS estimator reaches a semiparametric efficiency bound. A specification test is developed for the first time, to be used whenever SLS is applied to real data. Our Monte Carlo simulations show that the optimal SLS estimator performs satisfactorily in finite samples compared to the first-differenced GMM and the random effects pseudo ML estimators. The results apply to stationary and nonstationary processes, with or without exogenous regressors. The performance of the optimal SLS estimator is robust in the near-unit-root case. Finally, the practical usefulness of the optimal SLSE is examined in an empirical study on U.S. airfares.
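To make the "first two conditional moments" idea concrete, here is a minimal sketch of an SLS objective for a plain static regression with homoskedastic errors. The identity weight matrix, the function name `sls_objective`, and the toy data are assumptions for illustration only; the dissertation's ARCH and panel versions, and its optimal weighting, are more involved and are not reproduced here.

```python
def sls_objective(beta, sigma2, x, y, mean_fn):
    """Second-order least squares objective with identity weighting.

    For each observation, two residuals are formed:
      r1 = y - m(x; beta)                     (first conditional moment)
      r2 = y^2 - m(x; beta)^2 - sigma2        (second conditional moment)
    The objective is the sum of r1^2 + r2^2 over all observations.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        m = mean_fn(xi, beta)
        r1 = yi - m
        r2 = yi ** 2 - m ** 2 - sigma2
        total += r1 * r1 + r2 * r2
    return total


# Hypothetical noiseless data generated by y = 2 * x.
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
linear_mean = lambda xi, b: b * xi

at_truth = sls_objective(2.0, 0.0, x_data, y_data, linear_mean)
off_truth = sls_objective(1.0, 0.0, x_data, y_data, linear_mean)
```

The SLS estimate is the pair (beta, sigma2) that minimizes this objective; in practice a numerical optimizer would be used, and the identity weight would be replaced by a weight matrix chosen for efficiency.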
|
43 |
Monotonic and Semiparametric Regression for the Detection of Trends in Environmental Quality Data
Hussian, Mohamed, January 2005
Natural fluctuations in the state of the environment can long conceal or distort important trends in the human impact on our ecosystems. Accordingly, there is increasing interest in statistical normalisation techniques that can clarify the anthropogenic effects by removing meteorologically driven fluctuations and other natural variation in time series of environmental quality data. This thesis shows that semi- and nonparametric regression methods can provide effective tools for applying such normalisation to collected data. In particular, it is demonstrated how monotonic regression can be utilised in this context. A new numerical algorithm for this type of regression can accommodate two or more discrete or continuous explanatory variables, which enables simultaneous estimation of a monotonic temporal trend and correction for one or more covariates that have a monotonic relationship with the response variable under consideration. To illustrate the method, a case study of mercury levels in fish is presented, using body length and weight as covariates. Semiparametric regression techniques enable trend analyses in which a nonparametric representation of temporal trends is combined with parametrically modelled corrections for covariates. Here, it is described how such models can be employed to extract trends from data collected over several seasons, and this procedure is exemplified by discussing how temporal trends in the load of nutrients carried by the Elbe River can be detected while adjusting for water discharge and other factors. In addition, it is shown how semiparametric models can be used for joint normalisation of several time series of data.
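As background for the monotonic regression mentioned above, the sketch below shows the classic pool-adjacent-violators algorithm for a single ordered explanatory variable. This is the textbook one-dimensional method, not the thesis's new algorithm for two or more explanatory variables; the function name `pava` and the sample values are illustrative assumptions.

```python
def pava(y, w=None):
    """Least-squares nondecreasing fit to y (pool-adjacent-violators)."""
    if w is None:
        w = [1.0] * len(y)

    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool neighbouring blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])

    # Expand the pooled blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical yearly measurements with two local reversals.
trend = pava([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
```

Each reversal in the input is replaced by the average of the pooled values, which gives the closest nondecreasing sequence in the least-squares sense.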
|
44 |
Semiparametric Methods for the Analysis of Progression-Related Endpoints
Boruvka, Audrey, January 2013
Use of progression-free survival in the evaluation of clinical interventions is hampered by a variety of issues, including censoring patterns not addressed in the usual methods for survival analysis. Progression can be right-censored before survival or interval-censored between inspection times. Current practice calls for imputing events to their time of detection. Such an approach is prone to bias, underestimates standard errors and makes inefficient use of the data at hand. Moreover a composite outcome prevents inference about the actual treatment effect on the risk of progression. This thesis develops semiparametric and sieve maximum likelihood estimators to more formally analyze progression-related endpoints. For the special case where death rarely precedes progression, a Cox-Aalen model is proposed for regression analysis of time-to-progression under intermittent inspection. The general setting considering both progression and survival is examined with a Markov Cox-type illness-death model under various censoring schemes. All of the resulting estimators globally converge to the truth slower than the parametric rate, but their finite-dimensional components are asymptotically efficient. Numerical studies suggest that the new methods perform better than their imputation-based alternatives under moderate to large samples having higher rates of censoring.
|
46 |
Efficient Semiparametric Estimators for Nonlinear Regressions and Models under Sample Selection Bias
Kim, Mi Jeong, August 2012
We study the consistency, robustness and efficiency of parameter estimation in different but related models via a semiparametric approach. First, we revisit the second-order least squares estimator proposed in Wang and Leblanc (2008) and show that the estimator reaches semiparametric efficiency. We further extend the method to heteroscedastic error models and propose a semiparametric efficient estimator in this more general setting. Second, we study a class of semiparametric skewed distributions arising when the sample selection process causes sampling bias in the observations. We begin by assuming an anti-symmetric property for the skewing function. Taking into account the symmetric nature of the population distribution, we propose consistent estimators for the center of the symmetric population. These estimators are robust to model misspecification and reach the minimum possible estimation variance. Next, we extend the model to permit a more flexible skewing structure. Without assuming a particular form of the skewing function, we propose both consistent and efficient estimators for the center of the symmetric population using a semiparametric method. We also analyze the asymptotic properties and derive the corresponding inference procedures. Numerical results are provided to support these findings and to illustrate the finite-sample performance of the proposed estimators.
|
47 |
A avaliação do impacto de um treinamento utilizando Propensity Score Matching : uma abordagem não-paramétrica e semiparamétrica
Silveira, Luiz Felipe de Vasconcellos, January 2015
The goal of this thesis is to evaluate the impact of a job training program for workers using propensity score matching under two approaches, one nonparametric and one semiparametric. For nonparametric estimation, the method proposed by Li, Racine and Wooldridge (2009) was used; for semiparametric estimation, the Generalized Additive Model proposed by Hastie and Tibshirani (1990) was used. The results indicate that both methods provide estimates as good as or better than those obtained by parametric estimation.
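The matching step itself can be sketched as follows, assuming the propensity scores have already been estimated by whichever method is preferred. The one-to-one nearest-neighbour rule with replacement, the function name `att_nearest_neighbor`, and the toy numbers are illustrative assumptions; the thesis's nonparametric and GAM score estimators are not reproduced here.

```python
def att_nearest_neighbor(scores, treated, outcomes):
    """Average treatment effect on the treated via 1-NN score matching.

    Each treated unit is matched (with replacement) to the control unit
    whose propensity score is closest, and outcome differences are averaged.
    """
    treated_idx = [i for i, t in enumerate(treated) if t]
    control_idx = [i for i, t in enumerate(treated) if not t]
    if not treated_idx or not control_idx:
        raise ValueError("need at least one treated and one control unit")

    diffs = []
    for i in treated_idx:
        # Closest control unit in terms of estimated propensity score.
        j = min(control_idx, key=lambda c: abs(scores[c] - scores[i]))
        diffs.append(outcomes[i] - outcomes[j])
    return sum(diffs) / len(diffs)


# Hypothetical data: two trained workers, three untrained workers.
scores = [0.80, 0.60, 0.75, 0.55, 0.20]
treated = [1, 1, 0, 0, 0]
outcomes = [10.0, 8.0, 7.0, 6.0, 3.0]
att = att_nearest_neighbor(scores, treated, outcomes)
```

The choice of score estimator changes only the `scores` input; the matching and averaging logic stays the same.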
|
49 |
Spline-based sieve semiparametric generalized estimating equation for panel count data
Hua, Lei, 01 May 2010
In this thesis, we propose to analyze panel count data using a spline-based sieve generalized estimating equation method with a semiparametric proportional mean model $E(N(t) \mid Z) = \Lambda_0(t)\, e^{\beta_0^T Z}$. The natural log of the baseline mean function, $\log \Lambda_0(t)$, is approximated by a monotone cubic B-spline function. The estimates of the regression parameters and spline coefficients are the roots of the spline-based sieve generalized estimating equations (sieve GEE). The proposed method avoids assuming any parametric structure for the baseline mean function and the underlying counting process. Selection of an appropriate covariance matrix that represents the true correlation between the cumulative counts improves estimating efficiency.
In addition to the parameters in the proportional mean function, estimation that accounts for over-dispersion and autocorrelation involves an extra nuisance parameter $\sigma^2$, which can be estimated using a method of moments proposed by Zeger (1988). The parameters in the mean function are then estimated by solving the pseudo generalized estimating equation with $\sigma^2$ replaced by its estimate, $\sigma_n^2$. We show that the estimate of $(\beta_0, \Lambda_0)$ based on this two-stage approach is still consistent and can converge at the optimal rate in the nonparametric/semiparametric regression setting. The asymptotic normality of the estimate of $\beta_0$ is also established. We further propose a spline-based projection variance estimating method and show its consistency.
Simulation studies are conducted to investigate the finite-sample performance of the sieve semiparametric GEE estimates, as well as of different variance estimating methods, under different sample sizes. The covariance matrix that accounts for over-dispersion generally increases estimating efficiency when over-dispersion is present in the data. Finally, the proposed method with different covariance matrices is applied to real data from a bladder tumor clinical trial.
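The proportional mean model can be illustrated with a few lines of code. This is only a sketch of the mean structure, with a hypothetical linear baseline and coefficient; it does not implement the B-spline approximation or the sieve GEE estimation described in the thesis.

```python
import math


def proportional_mean(t, z, beta, baseline_mean):
    """Expected cumulative count E(N(t) | Z) = Lambda_0(t) * exp(beta' Z)."""
    linear_predictor = sum(b * zi for b, zi in zip(beta, z))
    return baseline_mean(t) * math.exp(linear_predictor)


# Hypothetical baseline mean function and regression coefficient.
baseline = lambda t: 2.0 * t
beta = [0.5]

# Ratio of expected counts between Z = 1 and Z = 0 at two time points.
ratio_early = (proportional_mean(1.0, [1.0], beta, baseline)
               / proportional_mean(1.0, [0.0], beta, baseline))
ratio_late = (proportional_mean(4.0, [1.0], beta, baseline)
              / proportional_mean(4.0, [0.0], beta, baseline))
```

The two ratios coincide, which is the defining feature of the model: covariates scale the baseline mean by a factor that does not depend on time.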
|
50 |
Essays in Macroeconomics and Finance
Hu, Yushan, January 2020
Thesis advisor: Fabio Schiantarelli / Thesis advisor: Zhijie Xiao / This dissertation consists of three essays in macroeconomics and finance. The first and second chapters analyze the impact of financial shocks and the anti-corruption campaign on Chinese firms through the bank lending channel. The third chapter provides a new method to predict cash flow from operations (CFO) via semiparametric estimation and machine learning.
The first chapter explores the impact of the financial crisis and the sovereign debt crisis on Chinese firms through the bank lending channel and the firm borrowing channel. Using new data linking Chinese firms to their bank(s) and four different measures of exposure to international markets (international borrowing, importance of lending to foreign listed companies, share of trade settlement, and exchange/income), I find that banks with higher exposure to international markets cut lending more during the recent financial crisis. In addition, state-owned bank loans are more pro-cyclical than private bank loans. Moreover, banks with higher exposure to international markets cut lending more when there is a negative shock to OECD GDP growth. With regard to the firm borrowing channel, I find that firms with higher weighted aggregate exposure to international markets through banks have lower net debt, cash, employment, and capital investment during the financial crisis. Firms with higher weighted aggregate exposure to global markets have higher net debt and lower cash, employment, and capital investment when there is a negative shock to OECD GDP growth. This paper also provides a theoretical model to explain the mechanism in a partially open economy like China.
The second chapter discusses the impact of the anti-corruption campaign on Chinese firms through the bank lending channel. Using confidential data linking Chinese firms to their bank(s) and a prefecture-level corruption index, I find that banks located in more corrupt prefectures offer significantly less credit before the anti-corruption investigation, and that this effect reverses direction after the investigation. Moreover, banks located in more corrupt prefectures tend to use higher interest rates, longer maturities, and more collateral before the campaign; all of these effects reverse direction after the campaign. This paper suggests that banks located in more corrupt prefectures have stronger monopoly power (or higher markups and lower efficiency). This monopoly effect is supported by the finding that both the bank concentration ratio and banks' bad loans are higher in more corrupt areas, and that these effects disappear after the campaign.
The third chapter considers methods for predicting cash flow from operations (CFO). Forecasting CFO is an essential topic in financial econometrics and empirical accounting. It affects a variety of economic decisions, including valuation methodologies employing discounted cash flows, distress prediction, risk assessment, the accuracy of credit-rating predictions, and the provision of value-relevant information to security markets. The existing literature on statistically based cash-flow prediction has pursued cross-sectional and time-series estimation procedures in a mutually exclusive fashion. Accumulated empirical evidence indicates that the beta value varies across firms of different sizes, and cross-sectional regression cannot capture an idiosyncratic beta. A time-series-based predictive model has the advantage of allowing for firm-specific variability in beta, but it requires a sufficiently long time series. In this paper, we extend the literature on statistically based cash-flow prediction models by introducing an estimation procedure that, in essence, combines the favorable attributes of both: cross-sectional estimation, via the use of "local" cross-sectional data for firms of similar size, and time-series estimation, via the capturing of firm-specific variability in the beta parameters for the independent variables. The local learning approach assumes no a priori knowledge of the constancy of the beta coefficient. It allows the information about coefficients to be represented by only a subset of observations. This feature is particularly relevant in the CFO model, where the beta values are related only to cross-sectional information that is "local" to the firm's size. We provide empirical evidence that the prediction of cash flows from operations is enhanced by jointly adopting features specific to both cross-sectional and time-series modeling.
/ Thesis (PhD) - Boston College, 2020. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Economics.
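One way to picture the "local" cross-sectional idea is a kernel-weighted regression in which firms of similar size receive more weight. The sketch below is an assumption-laden illustration, not the dissertation's estimator: the Gaussian kernel, the single regressor, the function name `local_beta`, and the toy data are all hypothetical.

```python
import math


def local_beta(target_size, sizes, x, y, bandwidth):
    """Size-localized slope from a kernel-weighted simple regression.

    Observations from firms whose size is close to target_size get larger
    Gaussian kernel weights, so the slope reflects similar-sized firms.
    """
    w = [math.exp(-0.5 * ((s - target_size) / bandwidth) ** 2) for s in sizes]
    sw = sum(w)

    # Weighted means of regressor and response.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw

    # Weighted least-squares slope.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return sxy / sxx


# Hypothetical data: small firms follow y = 2x, large firms follow y = 5x.
sizes = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0, 5.0, 10.0, 15.0]

beta_small = local_beta(2.0, sizes, x, y, bandwidth=1.0)
beta_large = local_beta(11.0, sizes, x, y, bandwidth=1.0)
```

A pooled cross-sectional regression on these data would return a single slope between the two groups, whereas the localized fit recovers a different slope for each size range.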
|