1 |
Approximate replication of high-breakdown robust regression techniques. Zeileis, Achim; Kleiber, Christian. January 2008 (PDF)
This paper demonstrates that even regression results obtained by techniques close to the standard ordinary least squares (OLS) method can be difficult to replicate if a stochastic model fitting algorithm is employed. / Series: Research Report Series / Department of Statistics and Mathematics
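As a minimal illustration of the effect described above (not the paper's own code), the sketch below uses MASS::lqs() in R, which fits least trimmed squares by random subsampling; without a fixed RNG seed, repeated runs of the same fit can return different coefficient estimates. The stackloss data set is simply a standard example chosen here.

```r
## Illustrative sketch: a stochastic model fitting algorithm (LTS via random
## subsampling) can yield run-to-run differences unless the seed is controlled.
library(MASS)

set.seed(1)
fit1 <- lqs(stack.loss ~ ., data = stackloss, method = "lts")
fit2 <- lqs(stack.loss ~ ., data = stackloss, method = "lts")  # new random subsamples

coef(fit1)
coef(fit2)   # may differ slightly from fit1

fit3 <- lqs(stack.loss ~ ., data = stackloss, method = "lts", nsamp = 10000)
coef(fit3)   # a more thorough search reduces, but need not remove, the variability
```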
|
2 |
Regression Models for Count Data in R. Zeileis, Achim; Kleiber, Christian; Jackman, Simon. January 2007 (PDF)
The classical Poisson, geometric and negative binomial regression models for count data belong to the family of generalized linear models and are available at the core of the statistics toolbox in the R system for statistical computing. After reviewing the conceptual and computational features of these methods, a new implementation of zero-inflated and hurdle regression models in the functions zeroinfl() and hurdle() from the package pscl is introduced. It re-uses the design and functionality of the basic R functions, just as the underlying conceptual tools extend the classical models. Both model classes are able to incorporate over-dispersion and excess zeros - two problems that typically occur in count data sets in economics and the social and political sciences - better than their classical counterparts. Using cross-section data on the demand for medical care, it is illustrated how the classical as well as the zero-augmented models can be fitted, inspected and tested in practice. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
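A brief sketch of the interface described above. It uses the bioChemists data shipped with pscl rather than the medical-care data from the paper, so the formula and covariates are illustrative only.

```r
## Hedged sketch: zero-inflated and hurdle count regressions with pscl
library(pscl)
data("bioChemists", package = "pscl")

## zero-inflated negative binomial: count component | zero-inflation component
fm_zinb <- zeroinfl(art ~ fem + mar + kid5 + ment | kid5 + ment,
                    data = bioChemists, dist = "negbin")

## hurdle model: truncated count component plus a binary hurdle component
fm_hurdle <- hurdle(art ~ fem + mar + kid5 + ment | kid5 + ment,
                    data = bioChemists, dist = "negbin")

summary(fm_zinb)
AIC(fm_zinb, fm_hurdle)   # compare the two zero-augmented specifications
```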
|
3 |
Exchange Rate Regime Analysis Using Structural Change Methods. Zeileis, Achim; Shah, Ajay; Patnaik, Ila. January 2007 (PDF)
Regression models for de facto currency regime classification are complemented by inferential techniques for tracking the stability of exchange rate regimes. Several structural change methods are adapted to these regressions: tools for assessing the stability of exchange rate regressions in historical data (testing), in incoming data (monitoring), and for determining the breakpoints of shifts in the exchange rate regime (dating). The tools are illustrated by investigating the Chinese exchange rate regime after China gave up its fixed exchange rate to the US dollar in 2005 and by tracking the evolution of the Indian exchange rate regime since 1993. / Series: Research Report Series / Department of Statistics and Mathematics
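A hedged sketch of the testing and dating tasks using the generic structural change tools in the R package strucchange, on which this line of work builds; the simulated currency returns, names, and break location below are invented, and the paper's exchange-rate-specific implementation is not reproduced here (monitoring of incoming data would use mefp(), not shown).

```r
## Generic sketch: stability testing and breakpoint dating for an exchange rate regression
library(strucchange)

set.seed(1)
n   <- 250
usd <- rnorm(n, sd = 0.5); eur <- rnorm(n, sd = 0.5)     # basket-currency returns
cny <- c(1.0 * usd[1:125],                                 # pure dollar peg ...
         0.6 * usd[126:250] + 0.4 * eur[126:250]) +        # ... then a basket regime
       rnorm(n, sd = 0.05)
dat <- data.frame(cny, usd, eur)

## testing: fluctuation test for parameter instability in historical data
sctest(efp(cny ~ usd + eur, data = dat, type = "OLS-CUSUM"))

## dating: estimate the breakpoint(s) of the regime shift
bp <- breakpoints(cny ~ usd + eur, data = dat)
summary(bp)
```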
|
4 |
Advanced Regression Methods in Finance and Economics: Three Essays. Hofmarcher, Paul. 29 March 2012 (PDF)
In this thesis, advanced regression methods are applied to investigate highly relevant research questions in the areas of finance and economics. In the field of credit risk, the thesis investigates a hierarchical model that yields a consensus score when several ratings are available for each firm. Autoregressive processes and random effects are used to model the correlation structure both between and within the obligors in the sample. The model also allows the raters themselves to be validated. The problems of model uncertainty and of multicollinearity between the explanatory variables are addressed in the other two applications. Penalized regressions, such as bridge regressions, are used to handle multicollinearity, while model averaging techniques account for model uncertainty. The second part of the thesis uses Bayesian elastic nets and Bayesian Model Averaging (BMA) techniques to study long-term economic growth. It identifies variables that are significantly related to long-term growth and illustrates the superiority of this approach in terms of predictive accuracy. Finally, the third part combines ridge regressions with BMA to identify macroeconomic variables that are significantly related to aggregated firm failure rates. The estimated results deliver important insights for, e.g., stress-test scenarios. (author's abstract)
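As a hedged sketch of the BMA ingredient only (not the thesis' Bayesian elastic-net or ridge-BMA estimators), the example below runs a standard Bayesian Model Averaging exercise on a growth data set with the R package BMS; priors, chain lengths, and the bundled Fernandez-Ley-Steel data are illustrative choices.

```r
## Sketch: Bayesian Model Averaging for long-term growth regressors
library(BMS)
data("datafls", package = "BMS")   # growth rates plus candidate regressors

bma_growth <- bms(datafls, burn = 10000, iter = 50000,
                  mprior = "uniform", g = "UIP", user.int = FALSE)

coef(bma_growth)[1:10, ]   # posterior inclusion probabilities and posterior means
summary(bma_growth)        # posterior model size and sampling diagnostics
```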
|
5 |
Building a Data Mining Framework for Target Marketing. March, Nicolas. 05 1900 (PDF)
Most retailers and scientists agree that supporting the buying decisions of individual customers or groups of customers with specific product recommendations holds great promise. Target-oriented promotional campaigns are more profitable than uniform methods of sales promotion such as discount pricing campaigns. This seems to be particularly true if the promoted products are well matched to the preferences of the customers or customer groups. But how can retailers identify customer groups and determine which products to offer them? To answer this question, this dissertation describes an algorithmic procedure which identifies customer groups with similar preferences for specific product combinations in recorded transaction data. In addition, for each customer group it recommends products which promise higher sales through cross-selling if appropriate promotion techniques are applied. To illustrate the application of this algorithmic approach, an analysis is performed on the transaction database of a supermarket. The identified customer groups are used for a simulation. The results show that appropriate promotional campaigns which implement this algorithmic approach can achieve an increase in profit of 15% to as much as 191% compared with uniform discounts on the purchase price of bestsellers. (author's abstract)
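This is not the dissertation's algorithm, but a standard building block for the task it describes: mining product combinations with cross-selling potential from recorded transaction data, here with the R package arules and its bundled Groceries transactions; the thresholds are arbitrary example values.

```r
## Sketch: association rules as candidates for targeted cross-selling offers
library(arules)
data("Groceries", package = "arules")   # example supermarket transaction database

rules <- apriori(Groceries,
                 parameter = list(support = 0.005, confidence = 0.3, minlen = 2))

## product combinations with the strongest cross-selling signal (highest lift)
inspect(sort(rules, by = "lift")[1:5])
```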
|
6 |
Efficiency Analysis of European Freight Villages: Three Peers for Benchmarking. Yang, Congcong; Taudes, Alfred; Dong, Guozhi. January 2015 (PDF)
Measuring the performance of Freight Villages (FVs) has important implications for logistics companies and other related companies as well as governments. In this paper we apply Data Envelopment Analysis (DEA) to measure the performance of European FVs in a purely data-driven way, incorporating the nature of FVs as complex operations that use multiple inputs and produce several outputs. We employ several DEA models and perform a complete sensitivity analysis of the appropriateness of the chosen input and output variables, as well as an assessment of the robustness of the efficiency scores. It turns out that about half of the 20 FVs analyzed are inefficient, with utilization of the intermodal area and warehouse capacity and the level of goods handled being the most important areas for improvement. While we find no significant differences in efficiency between FVs of different sizes and in different countries, it turns out that the FVs Eurocentre Toulouse, Interporto Quadrante Europa and GVZ Nürnberg constitute more than 90% of the benchmark share. / Series: Working Papers on Information Systems, Information Business and Operations
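A hedged sketch of an input-oriented DEA run with the R package Benchmarking; the input/output matrices below are randomly generated stand-ins, not the paper's FV data, and the variable interpretations in the comments are assumptions.

```r
## Sketch: DEA efficiency scores and benchmark peers for 20 decision-making units
library(Benchmarking)

set.seed(1)
X <- matrix(runif(20 * 3, 10, 100), ncol = 3)   # inputs (e.g. intermodal area,
                                                #  warehouse capacity, staff)
Y <- matrix(runif(20 * 2, 10, 100), ncol = 2)   # outputs (e.g. goods handled, revenue)

e <- dea(X, Y, RTS = "vrs", ORIENTATION = "in") # variable returns to scale, input-oriented
eff(e)     # efficiency score of each unit
peers(e)   # efficient units serving as benchmarks for the inefficient ones
```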
|
7 |
Essays on Modern Econometrics and Machine Learning. Keilbar, Georg. 16 June 2022
This thesis focuses on different aspects of the union of modern econometrics and machine learning. Chapter 2 considers a new estimator of the regression parameters in a panel data model with unobservable interactive fixed effects. A distinctive feature of the proposed approach is to model the factor loadings as nonparametric functions. We show that our estimator is root-NT-consistent and asymptotically normal, and that it reaches the semiparametric efficiency bound under the assumption of i.i.d. errors. Chapter 3 is concerned with the recursive estimation of quantiles using the stochastic gradient descent (SGD) algorithm with Polyak-Ruppert averaging. The algorithm offers a computationally and memory-efficient alternative to the usual empirical estimator. Our focus is on studying the nonasymptotic behavior by providing exponentially decreasing tail probability bounds under minimal assumptions. In Chapter 4 we propose a novel approach to calibrate the conditional value-at-risk (CoVaR) of financial institutions based on neural network quantile regression. We model systemic risk spillover effects in a network context across banks by considering the marginal effects of the quantile regression procedure. An out-of-sample analysis shows a clear improvement over a linear baseline specification, signifying the importance that nonlinearity plays for modelling systemic risk. A comparison to existing network-based risk measures reveals that our approach offers a new perspective on systemic risk. In Chapter 5 we aim to model the joint dynamics of cryptocurrencies in a nonstationary setting. In particular, we analyze the role of cointegration relationships within a large system of cryptocurrencies in a vector error correction model (VECM) framework. To enable analysis in a dynamic setting, we propose the COINtensity VECM, a nonlinear VECM specification accounting for a varying system-wide cointegration exposure.
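A minimal base-R sketch of the recursive quantile estimation idea from Chapter 3 (not the thesis' code): an SGD update on the quantile check-loss subgradient combined with a running Polyak-Ruppert average; the step-size constants and the simulated sample are illustrative choices.

```r
## Sketch: recursive quantile estimation via SGD with Polyak-Ruppert averaging
sgd_quantile <- function(x, tau = 0.5, c0 = 1, alpha = 0.75) {
  theta <- x[1]            # starting value from the first observation
  avg   <- theta           # running Polyak-Ruppert average
  for (t in seq_along(x)[-1]) {
    gamma <- c0 * t^(-alpha)                           # slowly decaying step size
    theta <- theta - gamma * ((x[t] <= theta) - tau)   # stochastic (sub)gradient step
    avg   <- avg + (theta - avg) / t                   # update the running average
  }
  c(last_iterate = theta, pr_average = avg)
}

set.seed(1)
x <- rnorm(1e5)
sgd_quantile(x, tau = 0.95)
quantile(x, 0.95)   # compare with the usual empirical estimator
```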
|
8 |
Functional data analysis with applications in finance. Benko, Michal. 26 January 2007
In many different fields of applied statistics an object of interest depends on some continuous parameter. Typical examples in finance are implied volatility functions, yield curves or risk-neutral densities. Due to different market conventions and further technical reasons, these objects are observable only on a discrete grid, e.g. on a grid of strikes and maturities for which trades have been settled at a given time point. By collecting these functions for several time points (e.g. days) or for different underlyings, a sample of functions is obtained - a functional data set. The first topic considered in this thesis concerns strategies for recovering the functional objects (e.g. the implied volatility function) from the observed data using nonparametric smoothing methods. Besides the standard smoothing methods, a procedure based on a combination of nonparametric smoothing and results from no-arbitrage theory is proposed for implied volatility smoothing. The second part of the thesis is devoted to functional data analysis (FDA) and its connection to problems arising in the empirical analysis of financial markets. The theoretical part of the thesis focuses on functional principal component analysis -- the functional counterpart of the well-known multivariate dimension-reduction technique. A comprehensive overview of the existing methods is given, and an estimation method motivated by the dual problem as well as two-sample inference based on functional principal component analysis are discussed.
The FDA techniques are applied to the analysis of implied volatility and yield curve dynamics. In addition, the implementation of the FDA techniques, together with an FDA library for the statistical environment XploRe, is presented.
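A simplified base-R sketch of functional principal component analysis for curves observed on a common grid (the simulated curves below stand in for preprocessed implied volatility or yield curves; real data would first be smoothed onto the grid).

```r
## Sketch: FPCA via eigen-decomposition of the discretized sample covariance
set.seed(1)
grid <- seq(0, 1, length.out = 50)
n    <- 100
## simulate curves with two modes of variation plus noise
sc     <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1))
curves <- sc[, 1] %o% sin(2 * pi * grid) +
          sc[, 2] %o% cos(2 * pi * grid) +
          matrix(rnorm(n * 50, sd = 0.1), n, 50)

mu   <- colMeans(curves)
Chat <- crossprod(sweep(curves, 2, mu)) / (n - 1)   # covariance on the grid
eig  <- eigen(Chat, symmetric = TRUE)

pc_functions <- eig$vectors[, 1:2]                  # discretized eigenfunctions
pc_scores    <- sweep(curves, 2, mu) %*% pc_functions
cumsum(eig$values)[1:3] / sum(eig$values)           # share of explained variation
```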
|
9 |
DirichletReg: Dirichlet Regression for Compositional Data in R. Maier, Marco J. 18 January 2014 (PDF)
Dirichlet regression models can be used to analyze a set of variables lying in a bounded interval that sum up to a constant (e.g., proportions, rates, compositions, etc.) and exhibit skewness and heteroscedasticity, without having to transform the data. There are two parametrizations for the presented model: one uses the Dirichlet distribution's common alpha parameters, the other is a reparametrization of the alphas that sets up a mean-and-dispersion-like model. By applying appropriate link functions, a GLM-like framework is set up that allows for the analysis of such data in a straightforward and familiar way, because the interpretation is similar to multinomial logistic regression. This paper gives a brief theoretical foundation and describes the implementation as well as the application (including worked examples) of Dirichlet regression methods implemented in the package DirichletReg (Maier, 2013) in the R language (R Core Team, 2013). (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
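A brief sketch of both parametrizations using the ArcticLake compositional data shipped with DirichletReg; the choice of depth as the sole covariate follows the package's example and is illustrative only.

```r
## Sketch: Dirichlet regression in the common and the alternative parametrization
library(DirichletReg)

AL   <- ArcticLake
AL$Y <- DR_data(AL[, c("sand", "silt", "clay")])   # prepare the compositional response

## common parametrization: all alpha parameters modeled by depth
m1 <- DirichReg(Y ~ depth, data = AL, model = "common")

## alternative parametrization: mean components | precision (dispersion) part
m2 <- DirichReg(Y ~ depth | depth, data = AL, model = "alternative")

summary(m2)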
|
10 |
Flexible Regression for Different Types of Multivariate Functional Data. Volkmann, Alexander. 14 October 2024
In this thesis, novel regression approaches for multivariate longitudinal or functional data are developed which allow the covariate effects of interest to be modeled flexibly. The dependency within and between the different outcomes is modeled using latent multivariate Gaussian processes. The regression approaches adopt a two-step procedure where, in a preliminary step, multivariate functional principal component analyses are employed to generate parsimonious empirical bases for the Gaussian processes. Three different regression models are developed for different types of multivariate longitudinal or functional data. The first project establishes the two-step procedure for multivariate Gaussian functional data which can exhibit a crossed or nested multilevel structure. The regression model is embedded in the frequentist functional additive mixed model framework and is demonstrated by applications to movement data and speech production data. The second project develops a Bayesian regression framework for multilevel multivariate functional data that follow different pointwise distributions. This allows data of different types, such as binary, count, or continuous functional data, to be modeled simultaneously, which is illustrated by an application to Berlin traffic data. The third project combines multivariate longitudinal Gaussian data with a time-to-event outcome in a Bayesian joint modelling approach. Such models are often used in medical contexts where the main interest lies in estimating the association between longitudinal measurements of biomarkers and, for example, the survival of patients, as in the presented application to a chronic liver disease.
All projects are accompanied by simulation studies to assess the estimation accuracy and the models' limitations.
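A highly simplified base-R sketch of the two-step idea (no multilevel structure, no mixed model machinery): (1) an empirical basis from a principal component analysis of the stacked, discretized outcomes, (2) regression of the resulting scores on a covariate. Data, dimensions, and the single covariate are invented for illustration.

```r
## Sketch: two-step regression for (bivariate) functional outcomes
set.seed(1)
n <- 80; p <- 30                          # n subjects, p grid points per outcome
x <- rnorm(n)                             # a scalar covariate of interest
grid <- seq(0, 1, length.out = p)
y1 <- outer(x,  sin(2 * pi * grid)) + matrix(rnorm(n * p, sd = 0.2), n, p)
y2 <- outer(x, -cos(2 * pi * grid)) + matrix(rnorm(n * p, sd = 0.2), n, p)

## step 1: empirical basis from a PCA of the stacked, centered outcomes
Y  <- cbind(y1, y2)                       # one multivariate functional observation per row
Yc <- scale(Y, scale = FALSE)
ev <- eigen(crossprod(Yc) / (n - 1), symmetric = TRUE)
K  <- 2
scores <- Yc %*% ev$vectors[, 1:K]        # multivariate FPC scores

## step 2: model the scores (here: simple linear regressions on the covariate)
fits <- lapply(1:K, function(k) lm(scores[, k] ~ x))
lapply(fits, coef)
```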
|