This thesis covers four topics: i) Measuring dependence in time series through distance covariance; ii) Testing goodness-of-fit of time series models; iii) Threshold selection for multivariate heavy-tailed data; and iv) Inference for linear preferential attachment networks.
Topic i) studies a dependence measure based on characteristic functions, called distance covariance, in time series settings. Distance covariance has recently gained popularity for its ability to detect nonlinear dependence. In particular, we characterize a general family of such dependence measures and use them to measure lagged serial and cross dependence in stationary time series. Assuming strong mixing, we establish the relevant asymptotic theory for the sample auto- and cross-distance correlation functions.
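As an illustration of the sample auto-distance correlation, the following minimal sketch uses the classical Euclidean-distance form of the sample distance covariance (double-centered pairwise distance matrices). This is only one member of the general family considered in the thesis, and the function names `dcov_sq` and `auto_dcor` are chosen here for illustration, not taken from the thesis.

```python
import numpy as np


def _centered_dist(x):
    """Pairwise Euclidean distance matrix of x, double-centered."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def dcov_sq(x, y):
    """Squared sample distance covariance (V-statistic form)."""
    return float((_centered_dist(x) * _centered_dist(y)).mean())


def auto_dcor(x, lag):
    """Sample auto-distance correlation of a series at a lag >= 1."""
    x = np.asarray(x, dtype=float)
    head, tail = x[:-lag], x[lag:]          # pairs (X_t, X_{t+lag})
    denom = np.sqrt(dcov_sq(head, head) * dcov_sq(tail, tail))
    if denom == 0:
        return 0.0                          # degenerate (constant) series
    return float(np.sqrt(max(dcov_sq(head, tail), 0.0) / denom))


# Hypothetical example: a series that is serially uncorrelated but dependent.
rng = np.random.default_rng(0)
z = rng.standard_normal(501)
x = z[1:] * z[:-1]                          # x_t = z_t * z_{t-1}
print([round(auto_dcor(x, h), 3) for h in (1, 2, 3)])
```

The pairwise distance matrices make this an O(n²) computation in time and memory, so the sketch is suited to series of moderate length only.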
Topic ii) proposes a goodness-of-fit test for general classes of time series models by applying the auto-distance covariance function (ADCV) to the fitted residuals. Under the correct model assumption, the limit distribution for the ADCV of the residuals differs from that of an i.i.d. sequence by a correction term. This adjustment has essentially the same form regardless of the model specification.
Topic iii) considers data in the multivariate regularly varying setting where the radial part $R$ is asymptotically independent of the angular part $\Theta$ as $R$ goes to infinity. The goal is to estimate the limiting distribution of $\Theta$ given $R\to\infty$, which characterizes the tail dependence of the data. A typical strategy is to look at the angular components of the data for which the radial parts exceed some threshold. We propose an algorithm to select the threshold based on distance covariance statistics and a subsampling scheme.
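The typical strategy described above, keeping the angular components of observations whose radial parts exceed a threshold, can be sketched as follows. The sketch assumes the Euclidean norm and an empirical-quantile threshold, and the function name `polar_exceedances` is hypothetical. It does not implement the threshold selection algorithm proposed in the thesis.

```python
import numpy as np


def polar_exceedances(x, quantile=0.95):
    """Split rows of x into radial and angular parts and keep exceedances.

    Assumes the Euclidean norm; the threshold is the empirical `quantile`
    of the radii. Returns (R / threshold, Theta, threshold) for the rows
    whose radius is strictly above the threshold.
    """
    x = np.asarray(x, dtype=float)
    r = np.linalg.norm(x, axis=1)           # radial parts R
    threshold = np.quantile(r, quantile)
    keep = r > threshold
    theta = x[keep] / r[keep, None]         # angular parts on the unit sphere
    return r[keep] / threshold, theta, threshold


# Hypothetical heavy-tailed sample: bivariate Student-t with 3 degrees of freedom.
rng = np.random.default_rng(0)
sample = rng.standard_t(3, size=(1000, 2))
r_scaled, theta, u = polar_exceedances(sample, quantile=0.95)
print(len(r_scaled), theta.shape)
```

The empirical distribution of the retained `theta` rows then serves as an estimate of the limiting angular distribution. How good that estimate is depends on the threshold, which is the choice the proposed algorithm addresses.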
Topic iv) investigates inference questions related to the linear preferential attachment model for network data. Preferential attachment is an appealing mechanism based on the intuition “the rich get richer” and produces the well-observed power-law behavior in networks. We provide methods for fitting such a model under two data scenarios: when the network formation is given, and when only a single-time snapshot of the network is observed.
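To make the "rich get richer" mechanism concrete, here is a minimal simulation sketch of a simplified, undirected linear preferential attachment rule: each new node attaches to one existing node chosen with probability proportional to its degree plus an offset `delta`. This simplified rule, the parameter name `delta`, and the function name `simulate_linear_pa` are assumptions made for illustration; the model specification in the thesis may differ.

```python
import numpy as np


def simulate_linear_pa(n_nodes, delta=1.0, seed=0):
    """Grow a tree by linear preferential attachment (simplified sketch).

    Requires n_nodes >= 2 and delta > -1 so that all weights are positive.
    Returns the edge list in order of arrival and the final degree vector.
    """
    rng = np.random.default_rng(seed)
    degree = np.zeros(n_nodes)
    edges = [(1, 0)]                        # start from a single edge
    degree[0] = degree[1] = 1
    for new in range(2, n_nodes):
        weights = degree[:new] + delta      # linear in the current degree
        target = int(rng.choice(new, p=weights / weights.sum()))
        edges.append((new, target))
        degree[new] += 1
        degree[target] += 1
    return edges, degree


edges, degree = simulate_linear_pa(2000, delta=1.0, seed=0)
print(int(degree.max()), float(np.mean(degree == 1)))
```

The ordered edge list corresponds to the scenario in which the network formation is given, while the degree vector alone corresponds to a single-time snapshot.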
Identifier | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8Q25GQB |
Date | January 2018 |
Creators | Wan, Phyllis |
Source Sets | Columbia University |
Language | English |
Detected Language | English |
Type | Theses |