1

Spatially Correlated Model Selection (SCOMS)

Velasco-Cruz, Ciro 31 May 2012
This dissertation develops a variable selection method for spatial data. The spatial process is assumed to be non-stationary as a whole but piecewise stationary; the pieces on which the process is stationary are called regions. The variable selection approach accounts for two sources of correlation: (1) the spatial correlation of the data within regions, and (2) the correlation between adjacent regions. Variable selection is carried out by including indicator variables that characterize the significance of the regression coefficients. An Ising distribution, used as the prior for the vector of indicator variables, models the dependence between adjacent regions. We present a case study on brook trout data in which the response of interest is the presence or absence of the fish at sites in the eastern United States. The method outperforms probit regression in which the spatial field is assumed stationary and isotropic. It also outperforms the case in which multiple regions are assumed independent of their neighbors. / Ph. D.
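To illustrate how an Ising prior can tie together the inclusion indicators of adjacent regions, here is a minimal sketch. It is not the dissertation's implementation: the parameterization (a sparsity term `a` and an interaction term `b`), the function names, and the four-region chain layout are all assumptions chosen for illustration, with one indicator per region for a single covariate.

```python
import math

def ising_log_prior(gamma, edges, a, b):
    """Unnormalized log prior: a * sum(gamma_i) + b * sum over edges of gamma_i * gamma_j.

    One common Ising parameterization (an assumption here); gamma_i is 1 if the
    covariate is included in region i and 0 otherwise.
    """
    sparsity = a * sum(gamma)
    smoothing = b * sum(gamma[i] * gamma[j] for i, j in edges)
    return sparsity + smoothing

def inclusion_prob(i, gamma, edges, a, b):
    """Prior conditional P(gamma_i = 1 | all other indicators)."""
    on = list(gamma)
    on[i] = 1
    off = list(gamma)
    off[i] = 0
    diff = ising_log_prior(on, edges, a, b) - ising_log_prior(off, edges, a, b)
    return 1.0 / (1.0 + math.exp(-diff))

# Hypothetical layout: four regions arranged in a chain 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]

# Region 1 with no neighbor including the covariate.
p_isolated = inclusion_prob(1, [0, 0, 0, 0], edges, a=-1.0, b=0.8)

# Region 1 with both neighbors (regions 0 and 2) including the covariate.
p_supported = inclusion_prob(1, [1, 0, 1, 0], edges, a=-1.0, b=0.8)
```

With a positive interaction term, the prior probability of including the covariate in a region rises when neighboring regions already include it, which is the kind of between-region dependence the abstract describes. In a full sampler this prior conditional would be combined with the likelihood; that step is omitted here.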
2

Statistical Analysis of Structured High-dimensional Data

Sun, Yizhi 05 October 2018
High-dimensional data, such as multi-modal neuroimaging data and large-scale networks, carry a large amount of information and can be used to test various scientific hypotheses or to discover important patterns in complicated systems. While considerable effort has been made to analyze high-dimensional data, existing approaches often rely on simple summaries that can miss important information, and many challenges in modeling complex structures in data remain unaddressed. In this dissertation, we focus on analyzing structured high-dimensional data, including functional data with important local regions and network data with community structures. The first part of this dissertation concerns the detection of “important” regions in functional data. We propose a novel Bayesian approach that enables region selection in the functional data regression framework. Regions are selected by encouraging sparse estimation of the regression coefficient function, where the nonzero regions are the selected regions. To achieve sparse estimation, we adopt a compactly supported and potentially over-complete basis to capture local features of the regression coefficient function, and place a spike-and-slab prior on the coefficients of the basis functions. To encourage continuous shrinkage of nearby regions, we assume an Ising hyper-prior that takes into account the neighboring structure of the basis functions. This neighboring structure is represented by an undirected graph. We perform posterior sampling through Markov chain Monte Carlo algorithms. The practical performance of the proposed approach is demonstrated through simulations as well as near-infrared and sonar data. The second part of this dissertation focuses on constructing diversified portfolios using stock return data in the Center for Research in Security Prices (CRSP) database maintained by the University of Chicago.
Diversification is a risk management strategy that involves mixing a variety of financial assets in a portfolio. This strategy helps reduce the overall risk of the investment and improve the performance of the portfolio. To construct portfolios that effectively diversify risks, we first construct a co-movement network using the correlations between stock returns over a training time period. Correlation characterizes the synchrony among stock returns and thus helps us understand whether two or more stocks have common risk attributes. Based on the co-movement network, we apply multiple network community detection algorithms to detect groups of stocks with common co-movement patterns. Stocks within the same community tend to be highly correlated, while stocks across different communities tend to be less correlated. A portfolio is then constructed by selecting stocks from different communities. Finally, the average return of the constructed portfolio over a testing time period is compared with the S&P 500 market index. Our constructed portfolios demonstrate outstanding performance during a non-crisis period (2004-2006) and good performance during a financial crisis period (2008-2010). / PHD / High-dimensional data, which are composed of data points with a tremendous number of features (a.k.a. attributes, independent variables, explanatory variables), bring challenges to statistical analysis due to their “high-dimensionality” and complicated structure. In this dissertation work, I consider two types of high-dimensional data. The first type is functional data, in which each observation is a function. The second type is network data, whose internal structure can be described as a network. I aim to detect “important” regions in functional data by using a novel statistical model, and I treat stock market data as network data to construct quality portfolios efficiently.
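The portfolio pipeline described above (correlations, co-movement network, communities, one pick per community) can be sketched roughly as follows. This is an illustration under stated assumptions, not the dissertation's method: the abstract does not name its community detection algorithms, so connected components of a thresholded correlation graph stand in for them here; the ticker names, return values, threshold, and function names are all made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def comovement_communities(returns, threshold):
    """Link stocks whose correlation exceeds the threshold, then group linked stocks.

    Connected components are a crude stand-in for a real community
    detection algorithm (an assumption of this sketch).
    """
    names = sorted(returns)
    neighbors = {name: set() for name in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if pearson(returns[u], returns[v]) > threshold:
                neighbors[u].add(v)
                neighbors[v].add(u)

    communities, seen = [], set()
    for start in names:
        if start in seen:
            continue
        group, stack = [], [start]
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            group.append(node)
            for nxt in sorted(neighbors[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        communities.append(sorted(group))
    return communities

def pick_portfolio(communities):
    """Take one stock (the first alphabetically) from each community."""
    return [group[0] for group in communities]

# Toy training-period returns for four hypothetical tickers (made-up numbers).
returns = {
    "AAA": [0.01, -0.02, 0.03, 0.00, 0.02],
    "BBB": [0.02, -0.03, 0.04, 0.01, 0.03],
    "CCC": [-0.01, 0.02, -0.02, 0.01, -0.03],
    "DDD": [-0.02, 0.03, -0.03, 0.00, -0.02],
}
communities = comovement_communities(returns, threshold=0.5)
portfolio = pick_portfolio(communities)
```

In this toy input, AAA and BBB move together, as do CCC and DDD, so the sketch groups them into two communities and picks one stock from each. Choosing the alphabetically first member is arbitrary; a real selection rule and the comparison against a market index over a testing period are left out.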
