11

Independence Screening in High-Dimensional Data

Wauters, John January 2016 (has links)
High-dimensional data, data in which the number of dimensions exceeds the number of observations, is increasingly common in statistics. The term "ultra-high dimensional" is defined by Fan and Lv (2008) as describing the situation where log(p) is of order O(n^a) for some a in the interval (0, ½). It arises in many contexts such as gene expression data, proteomic data, imaging data, tomography, and finance, as well as others. High-dimensional data present a challenge to traditional statistical techniques. In traditional statistical settings, models have a small number of features, chosen based on an assumption of what features may be relevant to the response of interest. In the high-dimensional setting, many of the techniques of traditional feature selection become computationally intractable or do not yield unique solutions. Current research in modeling high-dimensional data is heavily focused on methods that screen the features before modeling; that is, methods that eliminate noise features as a pre-modeling dimension reduction. Typically, noise features are identified by exploiting properties of independent random variables, hence the term "independence screening." There are methods for modeling high-dimensional data without screening features first (e.g. LASSO or SCAD), but simulation studies show that screen-first methods perform better as dimensionality increases. Many proposals for independence screening exist, but in my literature review certain themes recurred: A) The assumption of sparsity: all the useful information in the data is actually contained in a small fraction of the features (the "active" features), the rest being essentially random noise (the "inactive" features). B) In many newer methods, initial dimension reduction by feature screening reduces the problem from the high-dimensional case to a classical case; feature selection then proceeds by a classical method. C) In the initial screening, removal of features independent of the response is highly desirable, as such features give no information about the response. D) For the initial screening, some statistic is applied pairwise to each feature in combination with the response; the statistic is chosen so that, when the two random variables are independent, a specific known value is expected for it. E) Features are ranked by the absolute difference between the calculated statistic and its expected value under independence, i.e. features that are most different from the independent case are most preferred. F) Proof is typically offered that, asymptotically, the method retains the true active features with probability approaching one. G) Where possible, an iterative version of the process is explored, as iterative versions do much better at identifying features that are active in their interactions but not active individually.
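As a hedged illustration of points D) and E), the sketch below ranks features by their absolute marginal Pearson correlation with the response, in the spirit of the sure independence screening (SIS) of Fan and Lv (2008); the correlation statistic, cutoff, and toy data are illustrative placeholders rather than any specific procedure examined in the thesis.

```python
import numpy as np

def sis_screen(X, y, keep):
    """Rank features by |marginal correlation| with y and keep the top `keep`.

    A minimal sure-independence-screening sketch: under independence the
    expected correlation is 0, so features farthest from 0 are preferred.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Marginal Pearson correlation of each column of X with y.
    corr = (Xc.T @ yc) / (np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12)
    ranked = np.argsort(-np.abs(corr))
    return ranked[:keep]

# Toy example: p >> n, only the first 3 features are active.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] + rng.standard_normal(n)
active = sis_screen(X, y, keep=int(n / np.log(n)))  # a common heuristic cutoff
print(sorted(active[:10]))
```

In this toy example only the three truly active features carry signal, so they typically appear at the top of the ranking.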
12

Some statistical methods for dimension reduction

Al-Kenani, Ali J. Kadhim January 2013 (has links)
The aim of the work in this thesis is to carry out dimension reduction (DR) for high dimensional (HD) data by using statistical methods for variable selection, feature extraction and a combination of the two. In Chapter 2, the DR is carried out through robust feature extraction. Robust canonical correlation (RCCA) methods are proposed. In the correlation matrix of canonical correlation analysis (CCA), we suggest that the Pearson correlation be substituted by robust correlation measures in order to obtain robust correlation matrices. These matrices are then employed to produce RCCA. Moreover, the classical covariance matrix is substituted by robust estimators of multivariate location and dispersion in order to obtain RCCA. In Chapters 3 and 4, the DR is carried out by combining the ideas of variable selection using regularisation methods with feature extraction, through the minimum average variance estimator (MAVE) and single index quantile regression (SIQ) methods, respectively. In particular, we extend the sparse MAVE (SMAVE) of Wang and Yin (2008) by combining the MAVE loss function with different regularisation penalties in Chapter 3. An extension of the SIQ of Wu et al. (2010) by considering different regularisation penalties is proposed in Chapter 4. In Chapter 5, the DR is done through variable selection in a Bayesian framework. A flexible Bayesian framework for regularisation in the quantile regression (QR) model is proposed. This work differs from Bayesian Lasso quantile regression (BLQR), which employs the asymmetric Laplace distribution (ALD) for the errors; here the error distribution is instead assumed to be an infinite mixture of Gaussian (IMG) densities.
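As a hedged sketch of the robust-correlation idea in Chapter 2, the example below swaps Pearson correlation for Spearman's rank correlation inside the CCA correlation matrix; the actual robust measures and robust location/dispersion estimators studied in the thesis may differ, so this is only an illustrative stand-in.

```python
import numpy as np
from scipy.stats import spearmanr

def robust_cca(X, Y):
    """Canonical correlations from a rank-based (Spearman) correlation matrix.

    A sketch of "replace Pearson with a robust correlation" in CCA;
    Spearman's rho is only one of several possible robust choices.
    """
    p = X.shape[1]
    R, _ = spearmanr(np.hstack([X, Y]))          # (p+q) x (p+q) rank correlation
    Rxx, Ryy, Rxy = R[:p, :p], R[p:, p:], R[:p, p:]

    def inv_sqrt(M):
        w, V = np.linalg.eigh(M)
        return V @ np.diag(1.0 / np.sqrt(np.clip(w, 1e-10, None))) @ V.T

    # Whiten each block; singular values are the canonical correlations.
    K = inv_sqrt(Rxx) @ Rxy @ inv_sqrt(Ryy)
    return np.linalg.svd(K, compute_uv=False)

rng = np.random.default_rng(1)
Z = rng.standard_normal((200, 2))
X = Z @ rng.standard_normal((2, 4)) + 0.5 * rng.standard_normal((200, 4))
Y = Z @ rng.standard_normal((2, 3)) + 0.5 * rng.standard_normal((200, 3))
print(robust_cca(X, Y))   # canonical correlations, largest first
```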
13

Spectral Filtering for Spatio-temporal Dynamics and Multivariate Forecasts

Meng, Lu January 2016 (has links)
Due to the increasing availability of massive spatio-temporal data sets, modeling high dimensional data becomes quite challenging. A large number of research questions are rooted in identifying underlying dynamics in such spatio-temporal data. For many applications, the science suggests that the intrinsic dynamics are smooth and of low dimension. To reduce the variance of estimates and increase computational tractability, dimension reduction is also necessary in the modeling procedure. In this dissertation, we propose a spectral filtering approach for dimension reduction and forecast amelioration, and apply it to multiple applications. We show the effectiveness of dimension reduction via our method and also illustrate its power for prediction in both simulation and real data examples. The resultant lower dimensional principal component series has a diagonal spectral density at each frequency whose diagonal elements are in descending order, which is not well motivated and can be hard to interpret. Therefore we propose a phase-based filtering method to create principal component series with interpretable dynamics in the time domain. Our method is based on structural decomposition and phase-aligned construction in the frequency domain, identifying lower-rank dynamics and their components embedded in a high dimensional spatio-temporal system. In both our simulated examples and real data applications, we illustrate that the proposed method is able to separate and identify meaningful lower-rank movements. Benefiting from the zero-coherence property of the principal component series, we subsequently develop a predictive model for high-dimensional forecasting via lower-rank dynamics. Our modeling approach reduces the multivariate modeling task to multiple univariate modeling tasks and can be flexibly combined with regularization techniques to obtain more stable estimates and improve interpretability. The simulation results and real data analysis show that our model achieves superior forecast performance compared to the class of autoregressive models.
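A simplistic sketch of frequency-domain principal components (in the spirit of classical dynamic PCA, not the phase-based filtering developed in the dissertation) is given below; the periodogram smoothing span and toy data are assumptions for illustration.

```python
import numpy as np

def frequency_domain_pca(X, span=4):
    """Eigendecompose a smoothed spectral density matrix at each Fourier frequency.

    Returns the eigenvalues in descending order per frequency, mirroring the
    "diagonal spectral density with descending diagonal" property described above.
    """
    T, p = X.shape
    Xc = X - X.mean(axis=0)
    D = np.fft.rfft(Xc, axis=0)                                 # DFT of each series
    I = np.einsum('fi,fj->fij', D, D.conj()) / (2 * np.pi * T)  # cross-periodograms
    F = I.shape[0]
    eigvals = np.empty((F, p))
    for f in range(F):
        lo, hi = max(0, f - span), min(F, f + span + 1)
        S_f = I[lo:hi].mean(axis=0)                 # smoothed spectral density matrix
        eigvals[f] = np.linalg.eigvalsh(S_f)[::-1]  # descending eigenvalues
    return eigvals

rng = np.random.default_rng(2)
t = np.arange(500)
common = np.sin(2 * np.pi * t / 25)                 # one shared low-rank cycle
X = np.outer(common, rng.standard_normal(6)) + 0.3 * rng.standard_normal((500, 6))
ev = frequency_domain_pca(X)
print(ev.mean(axis=0))   # the first eigenvalue dominates across frequencies
```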
14

Structural Reformulations in System Identification

Lyzell, Christian January 2012 (has links)
In system identification, the choice of model structure is important and it is sometimes desirable to use a flexible model structure that is able to approximate a wide range of systems. One such model structure is the Wiener class of systems, that is, systems where the input enters a linear time-invariant subsystem followed by a time-invariant nonlinearity. Given a sequence of input and output pairs, the system identification problem is often formulated as the minimization of the mean-square prediction error. Here, the prediction error has a nonlinear dependence on the parameters of the linear subsystem and the nonlinearity. Unfortunately, this formulation of the estimation problem is often nonconvex, with several local minima, and it is therefore difficult to guarantee that a local search algorithm will be able to find the global optimum. In the first part of this thesis, we consider the application of dimension reduction methods to the problem of estimating the impulse response of the linear part of a system in the Wiener class. For example, by applying the inverse regression approach to dimension reduction, the impulse response estimation problem can be cast as a principal components problem, where the reformulation is based on simple nonparametric estimates of certain conditional moments. The inverse regression approach can be shown to be consistent under restrictions on the distribution of the input signal, provided that the true linear subsystem has a finite impulse response. Furthermore, a forward approach to dimension reduction is also considered, where the time-invariant nonlinearity is approximated by a local linear model. In this setting, the impulse response estimation problem can be posed as a rank-reduced linear least-squares problem and a convex relaxation can be derived. Thereafter, we consider the extension of the subspace identification approach to include linear time-invariant rational models. It turns out that only minor structural modifications are needed and already available implementations can be used. Furthermore, other a priori information regarding the structure of the system can be incorporated, including a certain class of linear gray-box structures. The proposed extension is not restricted to the discrete-time case and can also be used to estimate continuous-time models. The final topic in this thesis is the estimation of discrete-time models containing polynomial nonlinearities. In the continuous-time case, a constructive algorithm based on differential algebra has previously been used to prove that such model structures are globally identifiable if and only if they can be written as a linear regression model. Thus, if we are able to transform the nonlinear model structure into a linear regression model, the parameter estimation problem can be solved with standard methods. Motivated by the above and the fact that most system identification problems involve sampled data, a discrete-time version of the algorithm is developed. This algorithm is closely related to the continuous-time version and enables the handling of noise signals without differentiation.
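To make the inverse-regression reformulation concrete, here is a hedged sketch for a Wiener system with a finite impulse response: the output is sliced, the lagged-input regressor is averaged within each slice, and the leading eigenvector of the slice-mean scatter recovers the impulse response up to scale; the slicing rule, lag length, and simulated system are illustrative assumptions rather than the estimators analyzed in the thesis.

```python
import numpy as np

def sir_impulse_response(u, y, n_lags=10, n_slices=10):
    """Estimate the FIR impulse-response direction of a Wiener system by
    sliced inverse regression: PCA of the slice means of the lagged input."""
    # Lagged-input regressor matrix: row t contains u[t], u[t-1], ..., u[t-n_lags+1].
    T = len(u)
    Phi = np.column_stack([u[n_lags - 1 - k:T - k] for k in range(n_lags)])
    y_al = y[n_lags - 1:]
    Phi = Phi - Phi.mean(axis=0)
    # Slice the response by quantiles and average the regressor within each slice.
    edges = np.quantile(y_al, np.linspace(0, 1, n_slices + 1))
    M = np.zeros((n_lags, n_lags))
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = (y_al >= lo) & (y_al <= hi)
        if idx.any():
            m = Phi[idx].mean(axis=0)
            M += idx.mean() * np.outer(m, m)
    # Leading eigenvector ~ impulse response up to scale (for white Gaussian input).
    _, vecs = np.linalg.eigh(M)
    return vecs[:, -1]

rng = np.random.default_rng(3)
u = rng.standard_normal(5000)
b = np.array([1.0, 0.8, 0.5, 0.2, 0.1, 0, 0, 0, 0, 0])      # true FIR filter
lin = np.convolve(u, b)[:len(u)]
y = np.tanh(lin) + 0.05 * rng.standard_normal(len(u))        # static nonlinearity
est = sir_impulse_response(u, y)
print(np.round(est / est[0] * b[0], 2))                      # compare shape to b
```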
15

A clustering scheme for large high-dimensional document datasets

Chen, Jing-wen 09 August 2007 (has links)
People pay more and more attention to document clustering methods. Because of the high dimensionality and the large amount of data, clustering methods usually take a long time to compute. We propose a scheme that makes the clustering algorithm much faster than the original. We partition the whole dataset into several parts. First, we use one of these parts for clustering. Then, according to the labels obtained after clustering, we reduce the number of features by a certain ratio. We add another part of the data, convert it to the lower dimension, and cluster again. This is repeated until all partitions are used. According to the experimental results, this scheme may run twice as fast as the original clustering method.
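The abstract does not spell out how the features are reduced after each pass, so the sketch below is only one plausible reading: cluster the data seen so far with k-means, score features by an ANOVA F-statistic against the cluster labels, keep a fixed fraction, and continue with the next partition; the clusterer, scoring rule, and keep ratio are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import SelectKBest, f_classif

def incremental_cluster(partitions, n_clusters=5, keep_ratio=0.6):
    """Cluster partition by partition, shrinking the feature set after each pass."""
    kept = np.arange(partitions[0].shape[1])        # indices of surviving features
    seen, labels = None, None
    for part in partitions:
        seen = part if seen is None else np.vstack([seen, part])
        X = seen[:, kept]                           # project onto current features
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
        k = max(1, int(len(kept) * keep_ratio))     # shrink by the fixed ratio
        selector = SelectKBest(f_classif, k=k).fit(X, labels)
        kept = kept[selector.get_support()]
    return labels, kept

rng = np.random.default_rng(4)
docs = rng.random((600, 300))                       # toy "document" feature matrix
parts = np.array_split(docs, 4)
labels, kept = incremental_cluster(parts)
print(len(kept), np.bincount(labels))
```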
16

Nonparametric Bayesian Models for Supervised Dimension Reduction and Regression

Mao, Kai January 2009 (has links)
We propose nonparametric Bayesian models for supervised dimension reduction and regression problems. Supervised dimension reduction is a setting where one needs to reduce the dimensionality of the predictors or find the dimension reduction subspace while losing little or no predictive information. Our first method retrieves the dimension reduction subspace in the inverse regression framework by utilizing a dependent Dirichlet process that allows for natural clustering of the data in terms of both the response and predictor variables. Our second method is based on ideas from the gradient learning framework and retrieves the dimension reduction subspace through coherent nonparametric Bayesian kernel models. We also discuss and provide a new rationalization of kernel regression based on nonparametric Bayesian models, allowing for direct and formal inference on the uncertain regression functions. Our proposed models apply to high dimensional cases where the number of variables far exceeds the sample size, and hold for both the classical setting of Euclidean subspaces and the Riemannian setting where the marginal distribution is concentrated on a manifold. Our Bayesian perspective adds appropriate probabilistic and statistical frameworks that allow for rich inference such as uncertainty estimation, which is important for assessing the estimates. Formal probabilistic models with likelihoods and priors are given, and efficient posterior sampling can be obtained by Markov chain Monte Carlo methodologies, particularly Gibbs sampling schemes. For supervised dimension reduction, since the posterior draws are linear subspaces, which are points on a Grassmann manifold, we carry out the posterior inference with respect to geodesics on the Grassmannian. The utility of our approaches is illustrated on simulated and real examples. / Dissertation
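As a hedged aside on the final point, a geodesic notion of distance between two dimension reduction subspaces on the Grassmann manifold can be computed from principal angles; the sketch below assumes the subspaces are given as orthonormal basis matrices and is not taken from the dissertation's own posterior-inference code.

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance between span(A) and span(B) on the Grassmann manifold.

    A, B: p x d matrices whose columns form orthonormal bases of d-dimensional
    subspaces. The principal angles are arccos of the singular values of A'B,
    and the geodesic distance is the 2-norm of the angle vector.
    """
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))       # principal angles in [0, pi/2]
    return np.linalg.norm(theta)

rng = np.random.default_rng(5)
p, d = 10, 2
A, _ = np.linalg.qr(rng.standard_normal((p, d)))
B, _ = np.linalg.qr(rng.standard_normal((p, d)))
print(grassmann_distance(A, A))   # 0.0 for identical subspaces
print(grassmann_distance(A, B))   # positive for distinct subspaces
```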
17

PROBABILISTIC PREDICTION USING EMBEDDED RANDOM PROJECTIONS OF HIGH DIMENSIONAL DATA

Kurwitz, Richard C. 2009 May 1900 (has links)
The explosive growth of digital data collection and processing demands a new approach to the historical engineering methods of data correlation and model creation. A new prediction methodology based on high dimensional data has been developed. Since most high dimensional data resides on a low dimensional manifold, the new prediction methodology is one of dimension reduction with embedding into a diffusion space that allows optimal distribution along the manifold. The resulting data manifold space is then used to produce a probability density function which uses spatial weighting to influence predictions, i.e. data nearer the query have greater importance than data farther away. The methodology also allows data of differing phenomenology, e.g. color, shape, temperature, etc., to be handled by regression or clustering classification. The new methodology is first developed and validated, then applied to common engineering situations, such as critical heat flux prediction and shuttle pitch angle determination. A number of illustrative examples are given, with a significant focus placed on the objective identification of two-phase flow regimes. It is shown that the new methodology is robust, producing accurate predictions with even a small number of data points in the diffusion space, as well as flexible in its ability to handle a wide range of engineering problems.
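A hedged sketch of the embedding step follows: Gaussian affinities over the data are row-normalized into a Markov transition matrix whose leading non-trivial eigenvectors, scaled by their eigenvalues, give diffusion coordinates; the bandwidth heuristic, number of coordinates, and toy data are placeholders rather than the choices made in the dissertation.

```python
import numpy as np

def diffusion_map(X, n_coords=2, epsilon=None, t=1):
    """Embed rows of X into diffusion coordinates (a minimal diffusion-map sketch)."""
    # Pairwise squared distances and Gaussian affinities.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    if epsilon is None:
        epsilon = np.median(sq)                      # heuristic bandwidth
    K = np.exp(-sq / epsilon)
    d = K.sum(axis=1)                                # degrees of the affinity graph
    # Eigendecompose the symmetric conjugate of the transition matrix D^-1 K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(-vals)
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]                 # right eigenvectors of D^-1 K
    # Skip the trivial first eigenvector; weight by eigenvalue^t.
    return psi[:, 1:n_coords + 1] * (vals[1:n_coords + 1] ** t)

rng = np.random.default_rng(6)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.standard_normal((200, 2))
coords = diffusion_map(X)                            # noisy circle recovered in 2 coords
print(coords.shape)
```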
18

Visual hierarchical dimension reduction

Yang, Jing. January 2002 (has links)
Thesis (M.S.)--Worcester Polytechnic Institute. / Keywords: hierarchy; sunburst; dimension reduction; high dimensional data set; multidimensional visualization; parallel coordinates; scatterplot matrices; star glyphs. Includes bibliographical references (p. 86-91).
19

Dimension Reduction for Model-based Clustering via Mixtures of Multivariate t-Distributions

Morris, Katherine 21 August 2012 (has links)
We introduce a dimension reduction method for model-based clustering obtained from a finite mixture of t-distributions. This approach is based on existing work on reducing dimensionality in the case of finite Gaussian mixtures. The method relies on identifying a reduced subspace of the data by considering how much group means and group covariances vary. This subspace contains linear combinations of the original data, which are ordered by importance via the associated eigenvalues. Observations can be projected onto the subspace and the resulting set of variables captures most of the clustering structure available in the data. The approach is illustrated using simulated and real data. / Paul McNicholas
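Loosely illustrating the "how much group means vary" part of the idea, the sketch below finds ordered directions by solving a generalized eigenproblem of the between-group scatter of means against the overall covariance; the thesis additionally uses group covariances and mixtures of multivariate t-distributions, which this simplified sketch omits.

```python
import numpy as np
from scipy.linalg import eigh

def mean_based_directions(X, labels, n_dirs=2):
    """Directions along which group means vary most, relative to the overall covariance."""
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    M = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(labels):
        diff = X[labels == g].mean(axis=0) - mu
        M += (labels == g).mean() * np.outer(diff, diff)   # weighted between-group scatter
    # Generalized symmetric eigenproblem  M v = lambda Sigma v, largest eigenvalues first.
    vals, vecs = eigh(M, Sigma)
    order = np.argsort(-vals)
    return vecs[:, order[:n_dirs]], vals[order[:n_dirs]]

rng = np.random.default_rng(7)
means = np.array([[0, 0, 0, 0], [4, 0, 0, 0], [2, 3, 0, 0]], float)
labels = np.repeat([0, 1, 2], 100)
X = means[labels] + rng.standard_normal((300, 4))
V, ev = mean_based_directions(X, labels)
print(np.round(V, 2))   # the clustering structure lives in the first two coordinates
```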
20

Adapting Component Analysis

Dorri, Fatemeh January 2012 (has links)
A main problem in machine learning is to predict the response variables of a test set given the training data and its corresponding response variables. A predictive model can perform satisfactorily only if the training data is an appropriate representative of the test data. This intuition is reflected in the assumption that the training data and the test data are drawn from the same underlying distribution. However, this assumption may not hold in many applications, for various reasons. For example, gathering training data from the test population might not be feasible, due to expense or rarity. Or factors such as time, place, and weather can cause the distributions to differ. I propose a method based on kernel distribution embedding and the Hilbert-Schmidt Independence Criterion (HSIC) to address this problem. The proposed method explores a new representation of the data in a new feature space with two properties: (i) the distributions of the training and the test data sets are as close as possible in the new feature space, and (ii) the important structural information of the data is preserved. The algorithm can reduce the dimensionality of the data while preserving these properties, and it can therefore be seen as a dimensionality reduction method as well. Our method has a closed-form solution, and experimental results on various data sets show that it works well in practice.
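As a hedged illustration of one ingredient, a biased empirical HSIC estimate between two samples can be computed as trace(KHLH)/(n-1)^2 with a centring matrix H and kernel Gram matrices K and L; the Gaussian kernels, bandwidths, and toy data below are placeholders, not the objective actually optimized in the thesis.

```python
import numpy as np

def rbf_gram(X, sigma=1.0):
    """Gaussian (RBF) kernel Gram matrix."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical Hilbert-Schmidt Independence Criterion between X and Y."""
    n = X.shape[0]
    K, L = rbf_gram(X, sigma), rbf_gram(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n              # centring matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(8)
X = rng.standard_normal((200, 1))
Y_dep = X ** 2 + 0.1 * rng.standard_normal((200, 1))   # nonlinearly dependent
Y_ind = rng.standard_normal((200, 1))                    # independent
print(hsic(X, Y_dep), hsic(X, Y_ind))   # the dependent pair gives the larger value
```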
