1

On error bounds for linear feature extraction

Thangavelu, Madan Kumar. 2010 (has links)
Thesis (M.S.)--Oregon State University, 2010. / Printout. Includes bibliographical references (leaves 67-71). Also available on the World Wide Web.
2

Immunologically amplified knowledge and intentions dimensionality reduction in cooperative multi-agent systems

Coulter, Duncan Anthony 08 October 2014 (has links)
Ph.D. (Computer Science) / The development of software systems is a relatively recent field of human endeavour. Even so, it has followed a steady progression of dominant paradigms which have incrementally improved the ease with which developers are able to express the logic and structure of their systems. The initially unstructured era of free-form spaghetti code gave way to structured programming, in which the entry and exit points of functional units were well defined through the creation of abstractions such as procedures, sub-routines and functions. The problem of correctly associating data with the set of operations which are legal on this data was addressed through the concept of encapsulation with the onset of object-oriented programming. Object orientation also introduced a set of abstractions for safe code reuse through inheritance and dynamic polymorphism as well as composition/aggregation and delegation. The agent-oriented software development paradigm, when viewed as an extension of object orientation, adds the capacity of agent autonomy to an object, which allows it to select for itself which of its operations it will execute at any point in time. In addition, the separation between an agent and the environment within which it is embedded must be well defined. Agent autonomy allows for the modelling and development of loosely coupled systems with the capacity for complex emergent behaviour. The mapping of a given set of environmental percepts to an agent's operation selection defines its agent function and hence its emergent behaviour. Furthermore, agents may also be embedded into a shared environment together with other agents, forming a multi-agent system. The emergent characteristics of such systems are defined not only through changes in environment state but also via agent-to-agent interactions. Multi-agent systems are categorised as cooperative or competitive based on whether all the agents within the system share a common goal. An argument is presented that even within cooperative multi-agent systems, selfishness will emerge as a direct consequence of computational intractability. The core of the argument centres on the finite nature of the computational resources available to an agent, which must be divided between evaluating the usefulness of other agents' knowledge and intentions towards improving the collective utility of the system and directly acting upon its own. As a direct result of the halting problem, it is impossible for an agent to ascertain in general whether another agent's plans are even feasible (i.e. will result in the system reaching a goal state). As a direct consequence of such a limitation, agents will in general favour their own courses of action over those of others, and hence an emergent selfishness occurs even in ostensibly cooperative systems...
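The agent-function concept described above lends itself to a small illustration. Below is a minimal, hypothetical Python sketch of an agent whose autonomy consists of selecting its own operation from a percept; none of these names or structures come from the thesis itself.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

Percept = TypeVar("Percept")
Action = TypeVar("Action")

class Agent(ABC, Generic[Percept, Action]):
    """An autonomous agent: its agent function maps percepts to its own
    choice of action, which is what distinguishes it from a plain object."""

    @abstractmethod
    def agent_function(self, percept: Percept) -> Action:
        """Select an action given the current environmental percept."""

class Environment(Generic[Percept, Action]):
    """Shared environment for a multi-agent system (illustrative only)."""

    def __init__(self, agents: list[Agent[Percept, Action]]):
        self.agents = agents

    def step(self, percept: Percept) -> list[Action]:
        # Each agent selects its own action; emergent behaviour arises
        # from the combination of these independent selections.
        return [a.agent_function(percept) for a in self.agents]
```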
3

Study of Single and Ensemble Machine Learning Models on Credit Data to Detect Underlying Non-performing Loans

Li, Qiongzhu January 2016 (has links)
In this paper, we compare the performance of two feature dimension reduction methods, the LASSO and PCA. Both the simulation study and the empirical study show that the LASSO is superior to PCA when selecting significant variables. We apply Logistic Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), Decision Tree (DT) and their corresponding ensemble machines constructed by bagging and adaptive boosting (adaboost) in our study. Three experiments are conducted to explore the impact of class-unbalanced data sets on all models. The empirical study indicates that when the percentage of performing loans exceeds 83.3%, the trained models should be applied with caution. When we have a class-balanced data set, ensemble machines indeed perform better than single machines. The weaker the single machine, the more obvious the improvement we can observe.
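As a rough illustration of the single-versus-ensemble comparison above, the following scikit-learn sketch contrasts a single decision tree with its bagged and adaboosted counterparts on a synthetic class-unbalanced sample; the thesis's actual credit data, models, and tuning are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic class-unbalanced "loan" data: ~85% performing, ~15% non-performing.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.85, 0.15], random_state=0)

single = DecisionTreeClassifier(max_depth=3, random_state=0)
bagged = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                           n_estimators=50, random_state=0)
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                             n_estimators=50, random_state=0)

for name, model in [("single tree", single), ("bagging", bagged), ("adaboost", boosted)]:
    # ROC AUC is less misleading than accuracy on unbalanced classes.
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```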
4

Dimension Reduction and LASSO using Pointwise and Group Norms

Jutras, Melanie A 11 December 2018 (has links)
Principal Components Analysis (PCA) is a statistical procedure commonly used for the purpose of analyzing high dimensional data. It is often used for dimensionality reduction, which is accomplished by determining orthogonal components that contribute most to the underlying variance of the data. While PCA is widely used for identifying patterns and capturing variability of data in lower dimensions, it has some known limitations. In particular, PCA represents its results as linear combinations of data attributes. PCA is therefore often seen as difficult to interpret, and because of the underlying optimization problem being solved, it is not robust to outliers. In this thesis, we examine extensions to PCA that address these limitations. Specific techniques researched in this thesis include variations of Robust and Sparse PCA as well as novel combinations of these two methods which result in a structured low-rank approximation that is robust to outliers. Our work is inspired by the well-known machine learning methods of the Least Absolute Shrinkage and Selection Operator (LASSO) as well as pointwise and group matrix norms. Practical applications, including robust and non-linear methods for anomaly detection in Domain Name System network data as well as interpretable feature selection with respect to a website classification problem, are discussed along with implementation details and techniques for analysis of regularization parameters.
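The LASSO-inspired sparsity discussed above can be illustrated with scikit-learn's SparsePCA, which penalizes loadings with an L1 term so that each component involves only a few attributes. This is a generic sketch of the sparsity idea, not the thesis's robust or group-norm formulation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, SparsePCA

X = load_iris().data
X = X - X.mean(axis=0)  # center before decomposition

dense = PCA(n_components=2).fit(X)
sparse = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)

# PCA loadings are dense linear combinations of all attributes;
# SparsePCA zeros out small loadings, trading a little explained
# variance for interpretability.
print("PCA loadings:\n", np.round(dense.components_, 2))
print("SparsePCA loadings:\n", np.round(sparse.components_, 2))
```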
5

Dimension Reduction and Clustering of High Dimensional Data using a Mixture of Generalized Hyperbolic Distributions

Pathmanathan, Thinesh January 2018 (has links)
Model-based clustering is a probabilistic approach that views each cluster as a component in an appropriate mixture model. The Gaussian mixture model is one of the most widely used model-based methods. However, this model tends to perform poorly when clustering high-dimensional data due to the over-parametrized solutions that arise in high-dimensional spaces. This work instead considers the approach of combining dimension reduction techniques with clustering via a mixture of generalized hyperbolic distributions. The dimension reduction techniques principal component analysis and factor analysis, along with their extensions, were reviewed. Then the aforementioned dimension reduction techniques were individually paired with the mixture of generalized hyperbolic distributions in order to demonstrate the clustering performance achieved under each method using both simulated and real data sets. For a majority of the data sets, the clustering method utilizing principal component analysis exhibited better classification results compared to the clustering method based on extending the factor analysis model. / Thesis / Master of Science (MSc)
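A hedged sketch of the "reduce, then cluster with a mixture model" pipeline follows. scikit-learn provides no generalized hyperbolic mixture (that model lives in R packages such as MixGHD), so a Gaussian mixture stands in purely to show the structure of the approach.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

data = load_wine()
X = StandardScaler().fit_transform(data.data)

# Step 1: dimension reduction to tame over-parametrization.
Z = PCA(n_components=3).fit_transform(X)

# Step 2: model-based clustering on the reduced representation.
# (Gaussian here; the thesis uses generalized hyperbolic components.)
labels = GaussianMixture(n_components=3, random_state=0).fit_predict(Z)
print("ARI vs. true classes:", round(adjusted_rand_score(data.target, labels), 3))
```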
6

Explainable Interactive Projections for Image Data

Han, Huimin 12 January 2023 (has links)
Making sense of large collections of images is difficult. Dimension reductions (DR) assist by organizing images in a 2D space based on similarities, but provide little support for explaining why images were placed together or apart in the 2D space. Additionally, they do not provide support for modifying and updating the 2D space to explore new relationships and organizations of images. To address these problems, we present an interactive DR method for images that uses visual features extracted by a deep neural network to project the images into 2D space and provides visual explanations of image features that contributed to the 2D location. In addition, it allows people to directly manipulate the 2D projection space to define alternative relationships and explore subsequent projections of the images. With an iterative cycle of semantic interaction and explainable-AI feedback, people can explore complex visual relationships in image data. Our approach to human-AI interaction integrates visual knowledge from both human mental models and pre-trained deep neural models to explore image data. Two usage scenarios are provided to demonstrate that our method is able to capture human feedback and incorporate it into the model. Our visual explanations help bridge the gap between the feature space and the original images to illustrate the knowledge learned by the model, creating a synergy between human and machine that facilitates a more complete analysis experience. / Master of Science / High-dimensional data is everywhere: spreadsheets with many columns, text documents, images, and so on. Exploring and visualizing high-dimensional data can be challenging. Dimension reduction (DR) techniques can help: high-dimensional data can be projected into 3D or 2D space and visualized as a scatter plot. Additionally, a DR tool can be interactive to help users better explore the data and understand the underlying algorithms. Designing such an interactive DR tool is challenging for images. To address this problem, this thesis presents a tool that visualizes images in a 2D plot, where data points considered similar are projected close to each other and vice versa. Users can manipulate images directly on this scatterplot-like visualization based on their own knowledge to update the display, and saliency maps are provided to reflect the model's re-projection reasoning.
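The projection pipeline described above, deep features followed by a 2D embedding, can be sketched as follows. The interactive manipulation and saliency-map explanations that are the heart of the thesis are not shown, and random tensors stand in for a real preprocessed image batch.

```python
import torch
from sklearn.manifold import TSNE
from torchvision.models import ResNet18_Weights, resnet18

model = resnet18(weights=ResNet18_Weights.DEFAULT)
model.fc = torch.nn.Identity()  # keep the 512-d feature vector, drop the classifier
model.eval()

images = torch.randn(64, 3, 224, 224)  # stand-in for preprocessed images
with torch.no_grad():
    features = model(images).numpy()   # (64, 512) visual features

# Project to 2D: images that are similar in feature space land near
# each other in the scatter-plot layout.
coords = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(features)
print(coords.shape)  # (64, 2) layout for a scatter-plot view
```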
7

Dimension Reduction and Clustering for Interactive Visual Analytics

Wenskovitch Jr, John Edward 06 September 2019 (has links)
When exploring large, high-dimensional datasets, analysts often utilize two techniques for reducing the data to make exploration more tractable. The first technique, dimension reduction, reduces the high-dimensional dataset into a low-dimensional space while preserving high-dimensional structures. The second, clustering, groups similar observations while simultaneously separating dissimilar observations. Existing work presents a number of systems and approaches that utilize these techniques; however, these techniques can cooperate or conflict in unexpected ways. The core contribution of this work is the systematic examination of the design space at the intersection of dimension reduction and clustering when building intelligent, interactive tools in visual analytics. I survey existing techniques for dimension reduction and clustering algorithms in visual analytics tools, and I explore the design space for creating projections and interactions that include dimension reduction and clustering algorithms in the same visual interface. Further, I implement and evaluate three prototype tools that implement specific points within this design space. Finally, I run a cognitive study to understand how analysts perform dimension reduction (spatialization) and clustering (grouping) operations. Contributions of this work include surveys of existing techniques, three interactive tools and usage cases demonstrating their utility, design decisions for implementing future tools, and a presentation of complex human organizational behaviors. / Doctor of Philosophy / When an analyst is exploring a dataset, they seek to gain insight from the data. With data sets growing larger, analysts require techniques to help them reduce the size of the data while still maintaining its meaning. Two commonly-utilized techniques are dimension reduction and clustering. Dimension reduction seeks to eliminate unnecessary features from the data, reducing the number of columns to a smaller number. Clustering seeks to group similar objects together, reducing the number of rows to a smaller number. The contribution of this work is to explore how dimension reduction and clustering are currently being used in interactive visual analytics systems, as well as to explore how they could be used to address challenges faced by analysts in the future. To do so, I survey existing techniques and explore the design space for creating visualizations that incorporate both types of computations. I look at methods by which an analyst could interact with those projections in order to communicate their interests to the system, thereby producing visualizations that better match the needs of the analyst. I develop and evaluate three tools that incorporate both dimension reduction and clustering in separate computational pipelines. Finally, I conduct a cognitive study to better understand how users think about these operations, in order to create guidelines for better systems in the future.
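The "separate computational pipelines" pattern examined above can be sketched in a few lines: dimension reduction determines where points are drawn, while clustering, run independently in the full-dimensional space, determines how they are grouped (here, colored). This is a generic illustration, not one of the dissertation's prototype tools.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data

coords = PCA(n_components=2).fit_transform(X)  # layout pipeline (columns)
groups = KMeans(n_clusters=10, n_init=10,
                random_state=0).fit_predict(X)  # grouping pipeline (rows)

# The two pipelines meet only in the final view.
plt.scatter(coords[:, 0], coords[:, 1], c=groups, s=8, cmap="tab10")
plt.title("PCA layout colored by k-means clusters")
plt.show()
```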
8

On Sufficient Dimension Reduction via Asymmetric Least Squares

Soale, Abdul-Nasah, 0000-0003-2093-7645 January 2021 (has links)
Accompanying the advances in computer technology is an increasing collection of high-dimensional data in many scientific and social studies. Sufficient dimension reduction (SDR) is a statistical method that enables us to reduce the dimension of predictors without loss of regression information. In this dissertation, we introduce principal asymmetric least squares (PALS) as a unified framework for linear and nonlinear sufficient dimension reduction. Classical methods such as sliced inverse regression (Li, 1991) and principal support vector machines (Li, Artemiou and Li, 2011) often do not perform well in the presence of heteroscedastic error, while our proposal addresses this limitation by synthesizing different expectile levels. Through extensive numerical studies, we demonstrate the superior performance of PALS in terms of both computation time and estimation accuracy. For the asymptotic analysis of PALS for linear sufficient dimension reduction, we develop new tools to compute the derivative of an expectation of a non-Lipschitz function. PALS is not designed to handle a symmetric link function between the response and the predictors. As a remedy, we develop expectile-assisted inverse regression estimation (EA-IRE) as a unified framework for moment-based inverse regression. We propose to first estimate the expectiles through kernel expectile regression, and then carry out dimension reduction based on random projections of the regression expectiles. Several popular inverse regression methods in the literature, including sliced inverse regression, sliced average variance estimation, and directional regression, are extended under this general framework. The proposed expectile-assisted methods outperform existing moment-based dimension reduction methods in both numerical studies and an analysis of the Big Mac data. / Statistics
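The asymmetric least squares idea underlying PALS can be made concrete with a toy expectile computation: the expectile at level tau minimizes a squared loss that weights positive residuals by tau and negative ones by 1 - tau. The fixed-point iteration below is illustrative only and is not the thesis's estimator.

```python
import numpy as np

def expectile(x, tau=0.5, tol=1e-10, max_iter=100):
    """Sample expectile via iteratively reweighted means."""
    m = x.mean()
    for _ in range(max_iter):
        w = np.where(x > m, tau, 1.0 - tau)  # asymmetric squared-loss weights
        m_new = np.sum(w * x) / np.sum(w)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

rng = np.random.default_rng(0)
x = rng.exponential(size=10_000)  # skewed sample
print(expectile(x, 0.5))  # tau = 0.5 recovers the mean (~1.0)
print(expectile(x, 0.9))  # higher tau shifts toward the upper tail
```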
9

Sufficient Dimension Reduction in Complex Datasets

Yang, Chaozheng January 2016 (has links)
This dissertation focuses on two problems in dimension reduction. One is using a permutation approach to test predictor contribution. The permutation approach applies to marginal coordinate tests based on dimension reduction methods such as SIR, SAVE and DR. This approach no longer requires calculation of the method-specific weights to determine the asymptotic null distribution. The other is combining a clustering method with robust regression (least absolute deviation) to estimate the dimension reduction subspace. Compared with ordinary least squares, the proposed method is more robust to outliers; also, this method replaces the global linearity assumption with the more flexible local linearity assumption through k-means clustering. / Statistics
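A generic sketch of the permutation idea follows: permute one predictor to break its link with the response, recompute a fit statistic, and compare against the observed value. The dissertation applies this to marginal coordinate tests for SIR, SAVE and DR, which is considerably more involved than this toy linear-regression version.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + rng.normal(size=n)  # only predictor 0 matters

def r2(X, y):
    return LinearRegression().fit(X, y).score(X, y)

observed = r2(X, y)
drops = []
for _ in range(500):
    Xp = X.copy()
    Xp[:, 0] = rng.permutation(Xp[:, 0])  # destroy predictor 0's signal
    drops.append(observed - r2(Xp, y))

# p-value: how often does a permuted fit do as well as the observed one?
p = np.mean(np.array(drops) <= 0)
print(f"R^2 drop when permuting predictor 0: {np.mean(drops):.3f}, p ~ {p:.3f}")
```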
10

Mixture models for clustering and dimension reduction

Verbeek, Jakob 08 December 2004 (has links) (PDF)
In Chapter 1 we give a general introduction and motivate the need for clustering and dimension reduction methods. We continue in Chapter 2 with a review of different types of existing clustering and dimension reduction methods.

In Chapter 3 we introduce mixture densities and the expectation-maximization (EM) algorithm to estimate their parameters. Although the EM algorithm has many attractive properties, it is not guaranteed to return optimal parameter estimates. We present greedy EM parameter estimation algorithms which start with a one-component mixture and then iteratively add a component to the mixture and re-estimate the parameters of the current mixture. Experimentally, we demonstrate that our algorithms avoid many of the sub-optimal estimates returned by the EM algorithm. Finally, we present an approach to accelerate mixture density estimation from many data points. We apply this approach to both the standard EM algorithm and our greedy EM algorithm.

In Chapter 4 we present a non-linear dimension reduction method that uses a constrained EM algorithm for parameter estimation. Our approach is similar to Kohonen's self-organizing map, but in contrast to the self-organizing map, our parameter estimation algorithm is guaranteed to converge and optimizes a well-defined objective function. In addition, our method allows data with missing values to be used for parameter estimation, and it is readily applied to data that is not specified by real numbers but, for example, by discrete variables. We present the results of several experiments to demonstrate our method and to compare it with Kohonen's self-organizing map.

In Chapter 5 we consider an approach for non-linear dimension reduction which is based on a combination of clustering and linear dimension reduction. This approach forms one global non-linear low-dimensional data representation by combining multiple, locally valid, linear low-dimensional representations. We derive an improvement of the original parameter estimation algorithm, which requires less computation and leads to better parameter estimates. We experimentally compare this approach to several other dimension reduction methods. We also apply this approach to a setting where high-dimensional 'outputs' have to be predicted from high-dimensional 'inputs'. Experimentally, we show that the considered non-linear approach leads to better predictions than a similar approach which also combines several local linear representations, but does not combine them into one global non-linear representation.

In Chapter 6 we summarize our conclusions and discuss directions for further research.
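The greedy component-adding idea can be caricatured in a few lines: grow the mixture one component at a time while a model-selection score keeps improving. The thesis's greedy EM inserts a new component into the current fit and re-estimates; refitting from scratch at each size, as below, is a crude stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data

best_k, best_bic = 1, float("inf")
for k in range(1, 9):
    # Fit a k-component Gaussian mixture and score it with BIC.
    bic = GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
    if bic < best_bic:
        best_k, best_bic = k, bic
    else:
        break  # stop growing once the score worsens
print(f"selected {best_k} components (BIC = {best_bic:.1f})")
```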
