  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

An Empirical Study of Novel Approaches to Dimensionality Reduction and Applications

Nsang, Augustine S. 23 September 2011 (has links)
No description available.
102

Generalized Principal Component Analysis: Dimensionality Reduction through the Projection of Natural Parameters

Landgraf, Andrew J. 15 October 2015 (has links)
No description available.
103

Machine-Based Interpretation and Classification of Image-Derived Features: Applications in Digital Pathology and Multi-Parametric MRI of Prostate Cancer

Ginsburg, Shoshana 31 May 2016 (has links)
No description available.
104

Organization of Electronic Dance Music by Dimensionality Reduction / Organisering av Elektronisk Dans Musik genom Dimensionsreducering

Tideman, Victor January 2022 (has links)
This thesis aims to produce a similarity metric for tracks in the genre Electronic Dance Music by taking a high-dimensional data representation of each track and then projecting it into a low-dimensional embedded space (2D and 3D) with two Dimensionality Reduction (DR) techniques: t-distributed stochastic neighbor embedding (t-SNE) and Pairwise Controlled Manifold Approximation (PaCMAP). A content-based approach is taken to identify similarity, which is defined as the distance between points in the embedded space. This work strives to explore the connection between the extractable content of a track and its feel. Features are extracted from every track over a 30-second window with Digital Signal Processing tools. Three evaluation methods were conducted with the purpose of establishing ground truth in the data. The first established expected similarity sub-clusters and tuned the DR techniques until the expected clusters appeared in the visualisations of the embedded space. The second generated new tracks with a controlled level of separation by applying various distortion techniques of increasing magnitude to copies of a track. The third introduced a data set with annotated valence and arousal scores for music snippets, which was used to train estimators for estimating the feeling of tracks and performing classification. Lastly, a similarity metric was computed based on distances in the embedded space. Findings suggest that certain contextual groups, such as remixes and tracks by the same artist, can be identified with this metric, and that tracks with small distortions (similar tracks) lie closer in the embedded space than tracks with large distortions.
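As a rough illustration of the pipeline this abstract describes — extract features over a 30-second window, project them with a DR technique, and read similarity off distances in the embedded space — the sketch below uses MFCC summaries from librosa and scikit-learn's t-SNE. The feature choice, file names and parameters are assumptions for illustration only, not the thesis's actual implementation.

```python
# Hypothetical sketch of the embed-then-measure-distance idea described above,
# not the thesis implementation: MFCC summaries over a 30-second window,
# t-SNE to 2-D, and pairwise distances in the embedded space as the similarity metric.
import numpy as np
import librosa
from sklearn.manifold import TSNE
from scipy.spatial.distance import pdist, squareform

def track_features(path, duration=30.0):
    """Summarize one track as a fixed-length vector (mean/std of MFCCs)."""
    y, sr = librosa.load(path, duration=duration, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

paths = ["track_a.wav", "track_b.wav", "track_c.wav"]   # placeholder file names
X = np.vstack([track_features(p) for p in paths])       # high-dimensional representation

# Project to a 2-D embedded space; perplexity must be smaller than the number of tracks.
embedding = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(X)

# Similarity metric: Euclidean distance between tracks in the embedded space
# (smaller distance = more similar).
distances = squareform(pdist(embedding))
print(distances)
```

The resulting distance matrix plays the role of the similarity metric: the smaller an entry, the more similar the corresponding pair of tracks.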
105

Spatial-spectral analysis in dimensionality reduction for hyperspectral image classification

Shah, Chiranjibi 13 May 2022 (has links)
This dissertation develops new algorithms that utilize spatial and spectral information for hyperspectral image classification. Because hyperspectral imagery consists of a large number of spatial pixels along with hundreds of spectral dimensions, it is necessary to perform spatial-spectral analysis and conduct dimensionality reduction (DR) for effective feature extraction. The first proposed method employs spatial-aware collaboration-competition preserving graph embedding, imposing a spatial regularization term along with Tikhonov regularization in the objective function for DR of hyperspectral imagery. Moreover, collaborative representation (CR) is an efficient classifier, but it does not use spatial information; structure-aware collaborative representation (SaCRT) is therefore introduced to utilize spatial information for more appropriate data representations, and this work demonstrates that SaCRT offers better classification performance. For DR, a collaborative and low-rank representation-based graph for discriminant analysis of hyperspectral imagery is proposed. It generates a more informative graph by combining collaborative and low-rank representation terms: the collaborative term incorporates within-class atoms, while the low-rank term preserves the global data structure. Since the representation coefficients are estimated with a collaborative term, the closed-form solution results in lower computational complexity than sparse representation. The proposed collaborative and low-rank representation-based graph outperforms existing sparse and low-rank representation-based graphs for DR of hyperspectral imagery. Finally, tree-based techniques and deep neural networks are combined through an interpretable canonical deep tabular data learning architecture (TabNet), which uses sequential attention to choose appropriate features at different decision steps. An efficient TabNet for hyperspectral image classification is developed in this dissertation, in which the performance of TabNet is enhanced by incorporating a 2-D convolution layer inside an attentive transformer; better classification performance is additionally obtained by utilizing structure profiles with TabNet.
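For readers unfamiliar with collaborative representation, the sketch below shows the generic closed-form CR classifier with Tikhonov regularization that the abstract builds on; it is not the spatially aware SaCRT or the graph-based DR methods proposed in the dissertation, and the data shapes and regularization value are illustrative assumptions.

```python
# Minimal sketch of a collaborative representation classifier (CRC) with
# Tikhonov regularization -- the generic closed-form idea referenced above,
# not the spatially aware SaCRT variant proposed in the dissertation.
import numpy as np

def crc_classify(X_train, y_train, x_test, lam=1e-2):
    """Represent x_test collaboratively over all training atoms, then assign
    the class whose atoms give the smallest reconstruction residual."""
    D = X_train.T                                   # dictionary: columns are training samples
    # Closed-form coefficients: alpha = (D^T D + lam*I)^-1 D^T x
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x_test)
    residuals = {}
    for c in np.unique(y_train):
        mask = (y_train == c)
        residuals[c] = np.linalg.norm(x_test - D[:, mask] @ alpha[mask])
    return min(residuals, key=residuals.get)

# Toy data: 200 spectral bands, 3 classes (shapes are illustrative only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 200))
y_train = np.repeat([0, 1, 2], 20)
x_test = X_train[5] + 0.1 * rng.normal(size=200)
print(crc_classify(X_train, y_train, x_test))
```

The closed-form solve is what keeps CR cheaper than sparse representation, which requires an iterative L1 solver.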
106

Toward a General Novelty Detection Framework in Structural Health Monitoring; Challenges and Opportunities in Deep Learning

Soleimani-Babakamali, Mohammad Hesam 17 October 2022 (has links)
Structural health monitoring (SHM) is an anomaly detection process. Data-driven SHM has gained much attention compared to the model-based strategy, specifically with the current state-of-the-art machine learning routines. Model-based methods require structural information and time-consuming model updating, and may fail with noisy data, a persistent condition in real-time SHM problems. However, several hindrances exist in both supervised and unsupervised machine learning-based SHM. This study identifies and addresses those hindrances using the versatility of state-of-the-art deep learning strategies, while aiming to propose a general, structure-independent (i.e., requiring no prior information) SHM framework. Developing such techniques plays a crucial role in the SHM of smart cities. In the supervised SHM and sensor output validation (SOV) category, data class imbalance results from the lack of data for nuanced structural states. Employing Long Short-Term Memory (LSTM) units, we developed a general technique that manages both SHM and SOV. The developed architecture accepts high-dimensional features, enabling the training of Generative Adversarial Networks (GANs) for data generation and addressing the complications of data imbalance. GAN-generated SHM data improved accuracy for low-sampled classes from 44.77% to 64.58% and from 73.39% to 90.84% in two SOV and SHM case studies, respectively. Arguing that unsupervised SHM is the more practical category, since it identifies novelties (i.e., unseen states), we investigate the current application of dimensionality reduction (DR) in unsupervised SHM. Due to the curse of dimensionality, classical unsupervised techniques cannot function with high-dimensional features, driving the use of DR techniques. Our investigations highlighted the importance of avoiding DR in unsupervised SHM, as the data dimensions that DR suppresses may contain damage-sensitive features for novelties: with DR, novelty detection accuracy declined by up to 60% in two benchmark SHM datasets. Other obstacles in the unsupervised SHM area are case-dependent features, the lack of dynamic-class novelty detection, and the impact of user-defined detection parameters on novelty detection accuracy. We chose the fast Fourier transform (FFT) of raw signals, with no dimensionality reduction, as the feature on which to build the SHM framework, and developed a deep neural network scheme to perform pattern recognition on that high-dimensional data. The framework requires no prior information and, with GAN models implemented, offers robustness to sensor placement in structures. These characteristics make the framework suitable for developing general unsupervised SHM techniques. / Doctor of Philosophy / Detecting abnormal behaviors in structures from the input signals of sensors is called structural health monitoring (SHM). The vibrational characteristics of signals or direct pattern recognition techniques can be applied to detect anomalies in a data-driven scheme. Machine learning (ML) tools are suitable for data-driven methods; however, challenges exist in both supervised and unsupervised ML-based SHM. Recent improvements in deep learning are employed in this study to address such obstacles after their identification. In supervised learning, data points for the normal state of structures are abundant, so datasets are usually imbalanced; the same issue arises for sensor output validation (SOV), which must take place before SHM in order to remove anomalous sensor outputs.
First, a unified decision-making system for SHM and SOV problems is proposed, and data imbalance is then alleviated by generating new data objects for low-sampled classes. The proposed unified method is based on recurrent neural networks, and the generation mechanism is a Generative Adversarial Network (GAN). Results indicate improvements in accuracy metrics for the minority data classes. For unsupervised SHM, four major issues are identified. The first two, data loss during feature extraction and the case-dependency of such extraction schemes, are solved by taking the fast Fourier transform (FFT) of signals as the features, with no reduction in their dimensionality. The other two obstacles are the lack of discrimination between different novel classes (i.e., only two classes, damaged and undamaged) and the effect of user-defined detection parameters on the SHM analysis. These two predicaments are addressed by generating new data objects online from the incoming signal stream with a GAN, and by tuning the detection system on the GAN-generated data so that its performance is consistent across the user-defined parameters. The proposed unsupervised technique is further improved to be insensitive to sensor placement on structures by employing recurrent neural network layers in the GAN architecture and using a GAN with an overfitted discriminator.
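A minimal sketch of the reduction-free, FFT-based feature idea the framework relies on is given below: raw sensor windows are mapped to high-dimensional frequency-domain features without any dimensionality reduction step. The signal shapes and the normalization are assumptions for illustration; the dissertation's GAN and LSTM models are not reproduced here.

```python
# Sketch of the reduction-free FFT feature idea described above: raw sensor
# signals become high-dimensional frequency-domain features that are fed to a
# detector directly, with no dimensionality reduction. Shapes and the
# normalization are illustrative assumptions, not the dissertation's architecture.
import numpy as np

def fft_features(signals):
    """signals: (n_windows, n_samples) array of raw sensor windows.
    Returns the one-sided FFT magnitude spectrum of each window."""
    spectra = np.abs(np.fft.rfft(signals, axis=1))
    return spectra / spectra.max(axis=1, keepdims=True)   # per-window normalization

# Toy accelerometer-like data: 100 windows of 1024 samples each.
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 1024))
X = fft_features(raw)          # shape (100, 513): high-dimensional input, no DR
print(X.shape)
```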
107

Dimensionality Reduction, Feature Selection and Visualization of Biological Data

Ha, Sook Shin 14 September 2012 (has links)
Due to the high dimensionality of most biological data, it is difficult to directly analyze, model and visualize the data to gain biological insight. Dimensionality reduction therefore becomes an imperative pre-processing step in analyzing and visualizing high-dimensional biological data. The two major approaches to dimensionality reduction in genomic analysis and biomarker identification studies are feature extraction, which creates new features by combining existing ones through a mapping technique, and feature selection, which chooses an optimal subset of all features based on an objective function. In this dissertation, we show how our innovative reduction schemes effectively reduce the dimensionality of DNA gene expression data to extract biologically interpretable and relevant features, which enhances the biomarker identification process. To construct biologically interpretable features and facilitate Muscular Dystrophy (MD) subtype classification, we extract molecular features from MD microarray data by constructing sub-networks using a novel integrative scheme that utilizes protein-protein interaction (PPI) networks, functional gene set information and mRNA profiling data. The workflow includes three major steps: first, by combining PPI network structure and gene-gene co-expression relationships into a new distance metric, we apply affinity propagation clustering (APC) to build gene sub-networks; second, we further incorporate functional gene set knowledge to complement the physical interaction information; finally, based on the constructed sub-network and gene set features, we apply a multi-class support vector machine (MSVM) for MD sub-type classification and highlight the biomarkers contributing to the sub-type prediction. The experimental results show that our scheme constructs sub-networks that are more relevant to MD than those constructed by the conventional approach. Furthermore, our integrative strategy substantially improved the prediction accuracy, especially for the 'hard-to-classify' sub-types. Conventionally, pathway-based analysis assumes that genes in a pathway contribute equally to a biological function and thus assigns uniform weights to genes. However, this assumption has been shown to be incorrect, and applying uniform weights in pathway analysis may not be adequate for tasks like molecular classification of diseases, as genes in a functional group may have different differential power. Hence, we propose using non-uniform weights in pathway analysis, which resulted in the development of four weighting schemes. We applied them in two existing pathway analysis methods using both real and simulated gene expression data for pathways. Weighting changes pathway scoring and brings up new significant pathways, leading to the detection of disease-related genes that are missed under uniform weighting. To better understand our MD expression data and derive scientific insight from it, we explored a suite of visualization tools. In particular, for selected top-performing MD sub-networks, we displayed the network view using Cytoscape; functional annotations using the IPA and DAVID functional analysis tools; expression patterns using heat maps and parallel coordinates plots; and MD-associated pathways using KEGG pathway diagrams.
We also performed weighted MD pathway analysis and identified overlapping sub-networks across the different weighting schemes and MD subtypes using Venn diagrams, which resulted in the identification of a new sub-network significantly associated with MD. All of this graphically displayed data and information helped us better understand our MD data and the MD subtypes, resulting in the identification of several potentially MD-associated biomarker pathways and genes. / Ph. D.
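The sub-network construction step described above — fusing PPI structure with gene-gene co-expression into a single similarity and clustering it with affinity propagation — can be sketched as follows. The simple averaging fusion, the random data and the default parameters are assumptions for illustration; the dissertation's actual distance metric and functional gene-set integration are not reproduced.

```python
# Rough sketch of the sub-network construction idea described above: fuse
# gene-gene co-expression with a protein-protein interaction (PPI) adjacency
# into one similarity matrix and cluster it with affinity propagation.
# The fusion rule and the random data are illustrative assumptions only.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
n_genes, n_samples = 50, 30
expression = rng.normal(size=(n_genes, n_samples))           # toy mRNA profiles
ppi = (rng.random((n_genes, n_genes)) > 0.9).astype(float)   # toy PPI adjacency
ppi = np.maximum(ppi, ppi.T)                                  # make it symmetric

coexpr = np.corrcoef(expression)                              # gene-gene co-expression
similarity = 0.5 * coexpr + 0.5 * ppi                         # hypothetical fusion weighting

ap = AffinityPropagation(affinity="precomputed", random_state=0)
labels = ap.fit_predict(similarity)                           # each cluster ~ one gene sub-network
print("number of sub-networks:", len(set(labels)))
```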
108

On the Effectiveness of Dimensionality Reduction for Unsupervised Structural Health Monitoring Anomaly Detection

Soleimani-Babakamali, Mohammad Hesam 19 April 2022 (has links)
Dimensionality reduction (DR) techniques enhance data interpretability and reduce space complexity, though at the cost of information loss. Such methods have been prevalent in the Structural Health Monitoring (SHM) anomaly detection literature. While DR is favorable in supervised anomaly detection, where possible novelties are known a priori, its efficacy is less clear in unsupervised detection. In this work, we perform a detailed assessment of the DR performance trade-offs to determine whether the information loss imposed by DR can impact SHM performance for previously unseen novelties. As a basis for our analysis, we rely on an SHM anomaly detection method operating on the fast Fourier transform (FFT) of the input signals. The FFT is regarded as a raw, frequency-domain feature that allows studying various DR techniques. We design extensive experiments comparing various DR techniques, including neural autoencoder models, to capture their impact on two SHM benchmark datasets. Results imply that the loss of information is more detrimental than beneficial, reducing novelty detection accuracy by up to 60% with autoencoder-based DR. Regularization can alleviate some of these challenges, though its effect is unpredictable. Dimensions carrying substantial vibrational information mostly survive DR; thus, the impact of regularization suggests that these dimensions are not reliable damage-sensitive features for unseen faults. Consequently, we argue that designing new SHM anomaly detection methods that can work with high-dimensional raw features is a necessary research direction, and we present open challenges and future directions. / M.S. / Structural health monitoring (SHM) aids the timely maintenance of infrastructure, saving human lives and natural resources. Infrastructure will undergo unseen damage in the future, so data-driven SHM techniques that handle unlabeled data (i.e., unsupervised learning) are suitable for real-world usage. Lacking labels and defined data classes, data instances are categorized through similarities, i.e., distances; yet distance metrics in high-dimensional spaces can become meaningless. As a result, methods that reduce data dimensions are commonly applied, though at the cost of information loss. Naturally, a trade-off exists between the loss of information and the increased interpretability of the low-dimensional spaces induced by dimensionality reduction procedures. This study proposes an unsupervised SHM technique that works with both low- and high-dimensional data to assess that trade-off. Results show the negative impacts of dimensionality reduction to be more severe than its benefits. Developing unsupervised SHM methods on raw data is thus encouraged for real-world applications.
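One way to set up the kind of trade-off experiment described above is sketched below: the same one-class detector is trained on raw FFT features and on dimensionality-reduced features, and the fraction of an unseen state flagged as novel is compared. The synthetic signals, PCA (standing in for the thesis's autoencoder-based DR) and the one-class SVM are illustrative assumptions, not the thesis's setup.

```python
# Sketch of the DR trade-off experiment described above: train the same novelty
# detector on raw FFT features and on dimensionality-reduced features, then
# compare detection of an unseen (novel) state. The synthetic data, PCA (in
# place of the thesis's autoencoders) and one-class SVM are stand-ins.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
healthy = np.abs(np.fft.rfft(rng.normal(size=(200, 512)), axis=1))           # baseline spectra
novel = np.abs(np.fft.rfft(rng.normal(size=(50, 512)) * 1.5 + 0.5, axis=1))  # unseen state

def novelty_recall(train, test_novel):
    det = OneClassSVM(nu=0.05).fit(train)
    return float((det.predict(test_novel) == -1).mean())   # fraction flagged as novel

print("raw FFT features:", novelty_recall(healthy, novel))

pca = PCA(n_components=5).fit(healthy)                      # information-lossy DR
print("PCA-reduced features:",
      novelty_recall(pca.transform(healthy), pca.transform(novel)))
```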
109

Contributions to High-Dimensional Pattern Recognition

Villegas Santamaría, Mauricio 20 May 2011 (has links)
This thesis gathers some contributions to statistical pattern recognition, particularly targeted at problems in which the feature vectors are high-dimensional. Three pattern recognition scenarios are addressed, namely pattern classification, regression analysis and score fusion. For each of these, an algorithm for learning a statistical model is presented. In order to address the difficulty encountered when the feature vectors are high-dimensional, adequate models and objective functions are defined. The strategy of simultaneously learning a dimensionality reduction function and the pattern recognition model parameters is shown to be quite effective, making it possible to learn the model without discarding discriminative information. Another topic addressed in the thesis is the use of tangent vectors as a way to take better advantage of the available training data; using this idea, two popular discriminative dimensionality reduction techniques are effectively improved. For each of the algorithms proposed throughout the thesis, several data sets are used to illustrate the properties and the performance of the approaches. The empirical results show that the proposed techniques perform well, and furthermore the learned models tend to be very computationally efficient. / Villegas Santamaría, M. (2011). Contributions to High-Dimensional Pattern Recognition [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/10939
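The core idea of learning the dimensionality reduction function and the recognition model simultaneously, under a single objective, can be sketched in a few lines. The toy linear projection, softmax classifier and training loop below (written with PyTorch) are illustrative assumptions, not the algorithms proposed in the thesis.

```python
# Minimal sketch (not the thesis's algorithm): jointly learning a linear
# dimensionality-reduction matrix and a classifier by minimizing one
# discriminative objective, so no information useful for classification is
# discarded before the classifier sees it.
import torch
import torch.nn as nn

class JointDRClassifier(nn.Module):
    def __init__(self, in_dim, reduced_dim, n_classes):
        super().__init__()
        self.projection = nn.Linear(in_dim, reduced_dim, bias=False)  # DR function
        self.classifier = nn.Linear(reduced_dim, n_classes)           # model on reduced space

    def forward(self, x):
        return self.classifier(self.projection(x))

# Toy high-dimensional data (hypothetical shapes).
X = torch.randn(512, 1000)
y = torch.randint(0, 5, (512,))

model = JointDRClassifier(in_dim=1000, reduced_dim=32, n_classes=5)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    optim.zero_grad()
    loss = loss_fn(model(X), y)   # one objective trains both the projection and the classifier
    loss.backward()
    optim.step()
```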
110

Anomaly Detection in Time Series Data using Unsupervised Machine Learning Methods: A Clustering-Based Approach / Anomalidetektering av tidsseriedata med hjälp av oövervakad maskininlärningsmetoder: En klusterbaserad tillvägagångssätt

Hanna, Peter, Swartling, Erik January 2020 (has links)
For many companies in the manufacturing industry, finding damage in their products is a vital process, especially during the production phase. Since machine learning techniques can further aid damage identification, they have become a popular choice among companies seeking to enhance the production process even further. For some industries, damage identification is heavily linked with anomaly detection on different measurements. In this thesis, the aim is to construct unsupervised machine learning models to identify anomalies in unlabeled measurements of pumps, using high-frequency sampled current and voltage time series data. Each measurement can be split into five phases, namely the startup phase, three duty-point phases and lastly the shutdown phase. The approach is based on clustering methods, where the main algorithms used are the density-based algorithms DBSCAN and LOF. Dimensionality reduction techniques, such as feature extraction and feature selection, are applied to the data, and after constructing the five models, one for each phase, it can be seen that the models identify anomalies in the given data set. / For many companies in the manufacturing industry, fault-finding in products is a fundamental task in the production process. Since various machine learning methods offer useful techniques for finding faults in products, these methods are a popular choice among companies that want to improve the production process further. For some industries, fault detection is strongly linked to anomaly detection on different measurements. In this thesis, the aim is to construct unsupervised machine learning models to identify anomalies in time series data. More specifically, the data consists of high-frequency current and voltage measurements of pumps. The measurements consist of five phases, namely the startup phase, three duty-point phases and the shutdown phase. The machine learning methods are based on clustering techniques, and the methods used are the DBSCAN and LOF algorithms. In addition, various dimensionality reduction techniques were applied, and after constructing five models, one for each phase, it can be concluded that the models succeeded in identifying anomalies in the given data set.
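A hedged sketch of the clustering-based detection approach described above follows: features are reduced with PCA, then DBSCAN's noise points and LOF's outliers are taken as anomalies. The synthetic features, the choice of PCA and the parameter values are placeholders, not the thesis's tuned per-phase models.

```python
# Rough sketch of the clustering-based detection approach described above:
# reduce extracted features with PCA, then flag anomalies with DBSCAN
# (noise points) and LOF. The synthetic features and parameter values are
# illustrative placeholders, not the thesis's tuned models.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(300, 40))        # features from one pump phase
anomalous = rng.normal(4.0, 1.0, size=(10, 40))      # injected anomalies
X = PCA(n_components=5).fit_transform(np.vstack([normal, anomalous]))

db_labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(X)
lof_labels = LocalOutlierFactor(n_neighbors=20).fit_predict(X)

print("DBSCAN anomalies:", int((db_labels == -1).sum()))   # noise points
print("LOF anomalies:", int((lof_labels == -1).sum()))
```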
