1 |
Reducing the dimensionality of hyperspectral remotely sensed data with applications for maximum likelihood image classification
Santich, Norman Ty, January 2007
Alongside the many benefits of the evolution of multispectral sensors into hyperspectral sensors, there is a considerable increase in the storage space and computational load needed to process the data. Consequently, the remote sensing community is investigating and developing statistical methods to alleviate these problems.

The research presented here investigates several approaches to reducing the dimensionality of hyperspectral remotely sensed data while maintaining the levels of accuracy achieved using the full dimensionality of the data. It was conducted with an emphasis on applications in maximum likelihood classification (MLC) of hyperspectral image data. An inherent characteristic of hyperspectral data is that adjacent bands are typically highly correlated, which results in a high level of redundancy in the data. The high correlations between adjacent bands can be exploited to realise significant reductions in the dimensionality of the data for a negligible reduction in classification accuracy.

The high correlations between neighbouring bands are related to the large overlap between their response functions. The spectral band filter functions were modelled for the HyMap instrument that acquired the hyperspectral data used in this study. The results were compared with measured filter function data from a similar, more recent HyMap instrument. They indicated that, on average, HyMap spectral band filter functions overlap with their neighbouring bands by approximately 60%. This is considerable and partly accounts for the high correlation between neighbouring spectral bands on hyperspectral instruments.

A hyperspectral HyMap image acquired over an agricultural region in the south-west of Western Australia was used for this research. The image is composed of 512 × 512 pixels, with each pixel having a spatial resolution of 3.5 m.
The data was initially reduced from 128 spectral bands to 82 spectral bands by removing the highly overlapping spectral bands, those exhibiting high levels of noise, and those located at strong atmospheric absorption wavelengths. The image was examined and found to contain 15 distinct spectral classes. Training data was selected for each of these classes, and class spectral means and covariance matrices were generated.

The discriminant function for MLC makes use of not only the measured pixel spectra but also the sample class covariance matrices. This thesis first examines reducing the parameterization of these covariance matrices for use by the MLC algorithm. The full dimensional spectra are still used for the classification, but the number of parameters needed to describe the covariance information is significantly reduced. When a threshold of 0.04 was used in conjunction with the partial correlation matrices to identify low values in the inverse covariance matrices, the resulting classification accuracy was 96.42%. This was achieved using only 68% of the elements in the original covariance matrices.

Both wavelet techniques and cubic splines were investigated as a means of representing the measured pixel spectra with considerably fewer bands. Of the different mother wavelets used, the Daubechies-4 wavelet performed slightly better than the Haar and Daubechies-6 wavelets at generating accurate spectra with the fewest parameters. The wavelet techniques investigated modelled the spectra more accurately than cubic splines with various knot selection approaches. A backward stepwise knot selection technique was found to be more effective at approximating the spectra than regularly spaced knots. A forward stepwise selection technique was investigated but was determined to be unsuited to this process.
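The covariance-thinning step described above can be illustrated with a minimal sketch. It assumes the standard relation between partial correlations and the inverse covariance (precision) matrix, and simply zeroes off-diagonal precision entries whose partial correlation falls below the threshold. The function names and the toy 3-band matrix are hypothetical; the thesis's exact procedure is not given in the abstract.

```python
import math


def partial_correlations(precision):
    """Partial correlations from a precision (inverse covariance) matrix.

    Uses the standard identity rho_ij = -p_ij / sqrt(p_ii * p_jj).
    """
    n = len(precision)
    return [
        [
            1.0 if i == j
            else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def sparsify_precision(precision, threshold=0.04):
    """Zero off-diagonal entries whose |partial correlation| < threshold.

    Returns the thinned matrix and the fraction of non-zero elements kept.
    """
    n = len(precision)
    rho = partial_correlations(precision)
    sparse = [
        [
            precision[i][j] if i == j or abs(rho[i][j]) >= threshold else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    kept = sum(1 for row in sparse for value in row if value != 0.0)
    return sparse, kept / (n * n)


def mahalanobis_sq(pixel, mean, precision):
    """Quadratic term (x - m)^T P (x - m) of the Gaussian discriminant."""
    diff = [x - m for x, m in zip(pixel, mean)]
    n = len(diff)
    return sum(diff[i] * precision[i][j] * diff[j]
               for i in range(n) for j in range(n))


# Hypothetical 3-band precision matrix for one class (illustration only).
toy_precision = [
    [2.0, -0.02, -1.0],
    [-0.02, 2.0, 0.0],
    [-1.0, 0.0, 2.0],
]
thinned, fraction_kept = sparsify_precision(toy_precision, threshold=0.04)
```

In this toy case the weak coupling between bands 0 and 1 is dropped while the strong coupling between bands 0 and 2 is retained. A thinned matrix is not guaranteed to stay positive definite, so a real classifier would need to check that before using it in the discriminant.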
All approaches were adapted to process an entire hyperspectral image, and the resulting images were classified using MLC. Wavelet approximation coefficients gave slightly better classification results than wavelet detail coefficients, and the Haar wavelet proved to be the superior wavelet for classification purposes. With 6 approximation coefficients, the Haar wavelet could be used to classify the data with an accuracy of 95.6%. For 11 approximation coefficients this figure increased to 96.1%.

First and second derivative spectra were also used in the classification of the image. The first and second derivatives were determined for each of the class spectral means, and for each band the standard deviations of both the first and second derivatives were calculated. Bands were then ranked in order of decreasing standard deviation. Bands showing the highest standard deviations were identified, and the derivatives were generated for the entire image at these wavelengths. The resulting first and second derivative images were then classified using MLC. Using 25 spectral bands, classification accuracies of approximately 96% and 95% were achieved using the first and second derivative images respectively. These results are comparable with those obtained using wavelets, although wavelets produced higher classification accuracies when fewer coefficients were used.
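As a rough illustration of how Haar approximation coefficients shrink a spectrum, the sketch below repeatedly replaces neighbouring pairs of values with their scaled sum. The function name, the orthonormal 1/sqrt(2) scaling, and the rule of padding odd-length signals by repeating the last sample are assumptions made for this example; the abstract does not state which implementation or boundary handling the thesis used.

```python
import math


def haar_approximation(spectrum, levels):
    """Return Haar approximation coefficients after `levels` decompositions."""
    coeffs = [float(value) for value in spectrum]
    for _ in range(levels):
        if len(coeffs) % 2 == 1:
            # Assumption: pad odd-length signals by repeating the last sample.
            coeffs.append(coeffs[-1])
        # Each approximation coefficient is the scaled sum of one pair.
        coeffs = [
            (coeffs[k] + coeffs[k + 1]) / math.sqrt(2.0)
            for k in range(0, len(coeffs), 2)
        ]
    return coeffs


# Hypothetical 82-band spectrum (synthetic values, not HyMap data).
toy_spectrum = [0.1 * (band % 7) for band in range(82)]
lengths = [len(haar_approximation(toy_spectrum, level)) for level in (1, 2, 3, 4)]
```

Under this padding assumption an 82-band spectrum reduces to 41, 21, 11 and then 6 coefficients, so three and four decomposition levels give feature vectors of the same sizes as the 11 and 6 coefficients reported above.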
|
2 |
An empirical study of the impact of data dimensionality on the performance of change point detection algorithms / En empirisk studie av data dimensionalitetens påverkan på change point detection algoritmers prestanda
Noharet, Léo, January 2023
When a system is monitored over time, changes can be discovered in the time series of monitored variables. Change Point Detection (CPD) aims at finding the time point where a change occurs in the monitored system. While CPD methods date back to the 1950s with applications in quality control, few studies have been conducted on the impact of data dimensionality on CPD algorithms. This thesis intends to address this gap by examining five different algorithms using synthetic data that incorporates changes in mean, covariance, and frequency across dimensionalities up to 100. Additionally, the algorithms are evaluated on a collection of data sets originating from various domains. The studied methods are then assessed and ranked based on their performance on both synthetic and real data sets, to aid future users in selecting an appropriate CPD method. Finally, stock data from the 30 most traded companies on the Swedish stock market are collected to create a new CPD data set to which the CPD algorithms are applied. The changes of the monitored system that the CPD algorithms aim to detect are the changes in the policy rate set by the Swedish central bank, the Riksbank. The results of the thesis show that the dimensionality impacts the accuracy of the methods when noise is present and when the degree of mean or covariance change is small. Additionally, the application of the algorithms to real-world data sets reveals large differences in performance between the studied methods, underlining the importance of comparison studies. Ultimately, the kernel-based CPD method performed the best across the real-world data employed in the thesis.

When systems are monitored over time, changes can be detected in the time series data of the measured variables. Change Point Detection (CPD) aims to find the point in time at which a change occurs in the time series data of the monitored system. While CPD methods originated in quality control during the 1950s, few studies have examined the impact of data dimensionality on the performance of CPD algorithms. This thesis aims to fill this knowledge gap by examining five different algorithms using synthetic data that incorporates changes in mean, covariance, and frequency across dimensions up to 100. In addition, the algorithms are compared using a collection of data from different domains. The studied methods are then assessed and ranked based on their performance on both synthetic and real data sets, to help future users choose a suitable CPD algorithm. Finally, stock data were collected from the 30 most traded companies on the Swedish stock market to create a new data set. The changes in the monitored system that the CPD algorithms aim to detect are the changes in the policy rate set by the Riksbank. The results of the study indicate that dimensionality affects the ability of the algorithms to detect change points when noise is present in the data and when the degree of change is small. Furthermore, applying the algorithms to the real data reveals large differences in performance between the studied methods, which underlines the importance of comparison studies for revealing these differences. Finally, the kernel-based CPD method performed best.
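To make the task concrete, the sketch below locates a single change in mean in a multivariate series by choosing the split that minimises the total within-segment squared error. This is a simple least-squares baseline chosen for illustration, not necessarily one of the five methods evaluated in the thesis, and the function names and synthetic series are hypothetical.

```python
def segment_cost(rows):
    """Sum of squared deviations from the per-dimension mean of a segment."""
    count = len(rows)
    dims = len(rows[0])
    cost = 0.0
    for d in range(dims):
        mean = sum(row[d] for row in rows) / count
        cost += sum((row[d] - mean) ** 2 for row in rows)
    return cost


def best_single_change_point(series, min_size=2):
    """Index t that best splits `series` into series[:t] and series[t:]."""
    best_index, best_cost = None, float("inf")
    for t in range(min_size, len(series) - min_size + 1):
        cost = segment_cost(series[:t]) + segment_cost(series[t:])
        if cost < best_cost:
            best_index, best_cost = t, cost
    return best_index


def synthetic_mean_shift(length=40, change_at=20, dims=3, shift=1.0):
    """Deterministic toy series: small alternating ripple plus a mean shift."""
    series = []
    for i in range(length):
        base = 0.0 if i < change_at else shift
        ripple = 0.05 * (-1) ** i
        series.append([base + ripple for _ in range(dims)])
    return series


detected = best_single_change_point(synthetic_mean_shift())  # -> 20
```

With a large shift and a small ripple the split is recovered exactly. The findings summarised above concern the harder regime, where noise is present and the change is small relative to it.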
|
3 |
Numerische Methoden zur Analyse hochdimensionaler Daten / Numerical Methods for Analyzing High-Dimensional Data
Heinen, Dennis, 01 July 2014
This dissertation addresses two of the main challenges that arise when processing large data sets: dimensionality reduction and denoising. The first part of this dissertation provides a survey of dimensionality reduction. The aim of dimensionality reduction is a meaningful low-dimensional representation of a given high-dimensional data set. In particular, we discuss and compare established manifold learning methods. The central assumption of manifold learning is that the high-dimensional data set lies (approximately) on a low-dimensional manifold. Noise in the data set is a hindrance for all dimensionality reduction methods.
The second part of this dissertation presents a new denoising method for high-dimensional data: a wavelet shrinkage method for smoothing noisy samples of an underlying multivariate piecewise continuous function, where the sampling points may be scattered. The method is a generalization and further development of the "Easy Path Wavelet Transform" (EPWT), which was introduced for image compression. It is based on a one-dimensional wavelet transform along paths through the sampling points that are to be constructed (adaptively). Suitable adaptive path constructions are essential to the success of the method. This dissertation also includes a brief discussion of the theoretical properties of wavelets along paths as well as numerical results, and concludes with possible modifications of the denoising method.
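The core idea of shrinkage along a path can be sketched in a heavily simplified form: order the samples along a given path, apply one level of a one-dimensional Haar transform, soft-threshold the detail coefficients, and invert. This is not the EPWT or the method developed in the dissertation; the path is assumed to be supplied rather than constructed adaptively, only one decomposition level is used, and all names are hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def haar_shrink_along_path(values, path, threshold):
    """One-level Haar shrinkage of `values` visited in the order `path`.

    `path` is a permutation of sample indices and must have even length.
    """
    if len(path) % 2 != 0:
        raise ValueError("this sketch needs an even number of samples")
    ordered = [values[index] for index in path]
    root2 = math.sqrt(2.0)
    denoised_ordered = []
    for k in range(0, len(ordered), 2):
        approx = (ordered[k] + ordered[k + 1]) / root2
        detail = soft_threshold((ordered[k] - ordered[k + 1]) / root2, threshold)
        # Inverse Haar step for this pair.
        denoised_ordered.append((approx + detail) / root2)
        denoised_ordered.append((approx - detail) / root2)
    # Undo the path ordering so results line up with the input samples.
    result = [0.0] * len(values)
    for position, index in enumerate(path):
        result[index] = denoised_ordered[position]
    return result


# Toy samples: the path visits similar values consecutively.
noisy = [1.0, 5.0, 1.2, 5.2]
path = [0, 2, 1, 3]
smoothed = haar_shrink_along_path(noisy, path, threshold=0.5)
```

Because the path pairs samples with similar values, the detail coefficients are small and are removed by the threshold, which leaves each pair at its mean. The same data visited in index order would pair 1.0 with 5.0 and retain a large detail coefficient, which is why the choice of path matters.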
|