1 |
Spectral Pattern Recognition by a Two-Layer Perceptron: Effects of Training Set Size
Fischer, Manfred M., Staufer-Steinnocher, Petra 10 1900 (has links) (PDF)
Pattern recognition in urban areas is one of the most challenging issues in
classifying satellite remote sensing data. Parametric pixel-by-pixel classification
algorithms tend to perform poorly in this context. This is because urban areas
comprise a complex spatial assemblage of disparate land cover types - including
built structures, numerous vegetation types, bare soil and water bodies. Thus,
there is a need for more powerful spectral pattern recognition techniques,
utilizing pixel-by-pixel spectral information as the basis for automated urban
land cover detection. This paper adopts the multi-layer perceptron classifier
suggested and implemented in [5]. The objective of this study is to analyse the
performance and stability of this classifier - trained and tested for supervised
classification (8 a priori given land use classes) of a Landsat-5 TM image
(270 x 360 pixels) from the city of Vienna and its northern surroundings
- as the training data set is varied in the single-training-site case.
The performance is measured in terms of total classification, map user's and
map producer's accuracies. In addition, the stability with initial parameter
conditions, classification error matrices, and error curves are analysed in some
detail. (authors' abstract) / Series: Discussion Papers of the Institute for Economic Geography and GIScience
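The three accuracy measures named in the abstract can all be read off a classification error matrix. The following is a minimal sketch, not the authors' code: it assumes rows hold the reference classes and columns hold the classes assigned by the classifier, and the function name `accuracy_measures` and the 2 x 2 example values are made up for illustration.

```python
def accuracy_measures(error_matrix):
    """Return (overall, producer's, user's) accuracies.

    Assumed convention: error_matrix[i][j] is the number of pixels whose
    reference class is i and whose assigned (mapped) class is j.
    """
    n = len(error_matrix)
    total = sum(sum(row) for row in error_matrix)
    correct = [error_matrix[i][i] for i in range(n)]
    # Row sums: how many reference pixels each class has.
    reference_totals = [sum(error_matrix[i]) for i in range(n)]
    # Column sums: how many pixels were assigned to each class.
    mapped_totals = [sum(error_matrix[i][j] for i in range(n)) for j in range(n)]

    overall = sum(correct) / total
    producers = [c / t if t else None for c, t in zip(correct, reference_totals)]
    users = [c / t if t else None for c, t in zip(correct, mapped_totals)]
    return overall, producers, users


# Hypothetical two-class error matrix, for illustration only.
example = [[50, 10],
           [5, 35]]
overall, producers, users = accuracy_measures(example)
print(overall, producers, users)
```

Producer's accuracy divides by the reference total of a class (how much of the class was found), while user's accuracy divides by the mapped total (how much of what the map shows is right), so the two can differ considerably for the same class.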
|
2 |
Improvement Of Land Cover Classification With The Integration Of Topographical Data In Uneven Terrain
Gercek, Deniz 01 November 2002 (has links) (PDF)
The aim of this study is to develop a framework for integrating ancillary topographic information into supervised image classification in order to improve the accuracy of the classification product. Topographic data are integrated mainly by modifying the training set so that it provides additional sensitivity to the topographical characteristics associated with each land cover class in the study area. The remotely sensed data used in the study are multi-spectral Landsat 7 ETM bands (30 x 30 meter). The ancillary topographic data are elevation, slope and aspect derived from 1/25000 scale topographic map contours. A five-phase methodological framework is proposed for integrating topographical data into a standard image classification task. Briefly, the first phase is the selection of initial class spectral signatures. The second phase is the analysis of the information content of the class spectral signatures and the topographical data for a potential relationship, and the quantification of the related topographical data. The third phase is the selection of class topographical signatures from the related topographical data. The fourth phase is the redefinition of two training sets, one of which includes spectral information only while the other includes both spectral and topographical information. The last phase is classification. Two products were derived: the first used the bands as input and was trained with spectral information only; the second used both the bands and the topographical data as input and was trained with both spectral and topographical information. The method was applied to an image and associated ancillary topographical data covering rural lands in Ankara, mainly composed of agricultural areas and rangelands. Integrating topographical data improved overall accuracy by 10% compared with classification that depended only on spectral data from remotely sensed images.
|
3 |
Robustness Studies and Training Set Analysis for HIDS
Helmrich, Daniel 09 September 2024 (has links)
To enhance the protection against cyberattacks, significant research is directed towards anomaly-based host intrusion detection systems (HIDS), which appear particularly suited for detecting zero-day attacks. This thesis addresses two problems in HIDS training sets that are often neglected in other publications: unclean and incomplete data. First, using the Leipzig Intrusion Detection - Data Set (LID-DS), a methodology to measure HIDS robustness against contaminated training data is presented. Furthermore, three baseline HIDS approaches (STIDE, SCG, and SOM) are evaluated, and robustness improvements are proposed for them. The results indicate that the baselines are not robust if test and training data share identical attacks. However, the suggested modifications, particularly the removal of anomalous threads from the training set, can enhance robustness significantly. For the problem of incomplete training data, the thesis leverages machine learning models to predict a training set’s suitability, quantified by either data drift measures or the STIDE performance. The thesis then presents rules, extracted from the best models, for assessing the suitability of new training data. Given the practical significance of both issues, which the results underline for contaminated training data, further research is essential. This involves examining the robustness of other HIDS algorithms, refining the proposed robustness improvements, and validating the suitability rules on other datasets, preferably real-world data.
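Why a contaminated training set hurts an n-gram detector can be seen in a few lines. The sketch below is a simplified STIDE-style detector and not the thesis implementation: it scores a trace by the fraction of unseen system-call n-grams rather than using a locality frame, and the function names, traces and `n=2` are assumptions chosen for illustration.

```python
def ngrams(trace, n):
    """All contiguous windows of length n over a system-call trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def train(traces, n):
    """The 'normal' model is simply the set of n-grams seen in training."""
    normal = set()
    for trace in traces:
        normal.update(ngrams(trace, n))
    return normal


def anomaly_score(trace, normal, n):
    """Fraction of the trace's n-grams that were never seen in training."""
    windows = ngrams(trace, n)
    if not windows:
        return 0.0
    unseen = sum(1 for window in windows if window not in normal)
    return unseen / len(windows)


# Hypothetical traces, for illustration only.
benign = ["open", "read", "write", "close"]
attack = ["open", "exec", "socket", "close"]

clean_model = train([benign], n=2)
contaminated_model = train([benign, attack], n=2)

score_clean = anomaly_score(attack, clean_model, n=2)
score_contaminated = anomaly_score(attack, contaminated_model, n=2)
print(score_clean, score_contaminated)
```

Once the attack trace is part of the training data, every one of its n-grams counts as normal and the same attack is no longer flagged at test time, which matches the situation the abstract describes where test and training data share identical attacks.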
|
4 |
Adaptive Training Set Formation / Adaptyvus mokymo imties formavimas
Žliobaitė, Indrė 16 April 2010 (has links)
Nowadays, when the environment is changing rapidly and dynamically, there is a particular need for adaptive data mining methods. Spam filters, personalized recommender and marketing systems, network intrusion detection systems, and business prediction and decision support systems need to be retrained regularly to take into account the changing nature of the data. In stationary settings, the more data is at hand, the more accurate a model can be trained. In a changing environment, old data decreases the accuracy. In such a case only a subset of the historical data might be selected to form a training set. For instance, the training window strategy uses only the newest historical instances. The thesis addresses adaptive data mining methods that are based on selective training set formation. It improves the training strategies under sudden, gradual and recurring concept drifts. Four adaptive training set formation algorithms are developed and experimentally validated; they increase the generalization performance of the base models under each of the three concept drift types. Experimental evaluation using generated and real data confirms improved classification and prediction accuracies compared with using all the historical data, as well as compared with selected existing adaptive learning algorithms from the recent literature. A tailored method for an industrial boiler application, which unifies several drift types, is developed.
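The training window strategy mentioned in the abstract can be sketched in a few lines. This is only the fixed-window baseline, not one of the four adaptive algorithms developed in the thesis; the class name `WindowedMeanPredictor`, the window size of 5 and the synthetic stream are assumptions made for illustration.

```python
from collections import deque


class WindowedMeanPredictor:
    """Predicts the mean of the newest `window_size` observations only."""

    def __init__(self, window_size):
        # A deque with maxlen silently drops the oldest item when full.
        self.window = deque(maxlen=window_size)

    def update(self, value):
        self.window.append(value)

    def predict(self):
        return sum(self.window) / len(self.window)


# Hypothetical stream with a sudden concept drift from 1.0 to 5.0.
stream = [1.0] * 20 + [5.0] * 5

windowed = WindowedMeanPredictor(window_size=5)
for value in stream:
    windowed.update(value)

windowed_prediction = windowed.predict()
full_history_prediction = sum(stream) / len(stream)
print(windowed_prediction, full_history_prediction)
```

After the drift, the windowed predictor has forgotten the old concept, while the predictor trained on all historical data is still pulled towards it. The price of a fixed window is that it also discards useful data when nothing has changed, which is the motivation for forming the training set adaptively.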
|
5 |
Adaptyvus mokymo imties formavimas / Adaptive Training Set Formation
Žliobaitė, Indrė 16 April 2010 (has links)
Nowadays, when the environment is changing rapidly and dynamically, there is a particular need for adaptive data mining methods. Spam filters, personalized recommender and marketing systems, network intrusion detection systems, and business prediction and decision support systems need to be retrained regularly to take into account the changing nature of the data. In stationary settings, the more data is at hand, the more accurate a model can be trained. In a changing environment, old data decreases the accuracy. In such a case only a subset of the historical data might be selected to form a training set. For instance, the training window strategy uses only the newest historical instances. The thesis addresses adaptive data mining methods that are based on selective training set formation. It improves the training strategies under sudden, gradual and recurring concept drifts. Four adaptive training set formation algorithms are developed and experimentally validated; they increase the generalization performance of the base models under each of the three concept drift types. Experimental evaluation using generated and real data confirms improved classification and prediction accuracies compared with using all the historical data, as well as compared with selected existing adaptive learning algorithms from the recent literature. A tailored method for an industrial boiler application, which unifies several drift types, is developed.
|
6 |
A Diatom Phosphorus Inference Model for 30 Freshwater Lakes in NE Ohio and NW Pennsylvania
Scotese, Kyle C. January 2008 (has links)
No description available.
|
7 |
Strategy for construction of polymerized volume data sets
Aragonda, Prathyusha 12 April 2006 (has links)
This thesis develops a strategy for polymerized volume data set construction.
Given a volume data set defined over a regular three-dimensional grid, a polymerized
volume data set (PVDS) can be defined as follows: edges between adjacent vertices of
the grid are labeled 1 (active) or 0 (inactive) to indicate the likelihood that an edge is
contained in (or spans the boundary of) a common underlying object, adding information
not in the original volume data set. This edge labeling polymerizes adjacent voxels
(those sharing a common active edge) into connected components, facilitating
segmentation of embedded objects in the volume data set. Polymerization of the volume
data set also aids real-time data compression, geometric modeling of the embedded
objects, and their visualization.
To construct a polymerized volume data set, an adjacency class within the grid
system is selected. Edges belonging to this adjacency class are labeled as interior,
exterior, or boundary edges using discriminant functions whose functional forms are
derived for three local adjacency classes. The discriminant function parameter values are
determined by supervised learning. Training sets are derived from an initial
segmentation on a homogeneous sample of the volume data set, using an existing
segmentation method.
The strategy of constructing polymerized volume data sets is initially tested on
synthetic data sets which resemble neuronal volume data obtained by three-dimensional
microscopy. The strategy is then illustrated on volume data sets of mouse brain
microstructure at a neuronal level of detail. Visualization and validation of the resulting
PVDS are shown in both cases. Finally, the procedures of polymerized volume data set construction are
generalized to apply to any Bravais lattice over the regular 3D orthogonal grid. Further
development of this latter topic is left to future work.
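The edge-labeling idea can be illustrated on a tiny grid. The sketch below is not the thesis method: in place of the learned discriminant functions it uses a hypothetical intensity-difference threshold to mark an edge active (1) or inactive (0), it considers only the face-adjacency class, and the function name `polymerize` and the example volume are made up for illustration.

```python
def polymerize(volume, threshold):
    """Label face-adjacent edges and group voxels joined by active edges.

    volume[z][y][x] holds a scalar intensity. An edge is labeled 1 (active)
    when the two voxel intensities differ by at most `threshold`; this rule
    is an assumed stand-in for a trained discriminant function.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    parent = list(range(nz * ny * nx))  # union-find forest over voxels

    def index(z, y, x):
        return (z * ny + y) * nx + x

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = {}
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                for dz, dy, dx in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
                    z2, y2, x2 = z + dz, y + dy, x + dx
                    if z2 >= nz or y2 >= ny or x2 >= nx:
                        continue
                    difference = abs(volume[z][y][x] - volume[z2][y2][x2])
                    active = 1 if difference <= threshold else 0
                    edges[((z, y, x), (z2, y2, x2))] = active
                    if active:
                        a, b = find(index(z, y, x)), find(index(z2, y2, x2))
                        if a != b:
                            parent[a] = b

    # Component label of each voxel = root of its union-find tree.
    labels = [[[find(index(z, y, x)) for x in range(nx)]
               for y in range(ny)] for z in range(nz)]
    return edges, labels


# Hypothetical 1 x 2 x 3 volume with a dark and a bright region.
volume = [[[0, 0, 9],
           [0, 9, 9]]]
edges, labels = polymerize(volume, threshold=1)
components = {label for plane in labels for row in plane for label in row}
print(sum(edges.values()), len(edges), len(components))
```

Voxels that share an active edge end up with the same label, so the connected components give a first segmentation of the embedded objects directly from the edge labels.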
|
8 |
Interaktivní segmentace 3D CT dat s využitím hlubokého učení / Interactive 3D CT Data Segmentation Based on Deep Learning
Trávníčková, Kateřina January 2020 (has links)
This thesis deals with CT data segmentation using convolutional neural networks and describes the problem of training with limited training sets. User interaction is suggested as a means of improving segmentation quality for models trained on small training sets, and the possibility of using transfer learning is also considered. All of the chosen methods improve segmentation quality in comparison with the baseline method, which is an automatic, data-specific segmentation model. When training with very small datasets, the Dice score improved by tens of percent. These methods can be used, for example, to simplify the creation of a new segmentation dataset.
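For readers unfamiliar with the Dice score used to report the improvement, a minimal sketch follows. It assumes binary masks flattened to sequences of 0/1 values and treats two empty masks as a perfect match; the function name `dice_score` and the example masks are illustrative and not taken from the thesis.

```python
def dice_score(predicted, reference):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    predicted, reference = list(predicted), list(reference)
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    size = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if size == 0 else 2 * intersection / size


# Hypothetical flattened masks, for illustration only.
predicted_mask = [1, 1, 0, 0]
reference_mask = [1, 0, 1, 0]
score = dice_score(predicted_mask, reference_mask)
print(score)
```

The score ranges from 0 (no overlap) to 1 (identical masks).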
|