1 |
Optimalizace tvorby trénovacího a validačního datasetu pro zvýšení přesnosti klasifikace v dálkovém průzkumu Země / Training and validation dataset optimization for Earth observation classification accuracy improvement
Potočná, Barbora January 2019
This thesis deals with the optimization of training and validation datasets to improve classification accuracy in Earth observation. Experiments with training and validation data for two classification algorithms (Maximum Likelihood, MLC, and Support Vector Machine, SVM) are carried out on a forest-meadow landscape located in the foothills of the Giant Mountains (Podkrkonoší). The thesis is based on the assumption that 1/3 of training data and 2/3 of validation data is an ideal ratio to achieve maximal classification accuracy (Foody, 2009). Another hypothesis was that, in the case of SVM classification, a lower number of training points is required to achieve the same or similar classification accuracy as with the MLC algorithm (Foody, 2004). The main goal of the thesis was to test the influence of the proportion and amount of training and validation data on the classification accuracy of Sentinel-2A multispectral data using the MLC algorithm. The highest overall accuracy using the MLC classification algorithm was achieved for 375 training and 625 validation points. The overall accuracy for this ratio was 72.88%. The theory of Foody (2009) that 1/3 of training data and 2/3 of validation data is an ideal ratio to achieve the highest classification accuracy was confirmed by the overall accuracy and...
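The split-and-score procedure the abstract describes can be illustrated with a minimal sketch. It assumes that the reference points are split at random and that overall accuracy is the share of validation points classified correctly, which is the usual definition. The function names, the point IDs and the seed are hypothetical, and the total of 1000 points is simply the sum of the 375 and 625 reported above.

```python
import random


def split_points(points, n_train, seed=0):
    """Randomly split reference points into training and validation subsets."""
    if not 0 < n_train < len(points):
        raise ValueError("n_train must leave at least one point on each side")
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(points)     # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def overall_accuracy(reference, predicted):
    """Fraction of validation points whose predicted class matches the reference."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)


# Hypothetical point IDs; 375 + 625 = 1000 matches the counts in the abstract.
points = list(range(1000))
train, val = split_points(points, n_train=375)

# Toy labels only, to show how the score is computed (not data from the thesis).
reference = ["forest", "meadow", "forest", "meadow"]
predicted = ["forest", "meadow", "meadow", "meadow"]
score = overall_accuracy(reference, predicted)
```

Repeating the split for several values of `n_train` and comparing the resulting scores is one way to reproduce the kind of ratio experiment described, although the abstract does not state how the thesis drew its samples.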
|
2 |
Learning Algorithms Using Chance-Constrained Programs
Jagarlapudi, Saketha Nath 07 1900
This thesis explores Chance-Constrained Programming (CCP) in the context of learning. It is shown that chance-constraint approaches lead to improved algorithms for three important learning problems — classification with specified error rates, large dataset classification and Ordinal Regression (OR). Using moments of training data, the CCPs are posed as Second Order Cone Programs (SOCPs). Novel iterative algorithms for solving the resulting SOCPs are also derived. Borrowing ideas from robust optimization theory, the proposed formulations are made robust to moment estimation errors.
A maximum margin classifier with specified false positive and false negative rates is derived. The key idea is to employ chance-constraints for each class, which imply that the actual misclassification rates do not exceed the specified rates. The formulation is applied to the case of biased classification.
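The abstract does not give the formulation itself, so the following is only a sketch of how a per-class chance constraint is commonly turned into a second order cone constraint. The notation is assumed here and is not taken from the thesis: $w, b$ define the hyperplane, $X_k$ is a random vector for class $k$ with mean $\mu_k$ and covariance $\Sigma_k$, and $\eta_k \in (0,1)$ is the required probability of correct classification, so the specified error rate is $1-\eta_k$.

```latex
% Chance-constrained maximum margin problem (assumed notation)
\begin{align*}
\min_{w,\,b}\quad & \tfrac{1}{2}\,\lVert w \rVert_2^2 \\
\text{s.t.}\quad  & \Pr\!\left(w^\top X_1 - b \ge 1\right) \ge \eta_1, \\
                  & \Pr\!\left(w^\top X_2 - b \le -1\right) \ge \eta_2 .
\end{align*}

% By the multivariate Chebyshev (Chebyshev-Cantelli) inequality, each chance
% constraint holds for every distribution with the given mean and covariance
% whenever the corresponding cone constraint holds:
\begin{align*}
w^\top \mu_1 - b &\ge 1 + \kappa(\eta_1)\,\sqrt{w^\top \Sigma_1 w}, \\
b - w^\top \mu_2 &\ge 1 + \kappa(\eta_2)\,\sqrt{w^\top \Sigma_2 w},
\qquad \kappa(\eta) = \sqrt{\frac{\eta}{1-\eta}} .
\end{align*}
```

Since $\sqrt{w^\top \Sigma_k w} = \lVert \Sigma_k^{1/2} w \rVert_2$, each constraint is a second order cone constraint in $(w, b)$ that depends on the data only through its first and second moments.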
The problems of large dataset classification and ordinal regression are addressed by deriving formulations which employ chance-constraints for clusters in training data rather than constraints for each data point. Since the number of clusters can be substantially smaller than the number of data points, the resulting formulation size and number of inequalities are very small. Hence the formulations scale well to large datasets.
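To make the cluster-based idea concrete, here is a minimal sketch in plain Python. It assumes the standard Chebyshev-based cone constraint, label * (w·mu - b) >= 1 + kappa * sqrt(w' Sigma w) with kappa = sqrt(eta / (1 - eta)), applied once per cluster. This form, the function names and the toy cluster are assumptions for illustration; the abstract does not spell out the thesis's exact formulation.

```python
import math


def mean_and_cov(rows):
    """Mean vector and covariance matrix (divisor n) of one cluster."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / n
            for j in range(d)] for i in range(d)]
    return mu, cov


def kappa(eta):
    """Chebyshev multiplier for a required probability eta in (0, 1)."""
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    return math.sqrt(eta / (1.0 - eta))


def cluster_constraint_holds(w, b, label, mu, cov, eta):
    """Check label * (w.mu - b) >= 1 + kappa(eta) * sqrt(w' cov w)."""
    d = len(w)
    margin = label * (sum(w[j] * mu[j] for j in range(d)) - b)
    quad = sum(w[i] * cov[i][j] * w[j] for i in range(d) for j in range(d))
    return margin >= 1.0 + kappa(eta) * math.sqrt(max(quad, 0.0))


# Toy positive-class cluster (hypothetical data, not from the thesis).
cluster = [[2.0, 0.0], [4.0, 0.0], [3.0, 1.0], [3.0, -1.0]]
mu, cov = mean_and_cov(cluster)

w, b = [1.0, 0.0], 0.0
loose = cluster_constraint_holds(w, b, +1, mu, cov, eta=0.5)   # kappa = 1
strict = cluster_constraint_holds(w, b, +1, mu, cov, eta=0.9)  # kappa = 3
```

A solver would impose one such constraint per cluster rather than one per data point, which is why the problem size depends on the number of clusters. The check above only evaluates a given (w, b) and is not a solver.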
The scalable classification and OR formulations are extended to feature spaces, and the kernelized duals turn out to be instances of SOCPs with a single cone constraint. Exploiting this special structure, fast iterative solvers that outperform generic SOCP solvers are proposed. Compared to state-of-the-art learners, the proposed algorithms achieve a speedup of up to 10,000 times when the specialized SOCP solvers are employed.
The proposed formulations involve second order moments of the data and hence are susceptible to moment estimation errors. A generic way of making the formulations robust to such estimation errors is illustrated. Two novel confidence sets for moments are derived, and it is shown that when either of the confidence sets is employed, the robust formulations also yield SOCPs.
|