Global ETD Search

1	Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria. TAKEDA, Kazuya; KITAOKA, Norihide; SAKAI, Makoto. 01 July 2010. No description available. Bayes error; dimensionality reduction; speech recognition
2	A New Measure of Classifiability and its Applications. Dong, Ming. 08 November 2001. No description available. pattern recognition and classification; classifiability; Bayes Error; decision tree; feature subset selection
3	Bayes Optimal Feature Selection for Supervised Learning. Saneem Ahmed, C G. January 2014. The problem of feature selection is critical in several areas of machine learning and data analysis, such as cancer classification using gene expression data and text categorization. In this work, we consider feature selection for supervised learning problems, where one wishes to select a small set of features that facilitates learning a good prediction model in the reduced feature space. Our interest is primarily in filter methods, which select features independently of the learning algorithm to be used and are generally faster to run than other types of feature selection algorithms. Many common filter methods for feature selection use information-theoretic criteria, such as those based on mutual information, to guide their search process. However, even in simple binary classification problems, mutual-information-based methods do not always select the best set of features in terms of the Bayes error. In this thesis, we develop a general approach for selecting a set of features that directly aims to minimize the Bayes error in the reduced feature space with respect to the loss or performance measure of interest. We show that the mutual-information-based criterion is a special case of our setting when the loss function of interest is the logarithmic loss for class probability estimation. We give a greedy forward algorithm for approximately optimizing this criterion and demonstrate its application to several supervised learning problems, including binary classification (with 0-1 error, cost-sensitive error, and F-measure), binary class probability estimation (with logarithmic loss), bipartite ranking (with pairwise disagreement loss), and multiclass classification (with multiclass 0-1 error). Our experiments suggest that the proposed approach is competitive with several state-of-the-art methods.
Data Analysis; Logarithms; Supervised Learning; Bayes Optimality; Binary Classification; Bipartite Ranking; Multiclass Classification; Bayes Optimal Feature Selection; Optimal Feature Selection; Bayes Error; Binary Class Probability Estimation; Supervised Learning Problems; Computer Science
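The greedy forward idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes discrete features, handles only the 0-1 loss, and uses a simple plug-in (majority-vote) estimate of the Bayes error on the training sample. The function names `plugin_bayes_error` and `greedy_forward_select` are hypothetical.

```python
from collections import Counter, defaultdict


def plugin_bayes_error(rows, labels, feature_idx):
    """Plug-in estimate of the 0-1 Bayes error using only the chosen features.

    Assumption: features are discrete, so samples can be grouped into cells
    by their values on the selected features.
    """
    counts = defaultdict(Counter)
    for row, y in zip(rows, labels):
        key = tuple(row[j] for j in feature_idx)
        counts[key][y] += 1
    # Within each cell the best prediction is the majority label;
    # every other sample in the cell counts as an error.
    errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
    return errors / len(labels)


def greedy_forward_select(rows, labels, k):
    """Add, one at a time, the feature that most reduces the estimated error."""
    selected = []
    remaining = list(range(len(rows[0])))
    for _ in range(k):
        best = min(
            remaining,
            key=lambda j: plugin_bayes_error(rows, labels, selected + [j]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (illustrative only): the label mostly follows feature 0.
rows = [
    (0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 0),
    (0, 0, 1), (1, 0, 0), (0, 1, 1), (1, 1, 1),
]
labels = [0, 0, 1, 1, 0, 1, 1, 1]
first_feature = greedy_forward_select(rows, labels, 1)
```

On small samples a plug-in estimate of this kind is optimistic once many features are selected, because each cell holds few points; a practical version would need smoothing or held-out data.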
4	Random parameters in learning: advantages and guarantees. Evzenie Coupkova (18396918). 22 April 2024. The generalization error of a classifier is related to the complexity of the set of functions among which the classifier is chosen. We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature. The feature is obtained by projecting the data on a random line after embedding it into a higher-dimensional space parametrized by monomials of order up to k. More specifically, the extended data is projected n times and the best classifier among those n, based on its performance on training data, is chosen.
We show that this type of classifier is extremely flexible, as it is likely to approximate, to an arbitrary precision, any continuous function on a compact set as well as any Boolean function on a compact set that splits the support into measurable subsets. In particular, given full knowledge of the class-conditional densities, the error of these low-complexity classifiers would converge to the optimal (Bayes) error as k and n go to infinity. On the other hand, if only a training dataset is given, we show that the classifiers will perfectly classify all the training points as k and n go to infinity.
We also bound the generalization error of our random classifiers. In general, our bounds are better than those for any classifier with VC dimension greater than O(ln(n)). In particular, our bounds imply that, unless the number of projections n is extremely large, there is a significant advantageous gap between the generalization error of the random projection approach and that of a linear classifier in the extended space. Asymptotically, as the number of samples approaches infinity, the gap persists for any such n. Thus, there is a potentially large gain in generalization properties by selecting parameters at random, rather than by optimization.
Given a classification problem and a family of classifiers, the Rashomon ratio measures the proportion of classifiers that yield less than a given loss. Previous work has explored the advantage of a large Rashomon ratio in the case of a finite family of classifiers. Here we consider the more general case of an infinite family. We show that a large Rashomon ratio guarantees that choosing the classifier with the best empirical accuracy among a random subset of the family, which is likely to improve generalizability, will not increase the empirical loss too much.
We quantify the Rashomon ratio in two examples involving infinite classifier families in order to illustrate situations in which it is large. In the first example, we estimate the Rashomon ratio of the classification of normally distributed classes using an affine classifier. In the second, we obtain a lower bound for the Rashomon ratio of a classification problem with a modified Gram matrix when the classifier family consists of two-layer ReLU neural networks. In general, we show that the Rashomon ratio can be estimated using a training dataset along with random samples from the classifier family, and we provide guarantees that such an estimation is close to the true value of the Rashomon ratio.
Pattern recognition; Deep learning; Probability theory; Statistical data science; random classifiers; rashomon ratio; approximation error; generalization error; classifier complexity; sample complexity; random projection; neural network; Polynomial expansion; modified Gram matrix; epsilon-cover; chaining; covering number; Bayes error; reducible error; epsilon-net; random parameters; generalizability; classification; growth function; VC dimension; projection of the data; labels; training error; test error
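The random thresholded-projection classifier described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's construction: the projection directions are drawn with independent standard Gaussian coordinates, the monomial embedding omits the constant term, and all function names are hypothetical.

```python
import itertools
import random


def monomial_features(x, k):
    """All monomials of the coordinates of x with total degree 1..k."""
    feats = []
    for degree in range(1, k + 1):
        for combo in itertools.combinations_with_replacement(range(len(x)), degree):
            value = 1.0
            for i in combo:
                value *= x[i]
            feats.append(value)
    return feats


def best_threshold(values, labels):
    """Best 1-D threshold rule by training error; returns (errors, t, sign)."""
    best = (len(labels) + 1, 0.0, 1)
    candidates = sorted(set(values))
    # One cut below all points, plus midpoints between distinct neighbours.
    cuts = [candidates[0] - 1.0] + [
        (a + b) / 2 for a, b in zip(candidates, candidates[1:])
    ]
    for t in cuts:
        for sign in (1, -1):
            errors = sum(
                (1 if sign * (v - t) > 0 else 0) != y
                for v, y in zip(values, labels)
            )
            if errors < best[0]:
                best = (errors, t, sign)
    return best


def fit_random_threshold_classifier(X, y, k, n, seed=0):
    """Try n random directions in the monomial space; keep the best by training error."""
    rng = random.Random(seed)
    Z = [monomial_features(x, k) for x in X]
    dim = len(Z[0])
    best = None
    for _ in range(n):
        # Assumption: Gaussian coordinates give a random direction.
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        proj = [sum(wi * zi for wi, zi in zip(w, z)) for z in Z]
        errors, t, sign = best_threshold(proj, y)
        if best is None or errors < best[0]:
            best = (errors, w, t, sign)
    errors, w, t, sign = best

    def predict(x):
        z = monomial_features(x, k)
        score = sum(wi * zi for wi, zi in zip(w, z))
        return 1 if sign * (score - t) > 0 else 0

    return predict, errors / len(y)


# Toy data (illustrative only): label 1 outside the unit circle.
X = [(a / 2.0, b / 2.0) for a in range(-3, 4) for b in range(-3, 4)]
y = [1 if u * u + v * v > 1.0 else 0 for u, v in X]
predict, train_error = fit_random_threshold_classifier(X, y, k=2, n=50, seed=0)
```

Only the direction is random; the threshold and orientation are still fitted on the training data, which matches the abstract's description of choosing the best of the n classifiers by training performance.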
