Classification in machine learning has been extensively studied during the pastdecades. Many solutions have been proposed to output accurate classifiers and toobtain statistical grantees on the unseen observations. However, when machinelearning algorithms meet concrete industrial or scientific applications, new computationalcriteria appear to be as important to satisfy as those of classificationaccuracy. In particular, when the output classifier must comply with a computationalbudget needed to obtain the features that are evaluated at test time, wetalk about "budgeted" learning. The features can have different acquisition costsand, often, the most discriminative features are the costlier. Medical diagnosis andweb-page ranking, for instance, are typical applications of budgeted learning. Inthe former, the goal is to limit the number of medical tests evaluate for patients,and in the latter, the ranker has limited time to order documents before the usergoes away.This thesis introduces a new way of tackling classification in general and budgetedlearning problems in particular, through a novel framework lying in theintersection of supervised learning and decision theory. We cast the classificationproblem as a sequential decision making procedure and show that this frameworkyields fast and accurate classifiers. Unlike classical classification algorithms thatoutput a "one-shot" answer, we show that considering the classification as a seriesof small steps wherein the information is gathered sequentially also providesa flexible framework that allows to accommodate different types of budget constraintsin a "natural" way. In particular, we apply our method to a novel type ofbudgeted learning problems motivated by particle physics experiments. The particularityof this problem lies in atypical budget constraints and complex cost calculationschemata where the calculation of the different features depends on manyfactors. We also review similar sequential approaches that have recently known aparticular interest and provide a global perspective on this new paradigm.
Identifer | oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00990245 |
Date | 20 February 2014 |
Creators | Benbouzid, Djalel |
Publisher | Université Paris Sud - Paris XI |
Source Sets | CCSD theses-EN-ligne, France |
Language | English |
Detected Language | English |
Type | PhD thesis |
Page generated in 0.0024 seconds