Global ETD Search

Return to search

Forêts Aléatoires PAC-Bayésiennes

Dans ce mémoire de maîtrise, nous présentons dans un premier temps un algorithme
de l'état de l'art appelé Forêts aléatoires introduit par Léo Breiman. Cet algorithme
effectue un vote de majorité uniforme d'arbres de décision construits en utilisant l'algorithme
CART sans élagage. Par après, nous introduisons l'algorithme que nous avons
nommé SORF. L'algorithme SORF s'inspire de l'approche PAC-Bayes, qui pour minimiser
le risque du classificateur de Bayes, minimise le risque du classificateur de Gibbs
avec un régularisateur. Le risque du classificateur de Gibbs constitue en effet, une fonction
convexe bornant supérieurement le risque du classificateur de Bayes. Pour chercher
la distribution qui pourrait être optimale, l'algorithme SORF se réduit à être un simple
programme quadratique minimisant le risque quadratique de Gibbs pour chercher une
distribution Q sur les classificateurs de base qui sont des arbres de la forêt. Les résultasts
empiriques montrent que généralement SORF est presqu'aussi bien performant
que les forêts aléatoires, et que dans certains cas, il peut même mieux performer que
les forêts aléatoires. / In this master's thesis, we present at first an algorithm of the state of the art called Random
Forests introduced by Léo Breiman. This algorithm construct a uniformly weighted
majority vote of decision trees built using the CART algorithm without pruning. Thereafter,
we introduce an algorithm that we called SORF. The SORF algorithm is based
on the PAC-Bayes approach, which in order to minimize the risk of Bayes classifier,
minimizes the risk of the Gibbs classifier with a regularizer. The risk of Gibbs classifier
is indeed a convex function which is an upper bound of the risk of Bayes classifier. To
find the distribution that would be optimal, the SORF algorithm is reduced to being
a simple quadratic program minimizing the quadratic risk of Gibbs classifier to seek a
distribution Q of base classifiers which are trees of the forest. Empirical results show
that generally SORF is almost as efficient as Random forests, and in some cases, it can
even outperform Random forests.

Informatique

Identifer	oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QQLA.2013/29815
Date	03 1900
Creators	Zirakiza, Brice
Contributors	Marchand, Mario, Laviolette, François
Publisher	Université Laval
Source Sets	Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language	French
Detected Language	English
Type	Electronic Thesis or Dissertation
Format	application/pdf
Rights	© Brice Zirakiza, 2013

Page generated in 0.0015 seconds

Forêts Aléatoires PAC-Bayésiennes

Description

Links & Downloads

Tags

Additional Fields