1 |
On the bias-variance tradeoff: textbooks need an update
Neal, Brayden 12 1900 (has links)
The main goal of this thesis is to point out that the bias-variance tradeoff does not always
hold (e.g. in neural networks). We advocate for this lack of universality to be acknowledged
in textbooks and taught in introductory courses that cover the tradeoff.
We first review the history of the bias-variance tradeoff, its prevalence in textbooks,
and some of the main claims made about the bias-variance tradeoff. Through extensive
experiments and analysis, we show a lack of a bias-variance tradeoff in neural networks
when increasing network width. Our findings seem to contradict the claims of the landmark
work by Geman et al. (1992). Motivated by this contradiction, we revisit the experimental
measurements in Geman et al. (1992). We argue that there was never strong evidence
for a tradeoff in neural networks when varying the number of parameters. We observe a
similar phenomenon beyond supervised learning, with a set of deep reinforcement learning
experiments.
We argue that textbook and lecture revisions are in order to convey this nuanced modern
understanding of the bias-variance tradeoff.
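
For readers unfamiliar with how the two terms are measured, the following is a minimal sketch, not the thesis's experimental setup. It assumes a toy 1-D regression problem and uses k-nearest-neighbour regression as a stand-in model, where the thesis studies neural networks of varying width. The target function, noise level, and all function names are illustrative assumptions. Squared bias and variance are estimated by refitting the model on many independently drawn training sets and comparing predictions at fixed test points.

```python
import math
import random


def true_fn(x):
    # Assumed ground-truth function for the toy problem.
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n, noise_sd):
    # Draw n inputs uniformly on [0, 1) with Gaussian label noise.
    xs = [rng.random() for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def knn_predict(xs, ys, x0, k):
    # Average the labels of the k training points closest to x0.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def estimate_bias_variance(k, n_train=50, n_repeats=200, noise_sd=0.3, seed=0):
    # Refit on n_repeats independent training sets and record predictions
    # at a fixed grid of test points.
    rng = random.Random(seed)
    test_xs = [i / 20 for i in range(1, 20)]
    preds = [[] for _ in test_xs]
    for _ in range(n_repeats):
        xs, ys = sample_training_set(rng, n_train, noise_sd)
        for j, x0 in enumerate(test_xs):
            preds[j].append(knn_predict(xs, ys, x0, k))

    bias_sq = 0.0
    variance = 0.0
    for j, x0 in enumerate(test_xs):
        mean_pred = sum(preds[j]) / n_repeats
        # Squared bias: gap between the average prediction and the truth.
        bias_sq += (mean_pred - true_fn(x0)) ** 2
        # Variance: spread of predictions across training sets.
        variance += sum((p - mean_pred) ** 2 for p in preds[j]) / n_repeats
    m = len(test_xs)
    return bias_sq / m, variance / m


if __name__ == "__main__":
    for k in (1, 5, 25):
        b, v = estimate_bias_variance(k)
        print(f"k={k:2d}  bias^2={b:.4f}  variance={v:.4f}")
```

In this toy setting a smaller k (a more flexible model) lowers bias and raises variance, which is the classical picture. The thesis's claim is that this pattern does not carry over to neural networks as width increases.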
|
2 |
Toward a Theory of Auto-modeling
Yiran Jiang (16632711) 25 July 2023 (has links)
<p>Statistical modeling aims to construct a mathematical model for an existing data set. As a broad concept, it gives rise to a wide range of interesting problems. Modern parametric models, such as deep neural networks, have achieved remarkable success in a number of application areas with massive data. Although powerful in practice, many fitted over-parameterized models may lose good statistical properties. For this reason, a new framework named the Auto-modeling (AM) framework is proposed. Philosophically, the idea is to fit models to future observations rather than to the observed sample. Technically, after choosing an imputation model for generating future observations, we fit models to future observations by optimizing an approximation to the desired expected loss function based on its sample counterpart and what we call an adaptive <em>duality function</em>.</p>
<p>The first part of the dissertation (Chapters 2 to 7) focuses on the new philosophical perspective of the method and on the details of the main framework. Technical details, including essential theoretical properties of the method, are also investigated. We also demonstrate the superior performance of the proposed method via three applications: the many-normal-means problem, $n < p$ linear regression, and image classification.</p>
<p>The second part of the dissertation (Chapter 8) focuses on the application of the AM framework to the construction of linear regression models. Our primary objective is to shed light on the stability issue associated with the commonly used data-driven model selection methods such as cross-validation (CV). Furthermore, we highlight the philosophical distinctions between CV and AM. Theoretical properties and numerical examples presented in the study demonstrate the potential and promise of AM-based linear model selection. Additionally, we have devised a conformal prediction method specifically tailored for quantifying the uncertainty of AM predictions in the context of linear regression.</p>
|