Global ETD Search

Understanding neural network sample complexity and interpretable convergence-guaranteed deep learning with polynomial regression

Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, May, 2020 / Cataloged from PDF version of thesis. / Includes bibliographical references (pages 83-89). / We first study the sample complexity of one-layer neural networks, namely the number of examples needed in the training set for such models to learn meaningful information out-of-sample. We empirically derive quantitative relationships between the sample complexity and the parameters of the network, such as its input dimension and its width. Then, we introduce polynomial regression as a proxy for neural networks through a polynomial approximation of their activation function. This method operates in the lifted space of tensor products of input variables, and is trained by simply optimizing a standard least squares objective in this space. We study the scalability of polynomial regression, and design a bagging-type algorithm to successfully train it. The method achieves competitive accuracy on simple image datasets while being simpler. We also demonstrate that it is more robust and more interpretable than existing approaches. It also offers more convergence guarantees during training. Finally, we empirically show that the widely used Stochastic Gradient Descent algorithm makes the weights of the trained neural networks converge to the optimal polynomial regression weights. / by Matt V. Emschwiller.
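
The lifted-space least squares idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (lift, fit_least_squares, predict), the choice of degree, and the synthetic data are all assumptions made for illustration, and the bagging-type training algorithm mentioned in the abstract is not reproduced here. The sketch only shows what "lifting" inputs to monomials of their coordinates and fitting by ordinary least squares could look like.

```python
import itertools

import numpy as np


def lift(X, degree):
    """Map each row of X to all monomials of its entries up to `degree`.

    Hypothetical helper: the columns are the distinct entries of the
    tensor products x, x (x) x, ..., plus a constant term.
    """
    n, d = X.shape
    columns = [np.ones(n)]  # degree-0 term (intercept)
    for k in range(1, degree + 1):
        # Each multiset of k coordinate indices gives one monomial.
        for idx in itertools.combinations_with_replacement(range(d), k):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)


def fit_least_squares(X, y, degree):
    """Solve min_w ||lift(X) w - y||^2 with a standard least squares solver."""
    Phi = lift(X, degree)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w


def predict(X, w, degree):
    """Evaluate the fitted polynomial on new inputs."""
    return lift(X, degree) @ w


if __name__ == "__main__":
    # Synthetic, noise-free data from a known degree-2 polynomial (assumed).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 0] * X[:, 1] + 0.5 * X[:, 2] ** 2

    w = fit_least_squares(X, y, degree=2)
    max_err = np.max(np.abs(predict(X, w, degree=2) - y))
    print("lifted dimension:", lift(X, 2).shape[1])
    print("max training error:", max_err)
```

Because the objective is linear in the weights once the inputs are lifted, the fit is a convex problem. The number of lifted features grows combinatorially with the input dimension and the degree, so a direct solve like this one is only practical for small inputs.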

Identifier	oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/127290
Date	January 2020
Creators	Emschwiller, Matt V.
Contributors	David Gamarnik, Massachusetts Institute of Technology. Operations Research Center
Publisher	Massachusetts Institute of Technology
Source Sets	M.I.T. Theses and Dissertation
Language	English
Detected Language	English
Type	Thesis
Format	89 pages, application/pdf
Rights	MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available at http://dspace.mit.edu/handle/1721.1/7582

