Return to search

Non-convex Stochastic Optimization With Biased Gradient Estimators

Non-convex optimization problems appear in various applications of machine learning. Because of their practical importance, these problems gained a lot of attention in recent years, leading to the rapid development of new efficient stochastic gradient-type methods. In the quest to improve the generalization performance of modern deep learning models, practitioners are resorting to using larger and larger datasets in the training process, naturally distributed across a number of edge devices. However, with the increase of trainable data, the computational costs of gradient-type methods increase significantly. In addition, distributed methods almost invariably suffer from the so-called communication bottleneck: the cost of communication of the information necessary for the workers to jointly solve the problem is often very high, and it can be orders of magnitude higher than the cost of computation. This thesis provides a study of first-order stochastic methods addressing these issues. In particular, we structure this study by considering certain classes of methods. That allowed us to understand current theoretical gaps, which we successfully filled by providing new efficient algorithms.

Identiferoai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/676647
Date03 1900
CreatorsSokolov, Igor
ContributorsRichtarik, Peter, Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Wang, Di, Moshkov, Mikhail
Source SetsKing Abdullah University of Science and Technology
LanguageEnglish
Detected LanguageEnglish
TypeThesis
Rights2024-04-27, At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2024-04-27.

Page generated in 0.2366 seconds