Global ETD Search

1	Non-convex Stochastic Optimization With Biased Gradient Estimators
Sokolov, Igor 03 1900

Non-convex optimization problems appear in various applications of machine learning. Because of their practical importance, these problems have attracted considerable attention in recent years, leading to the rapid development of new efficient stochastic gradient-type methods. In the quest to improve the generalization performance of modern deep learning models, practitioners are resorting to increasingly large training datasets, which are naturally distributed across a number of edge devices. However, as the amount of training data grows, the computational cost of gradient-type methods increases significantly. In addition, distributed methods almost invariably suffer from the so-called communication bottleneck: the cost of communicating the information necessary for the workers to jointly solve the problem is often very high, and it can be orders of magnitude higher than the cost of computation. This thesis provides a study of first-order stochastic methods addressing these issues. In particular, we structure this study by considering certain classes of methods. This allowed us to identify current theoretical gaps, which we fill by providing new efficient algorithms.

Keywords: non-convex optimization, stochastic gradient methods, distributed learning, communication compression, error feedback, variance reduction