An Empirical Study of the Distributed Ellipsoidal Trust Region Method for Large Batch Training

Neural network optimizers are dominated by first-order methods, due to their
inexpensive computational cost per iteration. However, it has been shown that first-order optimization is prone to reaching sharp minima when training with large batch
sizes. As the batch size increases, the statistical stability of the problem increases,
a regime that is well suited for second-order optimization methods. In this thesis,
we study a distributed ellipsoidal trust region model for neural networks. We use
a block diagonal approximation of the Hessian, assigning consecutive layers of the
network to each process. We solve in parallel for the update direction of each subset
of the parameters. We show that our optimizer is fit for large batch training as well
as for an increasing number of processes.
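
The block-diagonal scheme described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`partition_layers`, `cauchy_step`, `distributed_step`) are hypothetical, the ellipsoid is assumed to be given by a positive diagonal scaling `d` (constraint ||d * s|| <= radius), threads stand in for the distributed processes, and the per-block subproblem is solved only to the Cauchy point, a simple stand-in for whatever subproblem solver the thesis uses.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def partition_layers(num_layers, num_procs):
    """Assign consecutive layer indices to each of num_procs workers."""
    bounds = [p * num_layers // num_procs for p in range(num_procs + 1)]
    return [list(range(bounds[p], bounds[p + 1])) for p in range(num_procs)]


def cauchy_step(g, H, d, radius):
    """Cauchy point of  min g.s + 0.5 s.H.s  subject to ||d * s||_2 <= radius.

    g: block gradient, H: block Hessian (list of rows),
    d: positive diagonal scaling that defines the ellipsoid (assumption).
    """
    n = len(g)
    g_t = [g[i] / d[i] for i in range(n)]  # gradient in scaled variables d * s
    g_norm = math.sqrt(sum(v * v for v in g_t))
    if g_norm == 0.0:
        return [0.0] * n
    # Curvature along g_t under the scaled Hessian D^-1 H D^-1.
    u = [g_t[i] / d[i] for i in range(n)]
    curv = sum(u[i] * H[i][j] * u[j] for i in range(n) for j in range(n))
    # Non-positive curvature: go to the boundary; otherwise stop at the
    # one-dimensional minimiser if it lies inside the region.
    tau = 1.0 if curv <= 0.0 else min(1.0, g_norm ** 3 / (radius * curv))
    # Map the scaled step back to the original variables.
    return [-tau * (radius / g_norm) * g_t[i] / d[i] for i in range(n)]


def distributed_step(blocks, radius):
    """Solve every diagonal block independently and concatenate the steps.

    blocks: one (g, H, d) tuple per worker; threads emulate the processes.
    """
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        steps = list(pool.map(lambda b: cauchy_step(*b, radius), blocks))
    return [v for step in steps for v in step]
```

Because the Hessian approximation is block diagonal, each block's subproblem involves only its own gradient and curvature, so the solves need no communication until the per-block directions are gathered into the full update.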

optimization

trust region

distributed computing

deep learning

machine learning

Identifier	oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/667327
Date	10 February 2021
Creators	Alnasser, Ali
Contributors	Keyes, David E.; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Wonka, Peter; Zhang, Xiangliang
Source Sets	King Abdullah University of Science and Technology
Language	English
Detected Language	English
Type	Thesis
