A disciplined approach to neural network hyper-parameters



Recommendations on how to optimize learning rate, weight decay, momentum and batch size.



arXiv: https://arxiv.org/pdf/1803.09820.pdf