A disciplined approach to neural network hyper-parameters
Recommendations on how to optimize learning rate, weight decay, momentum and batch size.
arXiv: https://arxiv.org/pdf/1803.09820.pdf