Adam: A Method for Stochastic Optimization; https://arxiv.org/pdf/1412.6980.pdf; Diederik P. Kingma, Jimmy Ba; Adam, stochastic gradient descent