“Gradient Clipping Improves AdaGrad when the Noise Is Heavy-Tailed” – joint work with Savelii Chezhegov, Yaroslav Klyukin, Aleksandr Beznosikov, Alexander Gasnikov, Samuel Horváth, Martin Takáč and Eduard Gorbunov.