On Step Size Choices in Stochastic and Mini-batch Gradient Descent
June 05, 2024, 10:20 AM - 11:05 AM
Location:
DIMACS Center
Rutgers University
CoRE Building
96 Frelinghuysen Road
Piscataway, NJ 08854
Elizaveta Rebrova, Princeton University
First, I will talk about linear regression (and a little about ReLU regression). I will discuss robust stochastic gradient descent in the adversarial corruption setting and explain why an exponentially decaying step size can be the right choice to ensure convergence. Then, for least squares regression, I will discuss the connection between decreasing the mini-batch size when sampling without replacement and decreasing the step size. These two changes have very similar effects on the convergence dynamics, but with subtle differences that we propose to study via a careful analysis of a certain anticommutator between sample covariance submatrices of the features. Based on joint work with H. Jeong, D. Needell, J. Lok, and R. Sonthalia.
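As a rough illustration of what an exponentially decaying step size looks like in practice, the sketch below runs plain single-sample SGD on a least squares problem. It is not the robust method from the talk: there is no handling of corrupted samples, and the function name `sgd_exponential_decay` and the parameters `eta0`, `decay`, and `n_steps` are hypothetical choices made only for this example.

```python
import numpy as np


def sgd_exponential_decay(X, y, eta0=0.05, decay=0.999, n_steps=5000, seed=0):
    """Plain SGD for least squares with step size eta0 * decay**t.

    Illustrative only: the defaults are assumptions for this sketch,
    not values taken from the talk.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(n_steps):
        i = rng.integers(n)              # sample one row uniformly at random
        residual = X[i] @ w - y[i]       # scalar prediction error on that row
        step = eta0 * decay**t           # exponentially decaying step size
        w -= step * residual * X[i]      # gradient of 0.5 * residual**2
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5))    # synthetic Gaussian features
    w_true = np.arange(1.0, 6.0)
    y = X @ w_true                       # noiseless, uncorrupted labels
    w_hat = sgd_exponential_decay(X, y)
    print(np.linalg.norm(w_hat - w_true))
```

On clean synthetic data like this, the iterates should approach `w_true`; how the decay rate should be chosen when some labels are adversarially corrupted is the subject of the talk and is not captured here.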