Two Variants of Learning Single-index Models with SGD

June 06, 2024, 2:15 PM - 3:00 PM

Location:

DIMACS Center

Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854


Denny Wu, New York University (NYU)

Single-index models are given by a univariate link function applied to a one-dimensional projection of the input. Recent works have shown that the statistical complexity of learning this function class with online SGD is governed by the information exponent of the link function. In this talk, we discuss two variations of prior analyses. First, we consider learning single-index polynomials via SGD with reused training data. We show that two-layer neural networks optimized by an SGD-based algorithm can learn this target with almost linear sample complexity, regardless of the information exponent; this complexity surpasses the CSQ lower bound and matches the information-theoretic limit up to polylogarithmic factors. Next, we introduce the class of additive models defined as the sum of M single-index models, where M diverges with the input dimensionality d. We study the sample complexity of SGD training and also provide SQ lower bounds for learning this function class; our analysis reveals the fundamental difference between the previously studied finite-M (multi-index) setting and our large-M setting.
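As a minimal illustration of the setup (not the algorithms analyzed in the talk), the sketch below fits a single-index target whose link is the degree-2 Hermite polynomial, which has information exponent 2, using one-pass spherical SGD on a correlation objective. The specific link, step size, and iteration count are assumptions chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                               # ambient input dimension
w_star = np.zeros(d)
w_star[0] = 1.0                      # hidden unit-norm direction

def link(z):
    # He_2 Hermite polynomial: an even link with information exponent 2
    return z**2 - 1.0

def d_link(z):
    return 2.0 * z

# Online (one-pass) SGD ascending the correlation E[y * link(<w, x>)],
# projecting back onto the unit sphere after every step.
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
lr = 5e-4
for _ in range(30_000):
    x = rng.standard_normal(d)       # fresh Gaussian sample each step
    y = link(x @ w_star)             # single-index target y = link(<w*, x>)
    w += lr * y * d_link(x @ w) * x  # stochastic correlation gradient
    w /= np.linalg.norm(w)

overlap = abs(w @ w_star)
print(f"overlap |<w, w*>| = {overlap:.2f}")
```

Because the link is even, both w* and -w* are optima, so the absolute overlap is the natural progress measure; with a higher information exponent, escaping the low-overlap regime would require many more online samples.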


Based on joint works with Jason D. Lee, Kazusato Oko, Yujin Song, and Taiji Suzuki.


[Video]    [Slides]