Scaling Limits of Neural Networks

June 07, 2024, 11:35 AM - 12:20 PM

Location:

DIMACS Center

Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854


Boris Hanin, Princeton University

Large neural networks are often studied analytically through scaling limits: regimes in which taking some structural network parameter (e.g., depth, width, or the number of training datapoints) to infinity results in simplified models of network properties. I will survey several such approaches, starting with the NTK regime, in which network width tends to infinity at fixed depth and dataset size. Here, networks are Gaussian processes at initialization and are equivalent to linear models (at least for regression tasks). While this regime is tractable, it precludes a study of feature learning. The deviation from the NTK regime at finite width is controlled by the depth-to-width ratio, which plays the role of an effective network depth. I will explain how this occurs and state several results on how this effective depth affects learning in neural networks.
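As a rough illustration of the first point in the abstract (wide networks behave like Gaussian processes at initialization), the sketch below samples many freshly initialized one-hidden-layer ReLU networks at a fixed input and checks that the output distribution looks increasingly Gaussian as the width grows. The architecture, the 1/sqrt(fan-in) scaling, and the kurtosis diagnostic are illustrative assumptions, not material from the talk itself.

```python
# Minimal sketch (assumptions noted above): the scalar output of a random
# 1-hidden-layer ReLU network at a fixed input approaches a Gaussian as the
# hidden width grows, consistent with the infinite-width GP picture.
import numpy as np

rng = np.random.default_rng(0)

def random_network_output(x, width):
    """One forward pass of a freshly initialized one-hidden-layer ReLU net."""
    d = x.shape[0]
    W1 = rng.normal(size=(width, d)) / np.sqrt(d)    # input -> hidden weights
    w2 = rng.normal(size=width) / np.sqrt(width)     # hidden -> scalar output
    h = np.maximum(W1 @ x, 0.0)                      # ReLU features
    return w2 @ h

x = np.ones(4)                                       # arbitrary fixed input
for width in [8, 64, 512, 4096]:
    samples = np.array([random_network_output(x, width) for _ in range(2000)])
    # Excess kurtosis tending to 0 is one signature of convergence to a Gaussian.
    kurt = np.mean((samples - samples.mean()) ** 4) / samples.var() ** 2 - 3.0
    print(f"width={width:5d}  mean={samples.mean():+.3f}  "
          f"var={samples.var():.3f}  excess kurtosis={kurt:+.3f}")
```

Running this, the output variance stays roughly constant across widths while the excess kurtosis shrinks toward zero, which is the finite-width counterpart of the Gaussian-process limit described in the abstract.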


[Video]