Learning Dynamics and Implicit Bias of Gradient Flow in Overparameterized Linear Models
Mathematics of Data & Decisions

Speaker: Rene Vidal, University of Pennsylvania
Location: 1025 PSEL
Start time: Tue, Apr 9 2024, 3:10PM
Contrary to the common belief that overparameterization may hurt generalization and optimization, recent work suggests that overparameterization may bias the optimization algorithm toward solutions that generalize well, a phenomenon known as implicit regularization or implicit bias, and may also accelerate convergence, a phenomenon known as implicit acceleration. This talk will provide a detailed analysis of the dynamics of gradient flow in overparameterized two-layer linear models, showing that convergence to equilibrium depends on the imbalance between the input and output weights, a quantity that is conserved under gradient flow and hence fixed at initialization, and on the margin of the initial solution. The talk will also provide an analysis of the implicit bias, showing that a large hidden-layer width, together with properly scaled random initialization, constrains the network parameters to converge to a solution close to the minimum-norm solution.
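
The following Python sketch (not from the talk; the problem sizes, loss, and initialization scale are assumptions chosen for illustration) makes both phenomena concrete. It trains a two-layer linear model f(x) = W2 W1 x by gradient descent with a small step size as a proxy for gradient flow, tracks the weight imbalance W1 W1^T - W2^T W2, and compares the learned end-to-end map with the minimum-norm least-squares solution.

import numpy as np

rng = np.random.default_rng(0)

# Underdetermined regression: n samples, d features, n < d, with a
# large hidden width h (all sizes here are illustrative assumptions).
n, d, h = 20, 50, 500
X = rng.standard_normal((n, d))
y = rng.standard_normal((n, 1))

# Reference point: the min-norm interpolant beta* = X^T (X X^T)^{-1} y.
beta_star = X.T @ np.linalg.solve(X @ X.T, y)

# Two-layer linear model f(x) = W2 W1 x with small, balanced-scale
# random initialization; smaller alpha (or larger h) starts the
# end-to-end map W2 W1 closer to zero.
alpha = 0.1
W1 = alpha * rng.standard_normal((h, d)) / np.sqrt(h)
W2 = alpha * rng.standard_normal((1, h)) / np.sqrt(h)

def imbalance(W1, W2):
    # ||W1 W1^T - W2^T W2||_F is conserved exactly under gradient
    # flow, hence "fixed at initialization"; gradient descent with a
    # small step conserves it only approximately.
    return np.linalg.norm(W1 @ W1.T - W2.T @ W2)

imb0 = imbalance(W1, W2)
lr, steps = 0.1, 20_000  # small step size as a proxy for the flow
for _ in range(steps):
    r = X @ W1.T @ W2.T - y      # residuals, shape (n, 1)
    g1 = W2.T @ (r.T @ X) / n    # dL/dW1 for L = ||r||^2 / (2n)
    g2 = (r.T @ X) @ W1.T / n    # dL/dW2
    W1 -= lr * g1
    W2 -= lr * g2

beta = (W2 @ W1).T               # learned end-to-end linear map
print("imbalance drift:", abs(imbalance(W1, W2) - imb0))
print("relative distance to min-norm:",
      np.linalg.norm(beta - beta_star) / np.linalg.norm(beta_star))

Under these assumptions one should see a near-zero imbalance drift and a small relative distance to the min-norm solution, with the distance shrinking as h grows or alpha shrinks, which is the large-width, small-initialization regime described by the talk's implicit-bias result.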