Taming the Devil of Gradient-based Optimization Methods with the Angel of Differential Equations

Wednesday, November 07, 2018

12:00 pm - 1:00 pm

Physics 119

Weijie Su (University of Pennsylvania)

Applied Math And Analysis Seminar

This talk introduces a framework that uses ordinary differential equations to model, analyze, and interpret gradient-based optimization methods. In the first part of the talk, we derive a second-order ODE that is the limit of Nesterov's accelerated gradient method for non-strongly convex objectives (NAG-C). The continuous-time ODE is shown to allow for a better understanding of NAG-C and, as a byproduct, we obtain a family of accelerated methods with similar convergence rates. In the second part, we begin by recognizing that existing ODEs in the literature are inadequate to distinguish between two fundamentally different methods: Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method. In response, we derive high-resolution ODEs as more accurate surrogates for the three aforementioned methods. These novel ODEs can be integrated into a general framework that allows for a fine-grained analysis of the discrete optimization algorithms by translating properties of the more tractable ODEs into those of their discrete counterparts. As the first application of this framework, we identify the effect of a term, referred to as gradient correction, that is present in NAG-SC but not in the heavy-ball method, offering insight into why the former achieves acceleration while the latter does not. Moreover, in this high-resolution ODE framework, NAG-C is shown to minimize the squared gradient norm at an inverse cubic rate, which is the sharpest known rat
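
As a rough illustration of the NAG-C iteration discussed in the abstract, the sketch below runs the method on a small diagonal quadratic and compares the objective gap with the classical O(1/k^2) bound 2 * ||x_0 - x*||^2 / (s * (k + 1)^2). The test function, the step size s = 1/L, and every name in the code are assumptions chosen for illustration and are not taken from the talk. In the small-step limit, with time t roughly k * sqrt(s), this iteration is commonly associated with the ODE X'' + (3/t) X' + grad f(X) = 0.

```python
# Minimal sketch of NAG-C on a hypothetical diagonal quadratic
# f(x) = 0.5 * sum(a_i * x_i^2), whose minimizer is x* = 0 with f(x*) = 0.
# All names and constants here are illustrative assumptions.

def f(a, x):
    """Objective value of the diagonal quadratic."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))

def grad(a, x):
    """Gradient of the diagonal quadratic."""
    return [ai * xi for ai, xi in zip(a, x)]

def nag_c(a, x0, s, iters):
    """Run NAG-C and return the values f(x_0), ..., f(x_iters).

    Update rule (with y_0 = x_0):
        x_k = y_{k-1} - s * grad f(y_{k-1})
        y_k = x_k + (k - 1) / (k + 2) * (x_k - x_{k-1})
    """
    x_prev = list(x0)
    y = list(x0)
    values = [f(a, x_prev)]
    for k in range(1, iters + 1):
        g = grad(a, y)
        x = [yi - s * gi for yi, gi in zip(y, g)]        # gradient step
        beta = (k - 1) / (k + 2)                         # momentum coefficient
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        x_prev = x
        values.append(f(a, x))
    return values

a = [1.0, 0.1, 0.001]      # eigenvalues of the quadratic; L = max(a)
x0 = [1.0, 1.0, 1.0]       # starting point
s = 1.0 / max(a)           # step size s = 1/L
iters = 200

values = nag_c(a, x0, s, iters)
dist_sq = sum(xi * xi for xi in x0)   # ||x_0 - x*||^2 with x* = 0
bounds = [2.0 * dist_sq / (s * (k + 1) ** 2) for k in range(1, iters + 1)]

for k in (1, 10, 100, 200):
    print(k, values[k], bounds[k - 1])
```

The printed pairs show the objective gap next to the bound at a few iterations. The quadratic is deliberately ill-conditioned so that the gap decays slowly enough for the comparison to be visible.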
