
About

Hi, I am a Senior Researcher at Microsoft Research AI Frontiers.

I received my PhD from the MIT Department of EECS (Electrical Engineering & Computer Science) in Summer 2024. My advisors were Professors Suvrit Sra and Ali Jadbabaie.

Google Scholar Profile

Some recent highlights:
NeurIPS 2024: Adam with model exponential moving average is effective for nonconvex optimization
ICLR 2024: Linear attention is (maybe) all you need (to understand transformer optimization)
ICML 2024: Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
NeurIPS 2023: Transformers learn to implement preconditioned gradient descent for in-context learning
NeurIPS 2023: Learning threshold neurons via edge-of-stability
NeurIPS OTML 2023: SpecTr++: Improved transport plans for speculative decoding of large language models

Work / Visiting (during PhD):

– Google Research, New York (Summer 2021)
PhD Research Intern, Learning Theory Team
Mentors: Prateek Jain, Satyen Kale, Praneeth Netrapalli, Gil Shamir

– Google Research, New York (Summer & Fall 2023)
PhD Research Intern, Speech & Language Algorithms Team
Mentors: Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

– Simons Institute, “Geometric Methods in Optimization and Sampling”, Berkeley, CA (Fall 2021)
Visiting Graduate Student

Master's Thesis:
– From Proximal Point Method to Accelerated Methods on Riemannian Manifolds