Hi, I am a Senior Researcher at Microsoft Research AI Frontiers.
I received my PhD from the MIT Department of Electrical Engineering & Computer Science (EECS) in Summer 2024. My advisors were Professors Suvrit Sra and Ali Jadbabaie.
Some recent highlights:
– NeurIPS 2024: Adam with model exponential moving average is effective for nonconvex optimization
– ICLR 2024: Linear attention is (maybe) all you need (to understand transformer optimization)
– ICML 2024: Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
– NeurIPS 2023: Transformers learn to implement preconditioned gradient descent for in-context learning
– NeurIPS 2023: Learning threshold neurons via edge-of-stability
– NeurIPS OTML 2023: SpecTr++: Improved transport plans for speculative decoding of large language models
Work / Visiting (during PhD):
– Google Research, New York (Summer 2021)
PhD Research Intern, Learning Theory Team
Mentors: Prateek Jain, Satyen Kale, Praneeth Netrapalli, Gil Shamir
– Google Research, New York (Summer & Fall 2023)
PhD Research Intern, Speech & Language Algorithms Team
Mentors: Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami
– Simons Institute, “Geometric Methods in Optimization and Sampling”, Berkeley, CA (Fall 2021)
Visiting Graduate Student
Master's Thesis:
– From Proximal Point Method to Accelerated Methods on Riemannian Manifolds