I am a Research Scientist in the GenAI Research group at Meta. My current research focuses on understanding and generating multimodal data using minimal human supervision. I obtained an MS and PhD in Robotics from Carnegie Mellon University (here’s a link to my dissertation), where I worked on learning from and understanding videos. I was previously part of the Facebook AI Research (FAIR) group at Meta, and have spent time at DeepMind, Adobe and Facebook as an intern. See here for a formal bio.
PhD in Robotics, 2019
Carnegie Mellon University, Pittsburgh PA
MS in Robotics, 2016
Carnegie Mellon University, Pittsburgh PA
B. Tech. in Computer Science, 2014
IIIT Hyderabad, India
Meta · Research Scientist
New York · 2019 -- Present
DeepMind · Research Scientist Intern
London · Summer 2018
Facebook · Research Scientist Intern
Menlo Park · Summer 2017
Adobe · Research Scientist Intern
San Francisco · Summer 2016
Facebook · Software Engineering Intern
Menlo Park · Summer 2013
Videos powered by Emu Video!
Video-language embeddings are a promising avenue for injecting semantics into visual representations, but existing methods capture only short-term associations between seconds-long video clips and their accompanying text. We propose HierVL, a novel hierarchical video-language embedding that simultaneously accounts for both long-term and short-term associations.