I am a Research Scientist at Facebook AI Research (FAIR), where I work on Computer Vision and Machine Learning. My research interest is in reducing the need for supervision in visual learning. I finished my PhD at the Robotics Institute at Carnegie Mellon University, where I worked with Martial Hebert and Abhinav Gupta. My PhD thesis was titled “Visual Learning with Minimal Human Supervision”, for which I received the SCS Distinguished Dissertation Award (Runner Up) in 2018. For my work in self-supervised learning, I was featured in the MIT Technology Review’s 35 Innovators Under 35 list (compiled globally across technological disciplines). You can hear me on Lex Fridman’s podcast for a quick overview of my work.
News
- [2022] 2 papers accepted at ECCV 2022
- [2022] Keynote talk at the Ghost Day ML Conference, 2022
- [2022] 2 papers accepted at CVPR 2022
- [2022] Omnivore: a single model for image, video and 3D classification. Performs better than modality-specific models
- [2021] Guest on the Lex Fridman Podcast
- [2022] 1 paper accepted at ICLR 2022
- [2021] 1 paper accepted at NeurIPS 2021 (oral)
- [2021] 6 papers accepted at ICCV 2021 (3 as oral)
- [2021] Our CVPR 2021 paper (AVID) on Audio-Visual Self-supervised learning is a Best Paper Candidate.
- [2021] 3 papers accepted at CVPR 2021, 1 paper at ICML 2021.
- [2021] Co-wrote a blog post on self-supervised learning with Yann LeCun.
- [2021] SEER scales self-supervised learning to billions of images.
- [2020] Our self-supervised technique called SwAV outperforms supervised pre-training on ALL considered transfer tasks and is the first method to do so.
Collaborators and Interns
- Xingyi Zhou (University of Texas, Austin). Hosted at FAIR with Rohit Girdhar and Armand Joulin.
- Bowen Cheng (University of Illinois, Urbana-Champaign). Hosted at FAIR with Rohit Girdhar and Alex Kirillov.
- Zaiwei Zhang (University of Texas, Austin). Hosted at FAIR with Rohit Girdhar and Armand Joulin.
- Zhongzheng (Jason) Ren (University of Illinois, Urbana-Champaign). Hosted at FAIR with Rohit Girdhar.
- Yuki Asano (University of Oxford). Hosted at FAIR with Armand Joulin, Piotr Bojanowski, and Andrea Vedaldi.
- Pedro Morgado (University of California, San Diego).
- Huaizu Jiang (University of Massachusetts, Amherst). Hosted at FAIR with Xinlei Chen and Marcus Rohrbach.
- Jyh-Jing Hwang (University of California, Berkeley). Hosted at FAIR with Laurens van der Maaten.
- Yan Wang (Cornell University). Hosted at FAIR with Laurens van der Maaten.
- Terrance de Vries (University of Guelph). Hosted at FAIR with Laurens van der Maaten.
Publications
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels
CVPR 2016
Visual Storytelling
NAACL 2016
Applying artificial vision models to human scene understanding
Journal of Frontiers in Computational Neuroscience, 2015
Patents
Optimizing multi-class multimedia data classification using negative data
Optimizing multi-class image classification using patch features