My current research investigates the role of multimodal language models in understanding and reasoning about the world through rich visual representations, especially by leveraging language as a tool to decode and reason about the underlying physical rules governing the world. I also study the limits of systematic language understanding - the ability of neural systems to understand language in a human-like way - by evaluating the extent of their capabilities in semantics, syntax and generalization.
I am also involved in improving and enabling reproducible research in Machine Learning - I’m the lead organizer of the annual Machine Learning Reproducibility Challenge (V1, V2, V3, V4, V5, V6, V7), and I serve as an associate editor at ReScience C, a peer-reviewed journal promoting reproducible research. My work has been covered by several news outlets, including Nature, VentureBeat, InfoQ, DailyMail and Hindustan Times.
Featured Publications
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning; Mido Assran*, Adrien Bardes*, David Fan*, Quentin Garrido*, Russell Howes*, Mojtaba Komeili*, Matthew Muckley*, Ammar Rizvi*, Claire Roberts*, Koustuv Sinha*, Artem Zholus*, Sergio Arnaud*, Abha Gejji*, Ada Martin*, Francois Robert Hogan*, Daniel Dugas*, Piotr Bojanowski, Vasil Khalidov, Patrick Labatut, Francisco Massa, Marc Szafraniec, Kapil Krishnakumar, Yong Li, Xiaodong Ma, Sarath Chandar, Franziska Meier*, Yann LeCun*, Michael Rabbat*, Nicolas Ballas*; Huggingface | Code | Blog
For a full and up-to-date list of my publications, please visit my Google Scholar profile.
News
[06/11/25] Excited to announce the release of our frontier video understanding model, V-JEPA 2! Check out our model code, weights and cool demos!
[06/11/25] Happy to release the Physical World Reasoning leaderboard, along with a new dataset, MVPBench (Paper), for assessing the video understanding capabilities of modern VLMs.
[01/06/24] Serving as a Senior Area Chair at ACL 2024.
I’m fortunate to supervise, and to have supervised, exceptionally strong interns, and I’m always looking to support more students!
Peter Tong, Research Intern @ FAIR, Meta AI, 2025-26; Peter previously worked with me as a Research Intern @ Meta AI, 2024, co-supervised with Mike Rabbat and Zhuang Liu
Evaluating Logical Generalization with Graph Neural Networks, Weights and Biases Salon, (Online), May 2020
ML Reproducibility - From Theory to Practice, DL4Science Seminar, Lawrence Berkeley National Laboratory, Berkeley, (Online), August 2020; MICCAI Hackathon, Peru, October 2020; Bielefeld University, Germany, hosted by Malte Schilling, October 2021
Featured Awards
Outstanding Paper Award, ACL 2023, Language model acceptability judgements are not always robust to context; Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, Roger Levy, Adina Williams, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Outstanding Paper Award, ACL 2021, UnNatural Language Inference, Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams