Isha Puri

Hi, I'm Isha! I'm a PhD student at MIT CSAIL working on language models, AI agents, and machine learning. I'm lucky to be co-advised by Yoon Kim and Marzyeh Ghassemi. My PhD is funded by MIT's Great Educators Fellowship and the NSF's Graduate Research Fellowship. I previously served as Co-President of MIT's Graduate Women in EECS. (I also help organize the MIT NLP Seminar - check it out!) I am honored to serve as an Associate Program Chair for NeurIPS 2025.

I graduated from Harvard University in 2023 with a B.A. in Applied Math and Computer Science. I worked with Hima Lakkaraju at AI4LIFE and was lucky to have been a part of the Harvard Business School Tech Innovation Fellowship. I was most recently at Abridge working on AI and Product.

My interests lie in studying how language models can be usefully integrated into real-world workflows. I have previously worked on inference-time scaling; these days, I've been thinking about using RL to calibrate uncertainty in LMs and how that uncertainty shapes human collaboration with LM agents.

You can reach me at ishapuri@mit.edu. I'd love to hear from you!

News

[Nov '25] Talks about RLCR @ NVIDIA, UT Austin
[Oct '25] Spotlight talk about RLCR @ COLM workshop!
[Aug '25] Presenting @ Stanford 8/7, lmk if in the area!
[July '25] New Paper (Beyond Binary Rewards)! When we reward only correctness, LLMs hallucinate. We trained LLMs to analyze uncertainty and better calibrate their confidence!
[July '25] Invited talk at Cohere AI
[June '25] Thrilled to be moving to SF this summer to join the team at Abridge working on AI for Healthcare!
[June '25] Talk @ the University of Cambridge
[April '25] Talk @ MIT Embodied Intelligence Seminar
[Feb '25] Excited to serve as Associate Program Chair for NeurIPS 2025!
[Feb '25] Giving invited talks at RedHat (2/7/25) and IBM Research AI (2/20/25) on scaling LMs with probabilistic inference. [Recording]

[Feb '25] Our paper on probabilistic inference for LM scaling is now on arXiv! Check out the project website.
[Dec '24] Our work on studying bias in LLMs for mental health support was covered by MIT News and published in EMNLP Findings.

Excited to see our Beyond Binary Rewards paper referenced in OpenAI’s latest work on hallucinations - it's important to incentivize models to express uncertainty, not just guess! https://t.co/puIXXZ3VCK
— Isha Puri (@ishapuri101) September 5, 2025

It seems GPT‑OSS is very prone to hallucinations … check out our RLCR paper to see how we trained reasoning models to know what they don't know. Website 🌐 and code 💻 out today! https://t.co/YqLu92enIy 🚀 pic.twitter.com/GInwRViz8y
— Isha Puri (@ishapuri101) August 6, 2025

[1/x] can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint @MIT_CSAIL / @RedHat AI Innovation Team work introduces a particle filtering approach to scaling inference w/o any training! check out scaling.github.io pic.twitter.com/jcAxIRyypU
— Isha Puri (@ishapuri101)

Publications

Papers

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty
Mehul Damani*, Isha Puri*, Stewart Slocum, Idan Shenfeld, Leshem Choshen, Yoon Kim, Jacob Andreas
*=equal contribution
ICLR 2026, SCALR@COLM ⭐Spotlight Oral⭐ 2025
⭐Talks @ NVIDIA, UT Austin, COLM⭐
[ArXiv] [Tweet] [Code] [Project Website] [Slides]

MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering
Yuexing Hao, Kumail Alhamoud, Hyewon Jeong, Haoran Zhang, Isha Puri, Philip Torr, Mike Schaekermann, Ariel D. Stern, Marzyeh Ghassemi
KDD Workshop on Trust & Evaluation in GenAI, COLM Socially Responsible LMs Workshop, 2025
[Project Website] [ArXiv] [Code]

Rollout Roulette: A Probabilistic Inference Approach to LLM Inference-Time Scaling with Particle-Based Monte Carlo Methods
Isha Puri, Shiv Sudalairaj, GX Xu, Kai Xu, Akash Srivastava
NeurIPS 2025, SCALR@COLM 2025
⭐Talks @ Stanford, University of Cambridge, MIT EI Seminar, IBM Research AI, Cohere AI⭐
[Project Website] [ArXiv] [Tweet] [Code] [Talk] [Poster] [Slides]

Can AI Relate: Testing Large Language Model Response for Mental Health Support
Saadia Gabriel, Isha Puri, Xuhai Xu, Matteo Malgaroli, Marzyeh Ghassemi
EMNLP Findings, 2024
[MIT News] [ArXiv] [Code]

LLM Test Time Augmentation: Improving Black-box Robustness with In-Context Rewriting
Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen
Transactions on Machine Learning Research (TMLR), 2024
[ArXiv] [Code]

Advancing Equality: Harnessing Generative AI to Combat Systemic Racism
Saadia Gabriel, Jessy Xinyi Han, Eric Liu, Isha Puri, Wonyoung So, Fotini Christia, Munzer Dahleh, Catherine D’Ignazio, Marzyeh Ghassemi, Peko Hosoi, Devavrat Shah
An MIT Exploration of Generative AI, 2024

Evaluating the Causal Reasoning Capabilities of Language Models
Isha Puri, Hima Lakkaraju
ICML Knowledge & Reasoning Workshop, 2023

OpenXAI: Towards a Transparent Evaluation of Model Explanations
Chirag Agarwal, Satyapriya Krishna, Eshika Saxena, Martin Pawelczyk, Nari Johnson, Isha Puri, Marinka Zitnik, and Himabindu Lakkaraju
NeurIPS, 2022; ICLR Pair2Struct Workshop Oral Presentation
[Github] [Website]

CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions
Isha Puri, Amit Dhurandhar, Tejaswini Pedapati, Kartikeyan Shanmugam, Dennis Wei, and Kush R. Varshney
NeurIPS, 2021
[Code on AIX360] [ArXiv] [Video] [Supplement]

Reconsidering the Algorithmic Fairness of Race Adjustment in Pulmonary Function Equations
Isha Puri*, Neil Sehgal*, Usha Bhalla*
AAAI AI For Social Good Workshop, 2023

A System for Accurate Tracking and Video Recordings of Rodent Eye Movements using Convolutional Neural Networks for Biomedical Image Segmentation
Isha Puri and David Cox
IEEE Engineering in Medicine and Biology Conference, 2018

Other Projects

DYSCERN - A Scalable, Freely Accessible Machine Learning Application for the Early Detection of Dyslexia
[Project Video + Demo]

Awards / Honors

MIT Great Educators Fellowship
(MIT EECS)
National Science Foundation Graduate Research Fellowship
Harvard Technology Innovation Fellow
(Harvard Business School)
Kempner Institute Graduate Fellowship (6 years, fully funded PhD) (Declined)
(Harvard University)
Gordon Wu Fellowship (5 year fellowship) (Declined)
(Princeton University)
Derek Bok Award for Distinction in Teaching
(Harvard University)
ACM Cutler Bell Prize for Excellence in Computing Research
(Association for Computing Machinery) (4 selected nationwide)
Davidson Fellowship
(Recognized by Forbes as “World’s Most Prestigious Undergraduate Scholarship”)
Donald and Kathleen Pfister Prize for Excellence in the Sciences
(Harvard University)
Thermo Fisher Scientific Collegiate Fellowship
(6 chosen nationwide)
NCWIT Collegiate Prize
(National Center for Women & Information Technology, 6 selected nationwide)
Coca-Cola Scholarship
(250 chosen from > 90,000 applications)
Google Science Fair Global Finalist (20 worldwide)
Intel ISEF Grand Award Winner (2nd Place)
(International Science and Engineering Fair 2018)
National Security Agency Mathematics Honor Award

Personal Things!

Outside of research, I:

play the harp! I've recently been getting into playing harp adaptations of Bollywood and Disney songs
love learning about history and political science - when I was little, my parents thought I would grow up to be a historian. I'm currently reading about the Gilded Age and the WWII Science Industrial Age
love late night comedy - I've seen several live tapings of every late night show except Trevor Noah, who I've seen on tour twice. My multiverse dream would be to intern for The Late Show, Last Week Tonight, or The Daily Show for a summer.
collect postcards! I've been assembling my postcard wall since I was 15. It now has nearly 650 postcards - each inscribed with the date, a short story from the day, and signatures of who was there :)