My interests lie in studying how language models can be usefully integrated into real-world workflows. These days, I've been thinking about inference-time compute, calibrating uncertainty in LMs, and how humans interact with LM agents.
You can reach me at ishapuri@mit.edu. I'd love to hear from you!
News
- [Aug 2025] Presenting @ Stanford 8/7, lmk if in the area!
- [July 2025] New Paper (Beyond Binary Rewards)! When we reward only correctness, LLMs hallucinate. We trained LLMs to analyze uncertainty and better calibrate their confidence!
- [July 2025] Invited talk at Cohere AI
- [June 2025] Thrilled to be moving to SF this summer to join the team at Abridge working on AI for Healthcare!
- [June 2025] Talk @ the University of Cambridge
- [April 2025] Talk @ MIT Embodied Intelligence Seminar
- [Feb 2025] Excited to serve as Associate Program Chair for NeurIPS 2025!
- [Feb 2025] Giving invited talks at RedHat (2/7/25) and IBM Research AI (2/20/25) on scaling LMs with probabilistic inference. [Recording]
- [Feb 2025] Our paper on probabilistic inference for LM scaling is now on arXiv! Check out the project website.
- [Dec 2024] Our work on studying bias in LLMs for mental health support was covered by MIT News and published in EMNLP Findings.
Excited to see our Beyond Binary Rewards paper referenced in OpenAI’s latest work on hallucinations - it's important to incentivize models to express uncertainty, not just guess! https://t.co/puIXXZ3VCK
— Isha Puri (@ishapuri101) September 5, 2025
It seems GPT‑OSS is very prone to hallucinations … check out our RLCR paper to see how we trained reasoning models to know what they don't know. Website 🌐 and code 💻 out today! https://t.co/YqLu92enIy 🚀 pic.twitter.com/GInwRViz8y
— Isha Puri (@ishapuri101) August 6, 2025
[1/x] can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint @MIT_CSAIL / @RedHat AI Innovation Team work introduces a particle filtering approach to scaling inference w/o any training! check out scaling.github.io pic.twitter.com/jcAxIRyypU
— Isha Puri (@ishapuri101)
Publications
Papers
- Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty
  Mehul Damani*, Isha Puri*, Stewart Slocum, Idan Shenfeld, Leshem Choshen, Yoon Kim, Jacob Andreas
  SCALR@COLM Oral, LLM Explainability for Reasoning & Planning @ COLM, 2025
  [ArXiv] [Tweet] [Code] [Project Website]
- MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering
  Yuexing Hao, Kumail Alhamoud, Hyewon Jeong, Haoran Zhang, Isha Puri, Philip Torr, Mike Schaekermann, Ariel D. Stern, Marzyeh Ghassemi
  KDD Workshop on Trust & Evaluation in GenAI, COLM Socially Responsible LMs Workshop, 2025
  [Project Website] [ArXiv] [Code]
- Rollout Roulette: A Probabilistic Inference Approach to LLM Inference-Time Scaling with Particle-Based Monte Carlo Methods
  Isha Puri, Shiv Sudalairaj, GX Xu, Kai Xu, Akash Srivastava
  SCALR@COLM 2025
  Talks @ Stanford, University of Cambridge, MIT EI Seminar, IBM Research AI, Cohere AI
  [Project Website] [ArXiv] [Tweet] [Code] [Talk] [Poster]
- Can AI Relate: Testing Large Language Model Response for Mental Health Support
  Saadia Gabriel, Isha Puri, Xuhai Xu, Matteo Malgaroli, Marzyeh Ghassemi
  EMNLP Findings, 2024
  [MIT News] [ArXiv] [Code]
- LLM Test Time Augmentation: Improving Black-box Robustness with In-Context Rewriting
  Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen
  Transactions on Machine Learning Research (TMLR), 2024
  [ArXiv] [Code]
- Advancing Equality: Harnessing Generative AI to Combat Systemic Racism
  Saadia Gabriel, Jessy Xinyi Han, Eric Liu, Isha Puri, Wonyoung So, Fotini Christia, Munzer Dahleh, Catherine D’Ignazio, Marzyeh Ghassemi, Peko Hosoi, Devavrat Shah
  An MIT Exploration of Generative AI, 2024
- Evaluating the Causal Reasoning Capabilities of Language Models
  Isha Puri, Hima Lakkaraju
  ICML Knowledge & Reasoning Workshop, 2023
- OpenXAI: Towards a Transparent Evaluation of Model Explanations
  Chirag Agarwal, Satyapriya Krishna, Eshika Saxena, Martin Pawelczyk, Nari Johnson, Isha Puri, Marinka Zitnik, and Himabindu Lakkaraju
  NeurIPS, 2022; ICLR Pair2Struct Workshop Oral Presentation
  [Github] [Website]
- CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions
  Isha Puri, Amit Dhurandhar, Tejaswini Pedapati, Kartikeyan Shanmugam, Dennis Wei, and Kush R. Varshney
  NeurIPS, 2021
  [Code on AIX360] [ArXiv] [Video] [Supplement]
- Reconsidering the Algorithmic Fairness of Race Adjustment in Pulmonary Function Equations
  Isha Puri*, Neil Sehgal*, Usha Bhalla*
  AAAI AI For Social Good Workshop, 2023
- A System for Accurate Tracking and Video Recordings of Rodent Eye Movements using Convolutional Neural Networks for Biomedical Image Segmentation
  Isha Puri and David Cox
  IEEE Engineering in Medicine and Biology Conference, 2018
Other Projects
- DYSCERN - A Scalable, Freely Accessible Machine Learning Application for the Early Detection of Dyslexia
  [Project Video + Demo]
Awards / Honors
- MIT Great Educators Fellowship
  (MIT EECS)
- National Science Foundation Graduate Research Fellowship
- Harvard Technology Innovation Fellow
  (Harvard Business School)
- Kempner Institute Graduate Fellowship (6 years, fully funded PhD) (Declined)
  (Harvard University)
- Gordon Wu Fellowship (5-year fellowship) (Declined)
  (Princeton University)
- Derek Bok Award for Distinction in Teaching
  (Harvard University)
- ACM Cutler Bell Prize for Excellence in Computing Research
  (Association for Computing Machinery) (4 selected nationwide)
- Davidson Fellowship
  (Recognized by Forbes as “World’s Most Prestigious Undergraduate Scholarship”)
- Donald and Kathleen Pfister Prize for Excellence in the Sciences
  (Harvard University)
- ThermoFisher Scientific Collegiate Fellowship
  (6 chosen nationwide)
- NCWIT Collegiate Prize
  (National Center for Women in Information Technology, 6 selected nationwide)
- Coca Cola Scholarship
  (250 chosen from > 90,000 applications)
- Google Science Fair Global Finalist (20 worldwide)
- Intel ISEF Grand Award Winner (2nd Place)
  (International Science and Engineering Fair 2018)
- National Security Agency Mathematics Honor Award
Personal Things!
Outside of research, I:
- play the harp! I've recently been getting into playing harp adaptations of Bollywood and Disney songs
- love learning about history and political science. When I was little, my parents thought I would grow up to be a historian. I'm currently reading about the Gilded Age and the WWII Science Industrial Age
- love late night comedy - I've seen several live tapings of every late night show except Trevor Noah, who I've seen on tour twice. My multiverse dream would be to intern for The Late Show, Last Week Tonight, or The Daily Show for a summer.
- collect postcards! I've been assembling my postcard wall since I was 15. It now has nearly 650 postcards, each inscribed with the date, a short story from the day, and signatures of who was there :)