
Hamsa Bastani
Associate Professor of Operations, Information and Decisions; Associate Professor of Statistics and Data Science (secondary) · Verified
University of Pennsylvania · Business Economics and Public Policy
Active 1988–2026
About
Hamsa Bastani is an Associate Professor of Operations, Information, and Decisions at the Wharton School, University of Pennsylvania. Her research focuses on developing novel machine learning algorithms for data-driven decision-making, with applications spanning healthcare operations, social good, and revenue management. She has received several recognitions for her work, including the Wagner Prize for Excellence in Practice in 2021, the Pierskalla Award for the best paper in healthcare in 2016, 2019, and 2021, and the Behavioral Operations Management Best Paper Award in 2021. Bastani's work also includes significant contributions to healthcare systems, such as improving access to essential medicines in low- and middle-income countries through decision-aware machine learning frameworks, and optimizing pharmaceutical supply chains using machine learning for demand forecasting. Her research extends to algorithmic fairness in human-AI collaboration, clinical trial design with surrogate outcomes, and learning across high-dimensional bandit problems. She completed her PhD at Stanford University and spent a year as a Herman Goldstine postdoctoral fellow at IBM Research.
Research topics
- Pathology
- Environmental health
- Business
- Medicine
Selected publications
Winner's Curse Drives False Promises in Data-Driven Decisions: A Case Study in Refugee Matching
arXiv (Cornell University) · 2026-02-09
article · Open access · 1st author · Corresponding
A major challenge in data-driven decision-making is accurate policy evaluation, i.e., guaranteeing that a learned decision-making policy achieves the promised benefits. A popular strategy is model-based policy evaluation, which estimates a model from data to infer counterfactual outcomes. This strategy is known to produce unwarrantedly optimistic estimates of the true benefit due to the winner's curse. We searched the recent literature on data-driven decision-making, identifying a sample of 55 papers published in Management Science in the past decade; all but two relied on this flawed methodology. Several common justifications are provided: (1) the estimated models are accurate, stable, and well-calibrated, (2) the historical data uses random treatment assignment, (3) the model family is well-specified, and (4) the evaluation methodology uses sample splitting. Unfortunately, we show that no combination of these justifications avoids the winner's curse. First, we provide a theoretical analysis demonstrating that the winner's curse can cause large, spurious reported benefits even when all these justifications hold. Second, we perform a simulation study based on the recent and consequential data-driven refugee matching problem. We construct a synthetic refugee matching environment (calibrated to closely match the real setting) but designed so that no assignment policy can improve expected employment compared to random assignment. Model-based methods report large, stable gains of around 60% even when the true effect is zero; these gains are on par with improvements of 22-75% reported in the literature. Our results provide strong evidence against model-based evaluation.
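The winner's curse described in this abstract can be illustrated with a toy simulation. The sketch below is not the paper's refugee matching environment; it assumes a much simpler setting in which every option has the same true success rate, so no selection rule can beat random choice. The function name `simulate_winners_curse` and all parameters are hypothetical, chosen only for illustration.

```python
import random
import statistics


def simulate_winners_curse(n_options=10, n_samples=50, true_rate=0.5,
                           n_trials=2000, seed=0):
    """Average estimated value of the 'best' option when all options are equal.

    Illustrative assumption: every option has the same true success rate,
    so the true gain from picking any option over random choice is zero.
    """
    rng = random.Random(seed)
    reported_values = []
    for _ in range(n_trials):
        # Estimate each option's success rate from its own noisy sample.
        estimates = [
            sum(rng.random() < true_rate for _ in range(n_samples)) / n_samples
            for _ in range(n_options)
        ]
        # Select the option with the highest estimate, then report that same
        # estimate as the value of the selection (the optimistic step).
        reported_values.append(max(estimates))
    return statistics.mean(reported_values), true_rate


if __name__ == "__main__":
    reported, truth = simulate_winners_curse()
    print(f"reported value of selected option: {reported:.3f}")
    print(f"true value of any option:          {truth:.3f}")
```

Because the maximum of several noisy estimates is biased upward, the reported value exceeds the true rate even though the true gain is zero by construction. This toy example covers only the basic selection effect, not the abstract's stronger claims about sample splitting or well-specified models.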
Are AI Capabilities Increasing Exponentially? A Competing Hypothesis
ArXiv.org · 2026-02-04
article · Open access
Rapidly increasing AI capabilities have substantial real-world consequences, ranging from AI safety concerns to labor market consequences. The Model Evaluation & Threat Research (METR) report argues that AI capabilities have exhibited exponential growth since 2019. In this note, we argue that the data does not support exponential growth, even in shorter-term horizons. Whereas the METR study claims that fitting sigmoid/logistic curves results in inflection points far in the future, we fit a sigmoid curve to their current data and find that the inflection point has already passed. In addition, we propose a more complex model that decomposes AI capabilities into base and reasoning capabilities, exhibiting individual rates of improvement. We prove that this model supports our hypothesis that AI capabilities will exhibit an inflection point in the near future. Our goal is not to establish a rigorous forecast of our own, but to highlight the fragility of existing forecasts of exponential growth.
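To make the role of the inflection point concrete, the sketch below compares a generic logistic curve with the exponential it resembles early on. It does not use the METR data or the note's fitted model; the parameter names `cap`, `rate`, and `midpoint` are assumptions for illustration only.

```python
import math


def logistic(t, cap=1.0, rate=1.0, midpoint=0.0):
    """Logistic (sigmoid) curve; its inflection point is at t = midpoint."""
    return cap / (1.0 + math.exp(-rate * (t - midpoint)))


def early_exponential(t, cap=1.0, rate=1.0, midpoint=0.0):
    """Exponential that the logistic tracks closely when t is far below midpoint."""
    return cap * math.exp(rate * (t - midpoint))


def slope(t, step=1e-3, **params):
    """Central finite-difference estimate of the logistic's slope at t."""
    return (logistic(t + step, **params) - logistic(t - step, **params)) / (2 * step)


if __name__ == "__main__":
    for t in (-6.0, -3.0, 0.0, 3.0):
        print(f"t={t:+.0f}  logistic={logistic(t):.4f}  "
              f"exponential={early_exponential(t):.4f}  slope={slope(t):.4f}")
```

Well before the midpoint the two curves are nearly indistinguishable, which is why early data alone may not separate the two hypotheses. The logistic's slope peaks at the midpoint and declines afterward, while the exponential keeps accelerating.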
Winner's Curse Drives False Promises in Data-Driven Decisions: A Case Study in Refugee Matching
SSRN Electronic Journal · 2026-01-01
preprint · Open access · 1st author · Corresponding
Improving access to essential medicines via decision-aware machine learning
Nature · 2026-04-29
article
Winner's Curse Drives False Promises in Data-Driven Decisions: A Case Study in Refugee Matching
Open MIND · 2026-02-09
preprint · 1st author · Corresponding
Data from: Improving access to essential medicines via decision-aware machine learning
DRYAD · 2026-02-09
dataset · Open access
A critical challenge in healthcare systems in Low- and Middle-Income Countries (LMICs) is the efficient and equitable allocation of scarce resources, particularly essential medicines. This problem is complicated by limited high-quality data, which restricts the applicability of traditional data-driven techniques. We propose a novel decision-aware machine learning framework for essential medicines allocation, which additionally leverages multi-task learning to ensure sample efficiency and catalytic priors to ensure equitable allocation. In collaboration with the Sierra Leone national government, we performed a staggered, nationwide deployment of our system as a decision support tool and evaluated its impact using synthetic difference-in-differences. We find an estimated 19% increased consumption of allocated products in treated districts, demonstrating its efficacy at improving access to essential medicines. Our tool was subsequently scaled nationwide, covering an estimated 2 million women and children under five. Our work demonstrates how machine learning methods can improve efficiency at very low cost in resource-constrained global health settings.
Effective Personalized AI Tutors via LLM-Guided Reinforcement Learning
SSRN Electronic Journal · 2026-01-01
preprint · Open access
Are AI Capabilities Increasing Exponentially? A Competing Hypothesis
Open MIND · 2026-02-04
preprint
Online Learning with Survival Data
SSRN Electronic Journal · 2026-01-01
preprint · Open access
Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity
SSRN Electronic Journal · 2025-01-01 · 1 citations
preprint · Open access
Frequent coauthors
- Osbert Bastani · University of Pennsylvania · 20 shared
- Mohsen Bayati · 14 shared
- Joel Goh · National University of Singapore · 7 shared
- Kimon Drakopoulos · University of Southern California · 6 shared
- Kan Xu · Chinese Academy of Medical Sciences & Peking Union Medical College · 6 shared
- David Simchi-Levi · 4 shared
- John Silberholz · Ross School · 4 shared
- Ruihao Zhu · Cornell University · 4 shared
Labs
Operations, Information and Decisions Department · PI
Awards & honors
- Wagner Prize for Excellence in Practice (2021)
- Pierskalla Award for the best paper in healthcare (2016, 201…
- Behavioral OM Best Paper Award (2021)
- First place in the George Nicholson and MSOM student paper c…