Tyler McCormick
Professor · University of Washington · Statistics
Active 1983–2026
About
Tyler H. McCormick is a professor at the University of Washington in the Departments of Statistics and Sociology. He is also a core faculty member of the Center for Statistics and the Social Sciences, a Senior Data Science Fellow at the eScience Institute, a research affiliate at the Center for Studies in Demography and Ecology, and a faculty partner in Responsible AI Systems & Experiences (RAISE). His research spans a variety of topics in statistics and data science, motivated by scientific questions in global health, economics, demography, and sociology. Recent projects include estimating features of social networks using data from standard surveys, inferring likely causes of death when deaths occur outside hospitals using reports from surviving caretakers, and quantifying and communicating uncertainty in predictive models for global health policymakers. He is involved in the development of OpenVA, a suite of open tools to manage and analyze verbal autopsy surveys. McCormick is the former Editor of the Journal of Computational and Graphical Statistics and was awarded the NIH Director’s New Innovator Award in 2019. In 2023, he was elected as a Fellow of the American Statistical Association. His work has been featured in the Wall Street Journal and the Washington Post, and he has contributed to discussions on the role of statistics in AI through webinars and short courses. His recent research includes a framework for measuring AI exposure and studying its health effects at the population level.
Research topics
- Data Mining
- Computer Science
- Mathematics
- Geography
- Statistics
- Machine Learning
- Econometrics
- Sociology
- Artificial Intelligence
- Theoretical computer science
- Physics
- Cartography
- Engineering
- Data science
- Business
- Socioeconomics
Selected publications
When you come to a fork in the road, take it: the Rashomon effect for social scientists
2026-02-28
Article · Senior author
Social science researchers have developed many statistical methods for determining which models to use for social inquiry. Conventional approaches rely on model selection, averaging over many models, or "multiverse" techniques that assess robustness across many researcher-defined alternative specifications. We argue that these methods overlook the fundamental ambiguity inherent in data: even when researcher decisions are perfectly standardized, many distinct models can fit the data equally well. We propose addressing this "model multiplicity" through the lens of the "Rashomon effect," a term coined by Breiman (2001) to describe situations where many statistical models that imply different mechanistic explanations have near-equivalent performance over the same data. Recent advances in computational methods have created tools to completely enumerate full sets of such models for important model classes, called "Rashomon sets." We demonstrate the inevitability of model multiplicity with a simulation study. We offer a set of post-fit criteria researchers can use to navigate model multiplicity through Rashomon sets, then demonstrate their utility with an empirical application, showing how different aggregations of religious categories yield distinct plausible explanations for life satisfaction. We contend that Rashomon set methods offer uniquely exhaustive opportunities for social inquiry, promoting more intellectually honest model selection while maintaining rigor.
REALITrees: Rashomon Ensemble Active Learning for Interpretable Trees
arXiv (Cornell University) · 2026-03-24
Preprint · Open access · Senior author
Active learning reduces labeling costs by selecting samples that maximize information gain. A dominant framework, Query-by-Committee (QBC), typically relies on perturbation-based diversity by inducing model disagreement through random feature subsetting or data blinding. While this approximates one notion of epistemic uncertainty, it sacrifices direct characterization of the plausible hypothesis space. We propose the complementary approach: Rashomon Ensembled Active Learning (REAL), which constructs a committee by exhaustively enumerating the Rashomon set of all near-optimal models. To address functional redundancy within this set, we adopt a PAC-Bayesian framework using a Gibbs posterior to weight committee members by their empirical risk. Leveraging recent algorithmic advances, we exactly enumerate this set for the class of sparse decision trees. Across synthetic and established active learning baselines, REAL outperforms randomized ensembles, particularly in moderately noisy environments where it strategically leverages expanded model multiplicity to achieve faster convergence.
Spatially Robust Inference with Predicted and Missing at Random Labels
arXiv (Cornell University) · 2026-03-11
Preprint · Open access · Senior author
When outcome data are expensive or onerous to collect, scientists increasingly substitute predictions from machine learning and AI models for unlabeled cases, a process which has consequences for downstream statistical inference. While recent methods provide valid uncertainty quantification under independent sampling, real-world applications involve missing at random (MAR) labeling and spatial dependence. For inference in this setting, we propose a doubly robust estimator with cross-fit nuisances. We show that cross-fitting induces fold-level correlation that distorts spatial variance estimators, producing unstable or overly conservative confidence intervals. To address this, we propose a jackknife spatial heteroscedasticity and autocorrelation consistent (HAC) variance correction that separates spatial dependence from fold-induced noise. Under standard identification and dependence conditions, the resulting intervals are asymptotically valid. Simulations and benchmark datasets show substantial improvement in finite-sample calibration, particularly under MAR labeling and clustered sampling.
Randomized Recruitment Driven Sampling
arXiv (Cornell University) · 2026-02-27
Article · Open access · Senior author
Surveys are critical inputs for research and policy, yet enumerating a sampling frame is logistically infeasible or financially nonviable in many circumstances, such as during pandemics, natural disasters, or armed conflict. Respondent Driven Sampling (RDS) does not require a sampling frame, yet non-random peer recruitment often introduces substantial bias, particularly under high homophily. We introduce and evaluate Randomized Recruitment Driven Sampling (RRDS), a cellphone-based adaptation of RDS that incorporates researcher-controlled randomization into each recruitment wave. While standard RDS is necessary for stigmatized groups where network transparency is infeasible, RRDS is designed for low-stigma populations that become difficult to access due to logistical barriers. In these contexts, RRDS enforces the random recruitment assumption that traditional RDS relies upon but rarely achieves. Through simulation and an experiment surveying Bangladeshi garment workers during the COVID-19 pandemic, we demonstrate that RRDS produces less biased estimates and improved confidence interval coverage compared to traditional RDS. RRDS offers a scalable, remote-compatible alternative for studying low-stigma groups in challenging contexts where large-scale probability sampling is unsafe or infeasible.
Adaptive Active Learning for Regression via Reinforcement Learning
arXiv (Cornell University) · 2026-03-11
Article · Open access · Senior author
Active learning for regression reduces labeling costs by selecting the most informative samples. Improved Greedy Sampling (iGS) is a prominent method that balances feature-space diversity and output-space uncertainty using a static, multiplicative rule. We propose Weighted improved Greedy Sampling (WiGS), which replaces this framework with a dynamic, additive criterion. We formulate weight selection as a reinforcement learning problem, enabling an agent to adapt the exploration-investigation balance throughout learning. Experiments on 18 benchmark datasets and a synthetic environment show WiGS outperforms iGS and other baseline methods in both accuracy and labeling efficiency, particularly in domains with irregular data density where the baseline's multiplicative rule ignores high-error samples in dense regions.
Unique Rashomon Sets for Robust Active Learning
ArXiv.org · 2025-03-09
Preprint · Open access · Senior author
Collecting labeled data for machine learning models is often expensive and time-consuming. Active learning addresses this challenge by selectively labeling the most informative observations, but when initial labeled data is limited, it becomes difficult to distinguish genuinely informative points from those appearing uncertain primarily due to noise. Ensemble methods like random forests are a powerful approach to quantifying this uncertainty but do so by aggregating all models indiscriminately. This includes poorly performing models and redundant models, a problem that worsens in the presence of noisy data. We introduce UNique Rashomon Ensembled Active Learning (UNREAL), which selectively ensembles only distinct models from the Rashomon set, the set of nearly optimal models. Restricting ensemble membership to high-performing models with different explanations helps distinguish genuine uncertainty from noise-induced variation. We show that UNREAL achieves faster theoretical convergence rates than traditional active learning approaches and demonstrates empirical improvements of up to 20% in predictive accuracy across five benchmark datasets, while simultaneously enhancing model interpretability.
Recent grants
Compact Bayesian Models of Massive Social Graphs
NSF · $150k · 2016–2019
ATD: Collaborative Research: Algorithms and Data for High-Frequency, Real-Time Anomaly Detection
NSF · $100k · 2017–2020
NIH · $437k · 2018–2021
NIH · $2.3M · 2019–2024
Estimating vital rates in the developing world: A Bayesian process modeling approach
NIH · $611k · 2015–2020
Frequent coauthors
- Arun G. Chandrasekhar · National Bureau of Economic Research · 45 shared
- Samuel J. Clark · Indepth Network · 34 shared
- Zehang Li · 34 shared
- Shane Lubold · University of Washington · 26 shared
- Tian Zheng · Institute of Hematology & Blood Diseases Hospital · 24 shared
- Mengjie Pan · 23 shared
- Emily Breza · 21 shared
- Bailey K. Fosdick · Colorado School of Public Health · 17 shared
Education
- PhD, Statistics · Columbia University · 2011
Awards & honors
- NIH Director’s New Innovator Award (2019)
- Fellow of the American Statistical Association (2023)