
Roni Rosenfeld
Professor · Carnegie Mellon University · Machine Learning Department
Active 1959–2026
About
Roni Rosenfeld is a University Professor of machine learning, language technologies, computer science, and computational biology in the School of Computer Science at Carnegie Mellon University. He has taught machine learning and statistical language modeling to thousands of undergraduate and graduate students since 1997, and has mentored post-doctoral students and advised numerous PhD, Masters, and undergraduate students. His research interests include tracking and forecasting epidemics, with a focus on epidemic forecasting technology development through the Delphi research group, which he co-founded and co-leads. The group has been recognized as a National Center for Epidemic Forecasting by the U.S. CDC. Rosenfeld's previous work includes statistical language modeling, speech recognition, human-machine speech interfaces, and the application of speech and language technologies to aid international development. He has published approximately 150 scientific articles across various fields and has received multiple awards, including the Spira Teaching Excellence Award in 2017 and the Allen Newell Medal for Research Excellence in 1992 and 2022. His educational background includes a PhD and MSc in computer science from Carnegie Mellon University and a BSc in mathematics and physics from Tel Aviv University.
Research topics
- Computer Science
- Medicine
- Psychology
- Environmental health
- Business
- Artificial Intelligence
- Sociology
- Political Science
- Data science
- Internet privacy
- Virology
- Engineering
- Operations research
- Actuarial science
- Public relations
- World Wide Web
- Geography
- Nursing
Selected publications
Reducing Alert Fatigue Through AI Ranking: A Deployed Public Health Data Monitoring System
Proceedings of the AAAI Conference on Artificial Intelligence · 2026-03-14
Article · Open access
Public health experts need scalable methods to monitor large volumes of health data (e.g., human-reported cases, hospitalizations, deaths). These methods must identify individual data points that may indicate significant events, such as outbreaks, or reveal data quality issues. Identifying, triaging, and analyzing these data points in real-time is critical for preventing downstream errors in forecasting or policy. Traditional alert-based data monitoring systems, used for decades in practice, fail to identify relevant data events for several reasons. For example, these systems may not output real-time results from large data volumes, or they may return tens of thousands of unhelpful alerts. We introduce a human-in-the-loop AI system for public health data monitoring that uses a ranking-based AI anomaly detection method. This system was developed through a multi-year interdisciplinary collaboration with participatory design from researchers, engineers, and public health data experts. From this process, we identified system goals, such as user control and efficiency and designed a system that balances these goals. This system has since been deployed at a national public health organization and analyzes up to 5 million data points daily. A three-month longitudinal deployment evaluation revealed a significant improvement in system goals, including a 54x increase in data reviewer efficiency and increased engagement compared to traditional alert-based methods.
Reducing Alert Fatigue Through AI Ranking: A Deployed Public Health Data Monitoring System
Open MIND · 2026-01-07
Other · Open access
1508 - Reducing Alert Fatigue Through AI Ranking: A Deployed Public Health Data Monitoring System
Underline Science Inc. · 2026-01-07
Other · Open access
Public Health · 2025-12-19
Article · Open access
Federated epidemic surveillance
PLoS Computational Biology · 2025-04-08 · 4 citations
Article · Open access
Epidemic surveillance is a challenging task, especially when crucial data is fragmented across institutions and data custodians are unable or unwilling to share it. This study aims to explore the feasibility of a simple federated surveillance approach. We conduct hypothesis tests on count data behind each custodian's firewall and then combine p-values from these tests using techniques from meta-analysis. We propose a hypothesis testing framework to identify surges in epidemic-related data streams and conduct experiments on real and semi-synthetic data to assess the power of different p-value combination methods to detect surges without needing to combine or share the underlying counts. Our findings show that relatively simple combination methods achieve a high degree of fidelity and suggest that infectious disease outbreaks can be detected without needing to share or even aggregate data across institutions.
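The federated idea in this abstract (test locally, share only p-values, then combine them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-sided Poisson test, the choice of Fisher's method as the combination rule, the function names `poisson_upper_tail` and `fisher_combine`, and the example counts and baselines.

```python
import math

def poisson_upper_tail(count, baseline):
    """P(X >= count) for X ~ Poisson(baseline); a one-sided surge test (assumed model)."""
    # Accumulate P(X <= count - 1) term by term, then take the complement.
    term = math.exp(-baseline)
    cdf = 0.0
    for k in range(count):
        cdf += term
        term *= baseline / (k + 1)
    return max(0.0, min(1.0, 1.0 - cdf))

def fisher_combine(p_values):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k dof under the null."""
    k = len(p_values)
    # half_stat is the Fisher statistic divided by 2; clamp p to avoid log(0).
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom (2k).
    term, total = 1.0, 0.0
    for j in range(k):
        total += term
        term *= half_stat / (j + 1)
    return math.exp(-half_stat) * total

# Hypothetical sites: (observed count, expected baseline). Only the p-values
# would leave each custodian's firewall, never the counts themselves.
sites = [(14, 10.0), (9, 6.0), (22, 15.0)]
local_p = [poisson_upper_tail(c, b) for c, b in sites]
print("local p-values:", local_p)
print("combined p-value:", fisher_combine(local_p))
```

Fisher's method is only one of several combination rules from meta-analysis; the abstract compares multiple methods without this sketch claiming which performed best.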
Real-time Forecasting of Data Revisions in Epidemic Surveillance Streams
medRxiv · 2025-05-12
Preprint · Open access · Senior author
Epidemic data streams undergo frequent revisions due to reporting delays (“backfill”) and other factors. Relying on tentative surveillance values can seriously degrade the quality of situational awareness, forecasting accuracy and decision-making. We introduce Delphi Revision Forecast (Delphi-RF), a real-time data revision forecasting framework using nonparametric quantile regression, applicable to both counts and proportions (fractions) in public health reporting. By incorporating all available revisions up to a given estimation date, Delphi-RF models revision dynamics and generates distributional forecasts of finalized surveillance values. Applied to daily COVID-19 data (insurance claims, antigen tests, confirmed cases) and weekly dengue and influenza-like illness (ILI) case counts, Delphi-RF delivers accurate revision forecasts, particularly in early reporting stages. In addition, it improves computational efficiency by more than 10-100x compared to existing methods, making it a scalable solution for real-time public health surveillance.
Author summary: Accurate and reliable forecasts of infectious disease epidemics, such as COVID-19, are essential but challenging. The presence of data revisions in public health data streams can introduce significant biases in both predictors and responses, leading to suboptimal situational awareness, preparedness, and downstream countermeasure design. To address this issue, we propose a modeling framework that leverages historical revision patterns to generate distributional forecasts of finalized surveillance values. Applicable to both count-type and fraction-type data across various temporal resolutions and epidemic surveillance data streams, our approach ensures real-time accuracy, even with only early revisions available. Moreover, our method achieves competitive or superior forecast accuracy compared to existing methods, while also demonstrating a more than 10-100x improvement in computational efficiency.
An AI-Based Public Health Data Monitoring System
ArXiv.org · 2025-06-04
Preprint · Open access
Public health experts need scalable approaches to monitor large volumes of health data (e.g., cases, hospitalizations, deaths) for outbreaks or data quality issues. Traditional alert-based monitoring systems struggle with modern public health data monitoring systems for several reasons, including that alerting thresholds need to be constantly reset and the data volumes may cause application lag. Instead, we propose a ranking-based monitoring paradigm that leverages new AI anomaly detection methods. Through a multi-year interdisciplinary collaboration, the resulting system has been deployed at a national organization to monitor up to 5,000,000 data points daily. A three-month longitudinal deployed evaluation revealed a significant improvement in monitoring objectives, with a 54x increase in reviewer speed efficiency compared to traditional alert-based methods. This work highlights the potential of human-centered AI to transform public health decision-making.
Real-time forecasting of data revisions in epidemic surveillance streams
PLoS Computational Biology · 2025-11-20
Article · Open access · Senior author · Corresponding author
Epidemic data streams undergo frequent revisions due to reporting delays ("backfill") and other factors. Relying on tentative surveillance values can seriously degrade the quality of situational awareness, forecasting accuracy and decision-making. We introduce Delphi Revision Forecast (Delphi-RF), a real-time data revision forecasting framework using nonparametric quantile regression, applicable to both counts and proportions (fractions) in public health reporting. By incorporating all available revisions up to a given estimation date, Delphi-RF models revision dynamics and generates distributional forecasts of finalized surveillance values. Applied to daily COVID-19 data (insurance claims, antigen tests, confirmed cases) and weekly dengue and influenza-like illness (ILI) case counts, Delphi-RF delivers accurate revision forecasts, particularly in early reporting stages. In addition, it improves computational efficiency by more than 10-100x compared to existing methods, making it a scalable solution for real-time public health surveillance.
medRxiv · 2024-10-25
Preprint · Open access
Currently, there are few standards for what essential information about an infectious disease outbreak should be reported to the public and when. The content and timeliness of public reporting (e.g. situation reports) is at the discretion of the jurisdiction overseeing the outbreak response, resulting in a substantial heterogeneity in available information. To address this problem, we undertook a consensus process to develop recommendations for what epidemiological information public health authorities should report to the public during an outbreak, including the administrative level and frequency of reporting. We first assembled a steering committee of nine experts representing federal public health, state public health, academia, and international partners to develop a candidate list of reporting items. We then invited 45 experts, 35 of whom agreed to participate in a Delphi panel. Of those, 25 participated in voting in the first round, 25 participated in voting in the second round, and 25 participated in voting in the third round, demonstrating consistent engagement in the consensus-building process. The final stage of the Delphi process consisted of a hybrid consensus meeting to finalize the voting items. This resulted in a final list of nine reporting items representing the minimum set of information to include in publicly available situation reports: Numbers of new confirmed cases, new hospital admissions, new deaths, cumulative confirmed cases, cumulative hospital admissions, and cumulative deaths, each reported weekly and at Administrative level 1 (typically state or province), and stratified by sex, age group, and race/ethnicity. This minimum reporting standard creates a strong framework and guidance for uniform sharing of outbreak information and promotes consistency of data between jurisdictions to enable prompt and effective response.
UNC Libraries · 2024-11-06
Article · Open access
Frequent coauthors
- 20 shared
Ryan J. Tibshirani
- 18 shared
Monroe E. Wall
- 16 shared
Logan Brooks
Carnegie Mellon University
- 12 shared
Matthew Biggerstaff
- 12 shared
Nicholas G Reich
- 12 shared
Jeffrey Shaman
- 12 shared
David Farrow
University of Toledo
- 11 shared
Michael A. Johansson
Centers for Disease Control and Prevention
Education
- 1985
B.S., Mathematics and Physics
Tel Aviv University
- 1991
M.S., Computer Science
Carnegie Mellon University
- 1994
Ph.D., Computer Science
Carnegie Mellon University
Awards & honors
- Spira Teaching Excellence Award (2017)
- Allen Newell Medal for Research Excellence (1992)
- Allen Newell Medal for Research Excellence (2022)