Karianne Bergen
Assistant Professor of Earth, Environmental, and Planetary Sciences and Data Science, Assistant Professor of Computer Science · Brown University · Computer Science
Active 2010–2026
About
Karianne Bergen is an Assistant Professor of Data Science and Earth, Environmental & Planetary Sciences as well as an Assistant Professor of Computer Science at Brown University. Her research interests lie in scientific machine learning (SciML), with a focus on developing machine learning methods for pattern recognition and discovery in large, noisy sensor datasets. Her work has applications in earthquake seismology and biodefense. Currently, her research group concentrates on machine learning-based surrogate models and data enhancement techniques such as downscaling, particularly applied to Earth and Climate science. Additionally, her group explores explainable AI (XAI) for scientific data and the development of scientific foundation models. Dr. Bergen completed her postdoctoral training at Harvard University as a Data Science Initiative Postdoctoral Fellow in Computer Science. She earned her Ph.D. and M.S. in Computational and Mathematical Engineering from Stanford University and holds a B.Sc. in Applied Mathematics from Brown University. She has professional experience as a data scientist at MIT Lincoln Laboratory. Beyond her research, she is passionate about data science education and workforce development, especially supporting women interested in data science careers. She has taught numerous courses and workshops aimed at making data science accessible to students and professionals from diverse disciplinary backgrounds.
Research topics
- Data Mining
- Computer Science
- Data science
- Artificial Intelligence
- Cognitive science
- Engineering
- Geology
- Seismology
Selected publications
Rewiring climate modeling with machine learning emulators
Communications Earth & Environment · 2026-01-30
article · Open access · Senior author
Earth system models, or simulators, are foundational for projecting climate change impacts, but their computational expense limits the number and diversity of simulations available. Machine learning-based emulators, statistical surrogates trained on simulator outputs, can replicate components of climate models at orders-of-magnitude lower cost, enabling ensembles and interpolation across scenarios. We argue that the next phase of climate modeling hinges on closer collaboration between simulator and emulator communities. We outline three priorities: (1) co-design of simulators and emulators so that experimental design, diagnostics, and data products support training, evaluation, and targeted simulation; (2) shared, machine learning-ready benchmarks with data partitions and metrics that emphasize physical fidelity; and (3) treating emulators as reliable software components with interfaces, documentation, and deployment pathways for sensitivity analyses, scenario exploration, and uncertainty decomposition. This perspective envisions emulators not as statistical shortcuts, but core tools that accelerate the pace of climate science. This Perspective argues that machine learning emulators could transform climate modeling by co-designing with simulators, aligning goals, data, and diagnostics, and building shared infrastructure and robust software to accelerate science.
Data & Code for: "Emulator-expanded projections reveal structure in Antarctic sea level uncertainty"
Zenodo (CERN European Organization for Nuclear Research) · 2026-03-31
other · Open access · Senior author
Data and code for the submission of the manuscript: "Emulator-expanded projections reveal structure in Antarctic sea level uncertainty".
Open MIND · 2026-01-30
preprint · Senior author
Explainable AI (XAI) is essential for understanding machine learning (ML) decision-making and ensuring model trustworthiness in scientific applications. Prototype-based XAI methods offer an intrinsically interpretable alternative to post-hoc approaches which often yield inconsistent explanations. Prototype-based XAI methods make predictions based on the similarity between inputs and learned prototypes that represent typical characteristics of target classes. However, existing prototype-based models are primarily designed for standard RGB image data and are not optimized for the distinct, variable-specific channels commonly found in geoscientific image and raster datasets. In this study, we develop a prototype-based XAI approach tailored for multi-channel geospatial data, where each channel represents a distinct physical environmental variable or spectral channel. Our approach enables the model to identify separate, channel-specific prototypical characteristics sourced from multiple distinct training examples that inform how these features individually and in combination influence model prediction while achieving comparable performance to standard neural networks. We demonstrate this method through two geoscientific case studies: (1) classification of Madden Julian Oscillation phases using multi-variable climate data and (2) land-use classification from multispectral satellite imagery. This approach produces both local (instance-level) and global (model-level) explanations for providing insights into feature-relevance across channels. By explicitly incorporating channel-prototypes into the prediction process, we discuss how this approach enhances the transparency and trustworthiness of ML models for geoscientific learning tasks.
Data & Code for: "Emulator-expanded projections reveal structure in Antarctic sea level uncertainty"
Zenodo (CERN European Organization for Nuclear Research) · 2026-04-21
other · Open access · Senior author
Data and code for the submission of the manuscript: "Emulator-expanded projections reveal structure in Antarctic sea level uncertainty".
arXiv (Cornell University) · 2026-01-30
article · Open access · Senior author
Explainable AI (XAI) is essential for understanding machine learning (ML) decision-making and ensuring model trustworthiness in scientific applications. Prototype-based XAI methods offer an intrinsically interpretable alternative to post-hoc approaches which often yield inconsistent explanations. Prototype-based XAI methods make predictions based on the similarity between inputs and learned prototypes that represent typical characteristics of target classes. However, existing prototype-based models are primarily designed for standard RGB image data and are not optimized for the distinct, variable-specific channels commonly found in geoscientific image and raster datasets. In this study, we develop a prototype-based XAI approach tailored for multi-channel geospatial data, where each channel represents a distinct physical environmental variable or spectral channel. Our approach enables the model to identify separate, channel-specific prototypical characteristics sourced from multiple distinct training examples that inform how these features individually and in combination influence model prediction while achieving comparable performance to standard neural networks. We demonstrate this method through two geoscientific case studies: (1) classification of Madden Julian Oscillation phases using multi-variable climate data and (2) land-use classification from multispectral satellite imagery. This approach produces both local (instance-level) and global (model-level) explanations for providing insights into feature-relevance across channels. By explicitly incorporating channel-prototypes into the prediction process, we discuss how this approach enhances the transparency and trustworthiness of ML models for geoscientific learning tasks.
2025-03-20 · 1 citation
preprint · Open access · Senior author
Ice sheets are the primary contributors to global sea level rise, yet projecting their future contributions remains challenging due to the complex, nonlinear processes governing their dynamics and uncertainties in future climate scenarios. This study introduces ISEFlow (v1.0), a neural network-based emulator of the ISMIP6 ice sheet model ensemble designed to accurately and efficiently predict sea level contributions from both ice sheets while quantifying the sources of projection uncertainty. By integrating a normalizing flow architecture to capture data coverage uncertainty and a deep ensemble of LSTM models to assess emulator uncertainty, ISEFlow separates uncertainties arising from training data from those inherent to the emulator. Compared to existing emulators such as Emulandice and LARMIP, ISEFlow achieves substantially lower mean squared error and improved distribution approximation while maintaining faster inference times. This study investigates the drivers of increased accuracy and emission scenario distinction and finds that the inclusion of all available climate forcings, ice sheet model characteristics, and higher spatial resolution significantly enhances predictive accuracy and the ability to capture the effects of varying emissions scenarios compared to other emulators. We include a detailed analysis of importance of input variables using Shapley Additive Explanations, and highlight both the climate forcings and model characteristics that have the largest impact on sea level projections. ISEFlow offers a computationally efficient tool for generating accurate sea level projections, supporting climate risk assessments and informing policy decisions.
HybridFlow: Quantification of Aleatoric and Epistemic Uncertainty with a Single Hybrid Model
ArXiv.org · 2025-10-06
preprint · Open access · Senior author
Uncertainty quantification is critical for ensuring robustness in high-stakes machine learning applications. We introduce HybridFlow, a modular hybrid architecture that unifies the modeling of aleatoric and epistemic uncertainty by combining a Conditional Masked Autoregressive normalizing flow for estimating aleatoric uncertainty with a flexible probabilistic predictor for epistemic uncertainty. The framework supports integration with any probabilistic model class, allowing users to easily adapt HybridFlow to existing architectures without sacrificing predictive performance. HybridFlow improves upon previous uncertainty quantification frameworks across a range of regression tasks, such as depth estimation, a collection of regression benchmarks, and a scientific case study of ice sheet emulation. We also provide empirical results of the quantified uncertainty, showing that the uncertainty quantified by HybridFlow is calibrated and better aligns with model error than existing methods for quantifying aleatoric and epistemic uncertainty. HybridFlow addresses a key challenge in Bayesian deep learning, unifying aleatoric and epistemic uncertainty modeling in a single robust framework.
2025-12-01
article · Open access · Senior author · Corresponding
Ice sheets are the primary contributors to global sea level rise, yet projecting their future contributions remains challenging due to the complex, nonlinear processes governing their dynamics and uncertainties in future climate scenarios. This study introduces ISEFlow, a neural network-based emulator of the ISMIP6 ice sheet model ensemble designed to accurately and efficiently predict sea level contributions from both ice sheets while quantifying the sources of projection uncertainty. By integrating a normalizing flow architecture to capture data coverage uncertainty and a deep ensemble of LSTM models to assess emulator uncertainty, ISEFlow separates uncertainties arising from training data from those inherent to the emulator. Compared to existing emulators such as Emulandice and LARMIP, ISEFlow achieves substantially lower mean squared error and improved distribution approximation while maintaining faster inference times. This study investigates the drivers of increased accuracy and emission scenario distinction and finds that the inclusion of all available climate forcings, ice sheet model characteristics, and higher spatial resolution significantly enhances predictive accuracy and the ability to capture the effects of varying emissions scenarios compared to other emulators. We include a detailed analysis of importance of input variables using Shapley Additive Explanations, and highlight both the climate forcings and model characteristics that have the largest impact on sea level projections. ISEFlow offers a computationally efficient tool for generating accurate sea level projections, supporting climate risk assessments and informing policy decisions.
Zenodo (CERN European Organization for Nuclear Research) · 2025-02-01
other · Open access · Senior author
Code and data from ISE: Ice Sheet Emulator repository that was used to compute all data, models, and figures for the paper "A Variational LSTM Emulator of Sea Level Contribution from the Antarctic Ice Sheet".
Frequent coauthors
- 20 shared
Gregory C. Beroza
Stanford University
- 12 shared
Peter Van Katwyk
Brown University
- 12 shared
Jerome Braun
- 12 shared
Timothy J. Dasey
MIT Lincoln Laboratory
- 11 shared
Clara E. Yoon
United States Geological Survey
- 8 shared
Yan Glina
MIT Lincoln Laboratory
- 8 shared
Baylor Fox‐Kemper
Providence College
- 8 shared
Edward C. Wack
Labs
Education
- 2013
Ph.D., Geophysics
Massachusetts Institute of Technology
- 2009
M.S., Geophysics
Massachusetts Institute of Technology
- 2007
B.S., Earth and Planetary Science
University of California, Berkeley