
Caroline Uhler
Massachusetts Institute of Technology · Electrical Engineering & Computer Science
Active 2007–2026
About
Caroline Uhler is the Andrew (1956) and Erna Viterbi Professor of Engineering at MIT, where she is also a Professor of Electrical Engineering and Computer Science (EECS) and a member of the Institute for Data, Systems, and Society (IDSS). She serves as the Director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, where she is a core institute and scientific leadership team member. Her research focuses on artificial intelligence and decision-making, particularly in the context of healthcare and life sciences. She develops techniques for the analysis and synthesis of systems that interact with the external world through perception, communication, and action, while also learning, making decisions, and adapting to changing environments. Her work includes applying AI to understand cell biology, disease mechanisms, and drug development, as well as predicting protein locations within human cells and tracking gene expression changes. She is recognized for her contributions to integrating AI with biological and medical sciences, and she is scheduled to deliver a sectional lecture at the International Congress of Mathematicians in Philadelphia.
Research topics
- Computer science
- Mathematics
- Artificial intelligence
- Biology
- Combinatorics
Selected publications
A structure-informed deep learning framework for modeling TCR-peptide-HLA interactions
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-02
Article · Open access
The interaction between T cell receptors (TCRs), peptides, and human leukocyte antigens (HLAs) underlies antigen-specific T cell immunity. Despite substantial advances in peptide-HLA presentation prediction, accurate modeling of coupled TCR-peptide-HLA recognition remains underdeveloped, limiting applications such as TCR and neoepitope prioritization in cancer and antigen identification in autoimmunity. Here we present StriMap, a unified framework for predicting TCR-peptide-HLA interactions by integrating physicochemical, sequence-context, and structural features at recognition interfaces. StriMap achieves state-of-the-art performance with improved generalizability and enables applications in both cancer and autoimmunity. As a case study in ankylosing spondylitis (AS), we screened 13 million peptides derived from 43,241 bacterial proteins and identified candidate molecular mimics that were experimentally validated to activate T cells expressing an AS-associated TCR. Notably, a top validated peptide was enriched in patients with inflammatory bowel disease (IBD), suggesting potential shared microbial triggers between AS and IBD. Overall, StriMap provides a generalizable framework for rational immunotherapy design and for dissecting antigenic drivers of autoimmunity.
Deep learning-based analysis reveals patient-level proton radiation therapy trajectories using single-cell PBMC chromatin images
Cancer Research · 2026-04-03
Article
Introduction: The development of non-invasive, simple, and accurate methods to predict patient response to cancer therapy remains an open challenge. Proton radiation therapy (PRT) is increasingly used for hard-to-reach tumors or those in sensitive areas. However, it remains more expensive than other radiation therapies and while considered safer than conventional radiation therapy, its short- and long-term side effects are still not well explored. Therefore, developing an early measure for patient response is a critical research direction. Here we sought to test whether chromatin images of peripheral blood mononuclear cells (PBMCs) contain sufficient information to track patients’ trajectories during and after PRT.
Methods: We collected blood samples at five timepoints (before, during, at the end of, and twice after PRT) from 150 patients across various cancers including Central Nervous System and Head & Neck cancers, and 50 healthy volunteers. PBMCs were isolated, stained with DAPI to label the DNA, and imaged with a confocal microscope. We applied machine learning methods to single-cell crops of the PBMCs to: 1) classify healthy vs. cancer patients, 2) derive patient-level similarity-to-healthy scores, and 3) predict patient trajectories. To account for the possibility that variation in PBMC proportions might be a key difference between healthy and cancer patients, we adopted a multiple-instance learning (MIL) approach. MIL is a form of weakly-supervised learning that automatically discovers which features and which cells are important within a collection of cells.
Results: By comparing chromatin images of PBMCs from cancer patients and healthy volunteers using our MIL framework, we identified cancer-specific alterations in PBMC chromatin architecture induced by tumor-derived signals in the bloodstream. Longitudinal tracking across five time points revealed three distinct patient subgroups. Patients whose PBMC profiles shifted toward greater similarity to healthy volunteers after therapy were less likely to experience disease recurrence. Furthermore, our MIL framework enabled prediction of patients’ likelihood of returning to a healthy state after therapy, based solely on pre-treatment PBMC chromatin images, within the largest cancer type population in our study, Head & Neck cancer.
Conclusion: In summary, we demonstrated that simple chromatin images derived from liquid biopsies can serve as a non-invasive, easily obtained, and inexpensive biomarker for monitoring patient trajectories during PRT. This motivates further investigation of the use of PBMC chromatin images in the context of cancer screening and treatment monitoring, and more broadly in other disease contexts where PBMCs have been previously studied as potential biomarkers.
Citation Format: Hannah M. Schlüter, Trinadha Rao Sornapudi, Dominic Leiser, Sandra Koller, Zeynep Karavelioglu, Caroline Uhler, Damien Weber, G. V. Shivashankar. Deep learning-based analysis reveals patient-level proton radiation therapy trajectories using single-cell PBMC chromatin images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 5468.
Learning genetic perturbation effects with variational causal inference
PLoS Computational Biology · 2026-02-02
Article · Open access · Senior author
Advances in sequencing technologies have enhanced the understanding of gene regulation in cells. In particular, Perturb-seq has enabled high-resolution profiling of the transcriptomic response to genetic perturbations at the single-cell level. This understanding has implications in functional genomics and potentially for identifying therapeutic targets. Various computational models have been developed to predict perturbational effects. While deep learning models excel at interpolating observed perturbational data, they tend to overfit when data are scarce and may not generalize well to unseen perturbations. In contrast, mechanistic models, such as linear causal models based on gene regulatory networks, hold greater potential for extrapolation, as they encapsulate regulatory information that can predict responses to unseen perturbations. However, their application has been limited to small studies due to overly simplistic assumptions, making them less effective in handling noisy, large-scale single-cell data. We propose a hybrid approach that combines a mechanistic causal model with variational deep learning, termed Single Cell Causal Variational Autoencoder (SCCVAE). The mechanistic model employs a learned regulatory network to represent perturbational changes as shift interventions that propagate through the learned network. SCCVAE integrates this mechanistic causal model into a variational autoencoder, generating rich, comprehensive transcriptomic responses. Our results indicate that SCCVAE exhibits superior performance over current state-of-the-art baselines for extrapolating to predict unseen perturbational responses. Additionally, for the observed perturbations, the latent space learned by SCCVAE allows for the identification of functional perturbation modules and simulation of single-gene knockdown experiments of varying penetrance, presenting a robust tool for interpreting and interpolating perturbational responses at the single-cell level.
arXiv (Cornell University) · 2026-04-04
Article · Open access
Diffusion models have excelled at generative tasks for both continuous and token-based domains, but their application to discrete ordinal data remains underdeveloped. We present CountsDiff, a diffusion framework designed to natively model distributions on the natural numbers. CountsDiff extends the Blackout diffusion framework by simplifying its formulation through a direct parameterization in terms of a survival probability schedule and an explicit loss weighting. This introduces flexibility through design parameters with direct analogues in existing diffusion modeling frameworks. Beyond this reparameterization, CountsDiff introduces features from modern diffusion models, previously absent in counts-based domains, including continuous-time training, classifier-free guidance, and churn/remasking reverse dynamics that allow non-monotone reverse trajectories. We propose an initial instantiation of CountsDiff and validate it on natural image datasets (CIFAR-10, CelebA), exploring the effects of varying the introduced design parameters in a complex, well-studied, and interpretable data domain. We then highlight biological count assays as a natural use case, evaluating CountsDiff on single-cell RNA-seq imputation in a fetal cell and heart cell atlas. Remarkably, we find that even this simple instantiation matches or surpasses the performance of a state-of-the-art discrete generative model and leading RNA-seq imputation methods, while leaving substantial headroom for further gains through optimized design choices in future work.
Positivity in linear Gaussian structural equation models
Electronic Journal of Statistics · 2026-01-01
Article · Open access
We study a notion of positivity of Gaussian directed acyclic graphical models corresponding to a non-negativity constraint on the coefficients of the associated structural equation model. We prove that this constraint is equivalent to the distribution being conditionally increasing in sequence (CIS), a well-known subclass of positively associated random variables. These distributions require knowledge of a permutation, a CIS ordering, of the nodes for which the constraint of non-negativity holds. We provide an algorithm and prove in the noiseless setting that a CIS ordering can be recovered efficiently when it exists. We extend this result to the noisy setting and provide assumptions for recovering the CIS orderings. In addition, we provide a characterization of Markov equivalence for CIS DAG models. Further, we show that when a CIS ordering is known, the corresponding class of Gaussians lies in a family of distributions in which maximum likelihood estimation is a convex problem.
bioRxiv (Cold Spring Harbor Laboratory) · 2026-05-22
Article · Open access · Senior author · Corresponding author
Perturbations of genes with functional importance in T cells could be used to change the distribution of CD8 T cell states to enhance anti-tumor functions for cancer immunotherapies. We launched a worldwide computational challenge to predict the effects of gene perturbations and to devise objective functions for prioritizing gene perturbations that lead to desired T cell state distributions. We supported the challenge by generating a single-cell Perturb-seq dataset profiling the effect of knocking out 73 individual expert-defined genes in T cells transferred into a mouse melanoma model. We compared the top algorithms developed by participants and found that performance was primarily determined by the prior data used for gene feature representation, with features derived from perturbational data proving most effective. Experimental validation of the top 61 genes nominated by the algorithms revealed that perturbation of Ndufv2 and Dimt1 reached the defined objective and biased T cell differentiation toward desired states.
Latent Causal Diffusions for Single-Cell Perturbation Modeling
Europe PMC (PubMed Central) · 2026-01-20
Preprint · Open access · Senior author
Perturbation screens hold the potential to systematically map regulatory processes at single-cell resolution, yet modeling and predicting transcriptome-wide responses to perturbations remains a major computational challenge. Existing methods often underperform simple baselines, fail to disentangle measurement noise from biological signal, and provide limited insight into the causal structure governing cellular responses. Here, we present the latent causal diffusion (LCD), a generative model that frames single-cell gene expression as a stationary diffusion process observed under measurement noise. LCD outperforms established approaches in predicting the distributional shifts of unseen perturbation combinations in single-cell RNA-sequencing screens while simultaneously learning a mechanistic dynamical system of gene regulation. To interpret these learned dynamics, we develop an approach we call causal linearization via perturbation responses (CLIPR), which yields an approximation of the direct causal effects between all genes modeled by the diffusion. CLIPR provably identifies causal effects under a linear drift assumption and recovers causal structure in both simulated systems and a genome-wide perturbation screen, where it clusters genes into coherent functional modules and resolves causal relationships that standard differential expression analysis cannot. The LCD-CLIPR framework bridges generative modeling with causal inference to predict unseen perturbation effects and map the underlying regulatory mechanisms of the transcriptome.
Partially shared multi-modal embedding learns holistic representation of cell state
Repository for Publications and Research Data (ETH Zurich) · 2026-03-01
Other · Open access · Senior author
Current technologies enable the simultaneous measurement of diverse data types at the single-cell level. However, data are often processed separately, or integrated via representation learning methods that obscure the contributions of each data modality. Here we present a computational framework that automatically learns partial information sharing between multiple modalities by using an Autoencoder with a Partially Overlapping Latent space learned through Latent Optimization (APOLLO). We tested APOLLO on simulated data, and on four applications involving paired single-cell data: SHARE-seq (scRNA-seq and scATAC-seq), CITE-seq (scRNA-seq and protein abundance), and two multiplexed imaging datasets. APOLLO enables the prediction of missing modalities, such as unmeasured protein stains, and allows disentangling which modality or cellular compartment is linked with a specific phenotype, such as the variability in protein localization observed across single cells. Overall, APOLLO efficiently integrates diverse data modalities and, by retaining and distinguishing between shared and modality-specific information, provides a more interpretable and holistic view of cell state.
Nature Communications · 2026-03-12
Article · Open access · Senior author
T cell states are prognostic in different cancer types. Recent technologies enable joint profiling of T cell RNA and T cell receptor (TCR) sequences at single-cell resolution. Here we present the TCR-RNA Integrating Model (TRIM), a multi-modal variational autoencoder framework that integrates RNA-TCR data and predicts T cell clonality and transcriptional states. TRIM learns a shared representation of the data conditioned on patient, tissue source, and treatment timepoint. We applied TRIM to three independent datasets that included T cells collected before and after checkpoint inhibitor treatment, sourced either from blood and tumor biopsies in patients with head and neck squamous cell carcinoma and colorectal cancer, or from tumor and adjacent tissue in a pan-cancer dataset. In all settings, TRIM accurately predicted intra-tumor T cell clonal expansion and transcriptional status based on T cells from blood or normal tissue before treatment, demonstrating its utility in modeling multimodal T cell data and predicting T cell response to treatment and disease progression.
Editor's summary: The authors develop a TCR-RNA Integrating Model (TRIM) to predict the intra-tumor T cell signature after checkpoint inhibitor treatment by analyzing T cells from blood or tissues before treatment.
Recent grants
CAREER: Gaussian Graphical Models: Theory, Computation, and Applications
NSF · $400k · 2017–2023
Frequent coauthors
- G. V. Shivashankar (Paul Scherrer Institute) · 72 shared
- Adityanarayanan Radhakrishnan · 68 shared
- Bernd Sturmfels · 52 shared
- Piotr Zwiernik · 51 shared
- Chandler Squires · 49 shared
- Anastasiya Belyaeva · 30 shared
- Karren Yang (Apple, United States) · 28 shared
- Yuhao Wang (Nanjing Normal University) · 20 shared
Education
- 2011 · MOT, Management of Technology · University of California Berkeley
- 2011 · PhD, Statistics · University of California Berkeley
- 2006 · MSc, Mathematics · University of Zurich
Awards & honors
- Next year, she’ll deliver a sectional lecture to the Interna…