David Gifford

Professor

Massachusetts Institute of Technology · Biological Engineering

Active 1977–2025

h-index: 81
Citations: 58.8k
Papers: 300 (43 in the last 5 years)
Funding: $28.0M (1 active)

About

David Gifford, PhD, is a Professor of Electrical Engineering and Computer Science, as well as a Professor of Biological Engineering at MIT. He received his BS from MIT in 1976 and his PhD from Stanford University in 1981. Since joining the MIT faculty in 1982, he has developed new machine learning techniques and algorithms to model transcriptional regulatory networks that control gene expression programs in living cells. His research group focuses on creating combined computational and experimental approaches to discover novel biology and human therapeutics, utilizing interpretable computational models trained and validated with experimental evidence. Gifford's work involves applying these models to problems in experiment design, developmental biology, gene regulation, immunology, genomics, and human therapeutics. His group evaluates models and uncovers new biology through multiplexed high-throughput experimental studies involving populations of cells and single cells. A key challenge addressed by his research is the incomplete knowledge of biological systems, leading to model uncertainty. His team actively develops uncertainty metrics for models to guide experiment design and improve model accuracy. His computational approaches incorporate large-scale linear and non-linear models, Bayesian methods, and deep learning. His current biological focus areas include motor neuron development, single-cell perturbation studies, chromatin accessibility regulation, the regulatory genome, antibody design, and peptide presentation by MHC proteins.

Research topics

  • Computational biology
  • Genetics
  • Biology
  • Computer Science
  • Artificial Intelligence
  • Evolutionary biology
  • Medicine
  • Virology

Selected publications

  • Deep mapping of the TCR-antigen interface using pMHC-pseudotyped viruses and yeast display

    bioRxiv (Cold Spring Harbor Laboratory) · 2025-08-27 · 1 citation

    preprint · Open access

    T cell receptor (TCR) specificity is central to the efficacy of T cell therapies, yet scalable methods to map how TCR sequences shape antigen recognition remain limited. To address this, we introduce VelociRAPTR, a library-on-library approach that combines yeast-displayed TCR libraries with pMHC-displaying virus-like particles (pMHC-VLPs) to rapidly screen millions of TCR-antigen interactions. We show that pMHC-VLPs efficiently bind TCRs on yeast and generate equivalent data to recombinantly produced pMHC protein. We then apply VelociRAPTR to screen 47 million variants of the A6 and 868 TCRs against 92 pMHCs simultaneously, mutating both the CDR3 loops and cognate peptides. The resulting CDR3-pMHC maps reveal biased recognition patterns, where mutations to CDR3 loops can selectively constrain or broaden specificity to peptide analogs. These insights provide a foundation for engineering TCRs with defined pMHC binding profiles and improving models that predict TCR-antigen interactions, including the prediction of off-target recognition. By coupling the scale of yeast display with the modularity of VLPs, VelociRAPTR offers a generalizable strategy for generating deep, high-throughput protein-protein interaction data.

  • CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods

    Genome biology · 2024-02-22 · 56 citations

    letter · Open access

    BACKGROUND: The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. RESULTS: Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. CONCLUSIONS: Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.

  • Training Data Attribution for Diffusion Models

    arXiv (Cornell University) · 2023-06-03 · 2 citations

    preprint · Open access · Senior author

    Diffusion models have become increasingly popular for synthesizing high-quality samples based on training datasets. However, given the oftentimes enormous sizes of the training datasets, it is difficult to assess how training data impact the samples produced by a trained diffusion model. The difficulty of relating diffusion model inputs and outputs poses significant challenges to model explainability and training data attribution. Here we propose a novel solution that reveals how training data influence the output of diffusion models through the use of ensembles. In our approach individual models in an encoded ensemble are trained on carefully engineered splits of the overall training data to permit the identification of influential training examples. The resulting model ensembles enable efficient ablation of training data influence, allowing us to assess the impact of training data on model outputs. We demonstrate the viability of these ensembles as generative models and the validity of our approach to assessing influence.

  • A pan-variant mRNA-LNP T cell vaccine protects HLA transgenic mice from mortality after infection with SARS-CoV-2 Beta

    Frontiers in Immunology · 2023-03-08 · 9 citations

    article · Open access · Senior author · Corresponding author

    Licensed COVID-19 vaccines ameliorate viral infection by inducing production of neutralizing antibodies that bind the SARS-CoV-2 Spike protein and inhibit viral cellular entry. However, the clinical effectiveness of these vaccines is transitory as viral variants escape antibody neutralization. Effective vaccines that solely rely upon a T cell response to combat SARS-CoV-2 infection could be transformational because they can utilize highly conserved short pan-variant peptide epitopes, but a mRNA-LNP T cell vaccine has not been shown to provide effective anti-SARS-CoV-2 prophylaxis. Here we show a mRNA-LNP vaccine (MIT-T-COVID) based on highly conserved short peptide epitopes activates CD8+ and CD4+ T cell responses that attenuate morbidity and prevent mortality in HLA-A*02:01 transgenic mice infected with SARS-CoV-2 Beta (B.1.351). We found CD8+ T cells in mice immunized with MIT-T-COVID vaccine significantly increased from 1.1% to 24.0% of total pulmonary nucleated cells prior to and at 7 days post infection (dpi), respectively, indicating dynamic recruitment of circulating specific T cells into the infected lungs. Mice immunized with MIT-T-COVID had 2.8 (2 dpi) and 3.3 (7 dpi) times more lung infiltrating CD8+ T cells than unimmunized mice. Mice immunized with MIT-T-COVID had 17.4 times more lung infiltrating CD4+ T cells than unimmunized mice (7 dpi). The undetectable specific antibody response in MIT-T-COVID-immunized mice demonstrates specific T cell responses alone can effectively attenuate the pathogenesis of SARS-CoV-2 infection. Our results suggest further study is merited for pan-variant T cell vaccines, including for individuals that cannot produce neutralizing antibodies or to help mitigate Long COVID.

  • Systematic elucidation of genetic mechanisms underlying cholesterol uptake

    bioRxiv (Cold Spring Harbor Laboratory) · 2023-01-10 · 2 citations

    preprint · Open access

    Genetic variation contributes greatly to LDL cholesterol (LDL-C) levels and coronary artery disease risk. By combining analysis of rare coding variants from the UK Biobank and genome-scale CRISPR-Cas9 knockout and activation screening, we have substantially improved the identification of genes whose disruption alters serum LDL-C levels. We identify 21 genes in which rare coding variants significantly alter LDL-C levels at least partially through altered LDL-C uptake. We use co-essentiality-based gene module analysis to show that dysfunction of the RAB10 vesicle transport pathway leads to hypercholesterolemia in humans and mice by impairing surface LDL receptor levels. Further, we demonstrate that loss of function of OTX2 leads to robust reduction in serum LDL-C levels in mice and humans by increasing cellular LDL-C uptake. Altogether, we present an integrated approach that improves our understanding of genetic regulators of LDL-C levels and provides a roadmap for further efforts to dissect complex human disease genetics.

  • Constrained Submodular Optimization for Vaccine Design

    Proceedings of the AAAI Conference on Artificial Intelligence · 2023-06-26 · 2 citations

    article · Open access · Senior author

    Advances in machine learning have enabled the prediction of immune system responses to prophylactic and therapeutic vaccines. However, the engineering task of designing vaccines remains a challenge. In particular, the genetic variability of the human immune system makes it difficult to design peptide vaccines that provide widespread immunity in vaccinated populations. We introduce a framework for evaluating and designing peptide vaccines that uses probabilistic machine learning models, and demonstrate its ability to produce designs for a SARS-CoV-2 vaccine that outperform previous designs. We provide a theoretical analysis of the approximability, scalability, and complexity of our framework.

  • Systematic elucidation of genetic mechanisms underlying cholesterol uptake

    Cell Genomics · 2023-04-21 · 16 citations

    article · Open access

    Genetic variation contributes greatly to LDL cholesterol (LDL-C) levels and coronary artery disease risk. By combining analysis of rare coding variants from the UK Biobank and genome-scale CRISPR-Cas9 knockout and activation screening, we substantially improve the identification of genes whose disruption alters serum LDL-C levels. We identify 21 genes in which rare coding variants significantly alter LDL-C levels at least partially through altered LDL-C uptake. We use co-essentiality-based gene module analysis to show that dysfunction of the RAB10 vesicle transport pathway leads to hypercholesterolemia in humans and mice by impairing surface LDL receptor levels. Further, we demonstrate that loss of function of OTX2 leads to robust reduction in serum LDL-C levels in mice and humans by increasing cellular LDL-C uptake. Altogether, we present an integrated approach that improves our understanding of the genetic regulators of LDL-C levels and provides a roadmap for further efforts to dissect complex human disease genetics.

  • Author response: A high-throughput yeast display approach to profile pathogen proteomes for MHC-II binding

    2022-07-03 · 1 citation

    peer-review · Open access

    Yeast surface-displayed libraries, when coupled with pooled oligonucleotide synthesis and next-generation sequencing, can be used as a platform to assess binding of whole viral proteomes to class II major histocompatibility complex proteins.

  • A high-throughput yeast display approach to profile pathogen proteomes for MHC-II binding

    bioRxiv (Cold Spring Harbor Laboratory) · 2022-02-24 · 1 citation

    preprint · Open access

    T cells play a critical role in the adaptive immune response, recognizing peptide antigens presented on the cell surface by Major Histocompatibility Complex (MHC) proteins. While assessing peptides for MHC binding is an important component of probing these interactions, traditional assays for testing peptides of interest for MHC binding are limited in throughput. Here we present a yeast display-based platform for assessing the binding of tens of thousands of user-defined peptides in a high throughput manner. We apply this approach to assess a tiled library covering the SARS-CoV-2 proteome and four dengue virus serotypes for binding to human class II MHCs, including HLA-DR401, -DR402, and -DR404. This approach identifies binders missed by computational prediction, highlighting the potential for systemic computational errors given even state-of-the-art training data, and underlines design considerations for epitope identification experiments. This platform serves as a framework for examining relationships between viral conservation and MHC binding, and can be used to identify potentially high-interest peptide binders from viral proteins. These results demonstrate the utility of this approach for determining high-confidence peptide-MHC binding.

  • A high-throughput yeast display approach to profile pathogen proteomes for MHC-II binding

    eLife · 2022-07-04 · 28 citations

    article · Open access

    T cells play a critical role in the adaptive immune response, recognizing peptide antigens presented on the cell surface by major histocompatibility complex (MHC) proteins. While assessing peptides for MHC binding is an important component of probing these interactions, traditional assays for testing peptides of interest for MHC binding are limited in throughput. Here, we present a yeast display-based platform for assessing the binding of tens of thousands of user-defined peptides in a high-throughput manner. We apply this approach to assess a tiled library covering the SARS-CoV-2 proteome and four dengue virus serotypes for binding to human class II MHCs, including HLA-DR401, -DR402, and -DR404. While the peptide datasets show broad agreement with previously described MHC-binding motifs, they additionally reveal experimentally validated computational false positives and false negatives. We therefore present this approach as able to complement current experimental datasets and computational predictions. Further, our yeast display approach underlines design considerations for epitope identification experiments and serves as a framework for examining relationships between viral conservation and MHC binding, which can be used to identify potentially high-interest peptide binders from viral proteins. These results demonstrate the utility of our approach to determine peptide-MHC binding interactions in a manner that can supplement and potentially enhance current algorithm-based approaches.
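The AAAI 2023 entry above describes vaccine design as a constrained submodular optimization problem. As a rough illustration of that general family of techniques, and not the paper's actual framework, the sketch below runs the classic greedy heuristic for a coverage objective under a size budget. The peptide names, the "population group" labels, and the function `greedy_max_coverage` are all hypothetical, and plain set coverage stands in for the probabilistic models the abstract mentions.

```python
from typing import Dict, List, Set


def greedy_max_coverage(candidates: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick at most `budget` candidates, each round taking the one that
    covers the most items not yet covered (greedy submodular maximization)."""
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(budget):
        best, best_gain = None, 0
        for name in sorted(candidates):  # sorted for deterministic tie-breaking
            if name in chosen:
                continue
            gain = len(candidates[name] - covered)  # marginal gain of adding `name`
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining candidate adds coverage
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen


# Hypothetical toy data: which population groups each candidate peptide covers.
peptides = {
    "pep_A": {"g1", "g2", "g3"},
    "pep_B": {"g3", "g4"},
    "pep_C": {"g4", "g5", "g6", "g7"},
    "pep_D": {"g1", "g7"},
}

selected = greedy_max_coverage(peptides, budget=2)
print(selected)  # ['pep_C', 'pep_A']
```

Coverage is monotone and submodular, so each added peptide helps less as the selection grows. For such objectives under a simple cardinality budget, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal coverage.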

Education

  • Ph.D.

    Stanford University

    1981
  • B.S.

    Massachusetts Institute of Technology

    1976

Awards & honors

  • Wishnok Prize
