Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Tommi Jaakkola

Verified

Massachusetts Institute of Technology · Electrical Engineering & Computer Science

Active 1994–2026

h-index: 92
Citations: 40.4k
Papers: 475 (155 in the last 5 years)

About

Tommi Jaakkola is the Thomas M. Siebel Distinguished Professor at MIT, specializing in Artificial Intelligence and Decision-making within the Department of Electrical Engineering and Computer Science. His research areas include AI for Healthcare and Life Sciences, Artificial Intelligence and Machine Learning, and Natural Language and Speech Processing. He focuses on developing techniques for the analysis and synthesis of systems that interact with the external world through perception, communication, and action, while also learning, making decisions, and adapting to changing environments. His work involves leveraging computational, theoretical, and experimental tools to advance groundbreaking sensors, energy transducers, physical substrates for computation, and systems addressing shared human challenges.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Machine Learning
  • Computational biology
  • Biology
  • Chemistry
  • Genetics
  • Computer Security
  • Data Mining
  • Psychology
  • Cognitive science
  • Nanotechnology
  • Theoretical computer science
  • Bioinformatics
  • Evolutionary biology
  • Biochemistry
  • Materials science
  • Engineering
  • Microbiology
  • Data science
  • Mathematics
  • Physics

Selected publications

  • AI-based methods for simulating, sampling, and predicting protein ensembles

    Current Opinion in Structural Biology · 2026-04-02

    article · Senior author
  • Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation

    Open MIND · 2026-02-16

    preprint · Senior author

    Many generative tasks in chemistry and science involve distributions invariant to group symmetries (e.g., permutation and rotation). A common strategy enforces invariance and equivariance through architectural constraints such as equivariant denoisers and invariant priors. In this paper, we challenge this tradition through the alternative canonicalization perspective: first map each sample to an orbit representative with a canonical pose or order, train an unconstrained (non-equivariant) diffusion or flow model on the canonical slice, and finally recover the invariant distribution by sampling a random symmetry transform at generation time. Building on a formal quotient-space perspective, our work provides a comprehensive theory of canonical diffusion by proving: (i) the correctness, universality and superior expressivity of canonical generative models over invariant targets; (ii) canonicalization accelerates training by removing diffusion score complexity induced by group mixtures and reducing conditional variance in flow matching. We then show that aligned priors and optimal transport act complementarily with canonicalization and further improve training efficiency. We instantiate the framework for molecular graph generation under $S_n \times SE(3)$ symmetries. By leveraging geometric spectra-based canonicalization and mild positional encodings, canonical diffusion significantly outperforms equivariant baselines in 3D molecule generation tasks, with similar or even less computation. Moreover, with a novel architecture Canon, CanonFlow achieves state-of-the-art performance on the challenging GEOM-DRUG dataset, and the advantage remains large in few-step generation.

  • Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation

    ArXiv.org · 2026-02-16

    article · Open access · Senior author

  • PackFlow: Generative Molecular Crystal Structure Prediction via Reinforcement Learning Alignment

    ArXiv.org · 2026-01-01

    article · Open access

    Organic molecular crystals underpin technologies ranging from pharmaceuticals to organic electronics, yet predicting solid-state packing of molecules remains challenging because candidate generation is combinatorial and stability is only resolved after costly energy evaluations. Here we introduce PackFlow, a flow matching framework for molecular crystal structure prediction (CSP) that generates heavy-atom crystal proposals by jointly sampling Cartesian coordinates and unit-cell lattice parameters given a molecular graph. This lattice-aware generation interfaces directly with downstream relaxation and lattice-energy ranking, positioning PackFlow as a scalable proposal engine within standard CSP pipelines. To explicitly steer generation toward physically favourable regions, we propose physics alignment, a reinforcement learning post-training stage that uses machine-learned interatomic potential energies and forces as stability proxies. Physics alignment improves physical validity without altering inference-time sampling. We validate PackFlow's performance against heuristic baselines through two distinct evaluations. First, on a broad unseen set of molecular systems, we demonstrate superior candidate generation capability, with proposals exhibiting greater structural similarity to experimental polymorphs. Second, we assess the full end-to-end workflow on two unseen CSP blind-test case studies, including relaxation and lattice-energy analysis. In both settings, PackFlow outperforms heuristics-based methods by concentrating probability mass in low-energy basins, yielding candidates that relax into lower-energy minima and offering a practical route to amortize the relax-and-rank bottleneck.

  • FragmentFlow: Scalable Transition State Generation for Large Molecules

    Open MIND · 2026-02-02

    preprint · Senior author

    Transition states (TSs) are central to understanding and quantitatively predicting chemical reactivity and reaction mechanisms. Although traditional TS generation methods are computationally expensive, recent generative modeling approaches have enabled chemically meaningful TS prediction for relatively small molecules. However, these methods fail to generalize to practically relevant reaction substrates because of distribution shifts induced by increasing molecular sizes. Furthermore, TS geometries for larger molecules are not available at scale, making it infeasible to train generative models from scratch on such molecules. To address these challenges, we introduce FragmentFlow: a divide-and-conquer approach that trains a generative model to predict TS geometries for the reactive core atoms, which define the reaction mechanism. The full TS structure is then reconstructed by re-attaching substituent fragments to the predicted core. By operating on reactive cores, whose size and composition remain relatively invariant across molecular contexts, FragmentFlow mitigates distribution shifts in generative modeling. Evaluated on a new curated dataset of reactions involving reactants with up to 33 heavy atoms, FragmentFlow correctly identifies 90% of TSs while requiring 30% fewer saddle-point optimization steps than classical initialization schemes. These results point toward scalable TS generation for high-throughput reactivity studies.

  • Protein FID: improved evaluation of protein structure generative models

    Bioinformatics · 2026-04-01

    article · Open access

    MOTIVATION: Protein structure generative models have seen a recent surge of interest, but meaningfully evaluating them computationally is an active area of research. While current metrics have driven useful progress, they do not capture how well models sample the design space represented by the training data. We argue for a protein Frechet Inception Distance (FID) metric to supplement current evaluations with a measure of distributional similarity in a semantically meaningful latent space. RESULTS: Our FID behaves desirably under protein structure perturbations and correctly recapitulates similarities between protein samples: it correlates with optimal transport distances and recovers FoldSeek clusters and the CATH hierarchy. Evaluating current protein structure generative models with FID shows that they fall short of modeling the distribution of PDB proteins. AVAILABILITY: Code is available at: https://github.com/ffaltings/protfid.

  • Turbulence teaches equivariance to neural networks

    arXiv (Cornell University) · 2026-02-04

    article · Open access

    We investigate how the rotational nature of turbulence affects learned mappings between quantities governed by the Navier-Stokes equations. By varying the degree of anisotropy in a turbulence dataset, we explore how statistical symmetry affects these mappings. To do this, we train super-resolution models at different wall-normal locations in a channel flow, where anisotropy varies naturally, and test their generalization. By evaluating the learned mappings on new coordinate frames and new flow conditions, we find that coordinate-frame generalization is a key part of the generalization problem. Turbulent flows naturally present a wide range of local orientations, so respecting the symmetries of the Navier-Stokes equations improves generalization to new flows. Importantly, turbulence's rotational structure can embed these symmetries into learned mappings -- an effect that strengthens with isotropy and dataset size. This is because a more isotropic dataset samples a wider range of orientations, more fully covering the rotational symmetries of the Navier-Stokes equations. The dependence on isotropy means equivariance error is also scale-dependent, consistent with Kolmogorov's hypothesis. Therefore, turbulence provides its own data augmentation (we term this implicit data augmentation). We expect this effect to apply broadly to learned mappings between tensorial flow quantities, making it relevant to most machine learning applications in turbulence.

  • FragmentFlow: Scalable Transition State Generation for Large Molecules

    ArXiv.org · 2026-02-02

    article · Open access · Senior author

  • Zatom-1: Towards a Multimodal Foundation Model for 3D Molecules and Materials

    arXiv (Cornell University) · 2026-02-24

    preprint · Open access

    General-purpose 3D modeling in chemistry encompasses molecules and materials, requiring both generative and predictive capabilities. However, most existing AI approaches are optimized for a single domain (molecules or materials) and a single task (generation or prediction), which limits representation sharing and transfer. We introduce Zatom-1, a cross-domain, general-purpose model architecture that unifies generative and predictive learning of 3D molecules and materials. Zatom-1 is a deliberately simplified Transformer trained with a multimodal flow matching objective that jointly models discrete atom types and continuous 3D geometries. This approach supports scalable pretraining with predictable gains as model capacity increases, while enabling fast and stable sampling. We use cross-domain generative pretraining as a universal initialization for downstream multi-task prediction of properties, energies, and forces. Empirically, Zatom-1 outperforms or competes with specialized baselines on both multi-task generative and predictive benchmarks in data-controlled settings, while improving generative inference speed by more than an order of magnitude. Our experiments demonstrate positive predictive transfer between data domains from joint generative pretraining: modeling materials during generative pretraining improves molecular property prediction accuracy. Open-source code and model weights are freely available at https://github.com/Zatom-AI/zatom.

  • Turbulence teaches equivariance to neural networks

    Open MIND · 2026-02-04

    preprint

Labs

  • MIT EECS Artificial Intelligence + Decision-making Lab · PI
