Ali Mesbah

University of California, Berkeley · Department of Chemical and Biomolecular Engineering

Active 2001–2026

h-index: 32
Citations: 4.4k
Papers: 190 (110 in the last 5 years)
Funding: $1.5M

About

Ali Mesbah is an Associate Professor at UC Berkeley and the Principal Investigator of the Mesbah Lab. His research focuses on learning-based analysis and predictive control of uncertain systems. The lab works with advanced modeling and control techniques, including reinforcement learning, model predictive control, safe learning-enabled control, Bayesian optimization, and data-driven inverse design. Its research projects span applications such as low-temperature plasmas for nanomaterial synthesis, atomic layer etching on superconducting surfaces for quantum computing, and non-Markovian dynamical modeling for molecular systems. He leads a team of postdoctoral researchers, graduate students, and visiting scholars working on control engineering and applied mathematics in complex and uncertain environments.

Research signals

Five dimensions sourced from public faculty / publication signals.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Nanotechnology
  • Physics
  • Engineering
  • Astrobiology
  • Biotechnology
  • Systems engineering
  • Biochemical engineering
  • Biology
  • Mathematics
  • Materials science

Selected publications

  • The Separation Principle and the Dual-Certainty Equivalence Gap in Model Predictive Control

    arXiv (Cornell University) · 2026-04-07

    article · Open access

    Dual control addresses the trade-off between exploitation and exploration, where control inputs both regulate the system and generate informative data for estimation and identification. For certain problem classes, control and estimation can be designed independently without loss of optimality, a property known as the separation principle. However, in stochastic control problems with model uncertainty and constraints, this principle generally breaks down, and introduces the need for dual control. In this paper, we propose an information-weighted dual model predictive control (MPC) formulation and introduce metrics that quantify the dependence of the MPC policy on the uncertainty. We focus on parametric uncertainty in linear systems with Gaussian noise, though the metrics can be applied more broadly. Numerical results show that the dependence of the MPC policy on the posterior covariance is largest under high uncertainty and vanishes as the posterior covariance contracts, providing empirical evidence of the dual effect in closed loop. Moreover, the dual controller improves regulation performance and model accuracy compared to certainty-equivalent MPC.

  • Soft MPCritic: Amortized Model Predictive Value Iteration

    ArXiv.org · 2026-04-01

    article · Open access · Senior author

    Reinforcement learning (RL) and model predictive control (MPC) offer complementary strengths, yet combining them at scale remains computationally challenging. We propose soft MPCritic, an RL-MPC framework that learns in (soft) value space while using sample-based planning for both online control and value target generation. soft MPCritic instantiates MPC through model predictive path integral control (MPPI) and trains a terminal Q-function with fitted value iteration, aligning the learned value function with the planner and implicitly extending the effective planning horizon. We introduce an amortized warm-start strategy that recycles planned open-loop action sequences from online observations when computing batched MPPI-based value targets. This makes soft MPCritic computationally practical, while preserving solution quality. soft MPCritic plans in a scenario-based fashion with an ensemble of dynamic models trained for next-step prediction accuracy. Together, these ingredients enable soft MPCritic to learn effectively through robust, short-horizon planning on classic and complex control tasks. These results establish soft MPCritic as a practical and scalable blueprint for synthesizing MPC policies in settings where policy extraction and direct, long-horizon planning may fail.

  • Machine learning-based investigation of the relationship between plasma emission spectra and biological responses

    Journal of Physics D Applied Physics · 2026-01-02

    article · Open access

    Cold atmospheric plasma (CAP) has shown promising potential across biomedical applications. However, translating these effects into predictable and reproducible outcomes remains challenging due to device variability and the complex and unknown interplay of reactive species. This study evaluates the potential of machine learning (ML) to reliably predict CAP biological results including 24 h MTT and 48 h MTT from optical emission spectroscopy (OES) data. Data-driven models correlate plasma characteristics with cell viability and metabolic activity outcomes in human dermal fibroblasts. Diverse ML models were employed for their differing capabilities in feature extraction from OES data, essential for assessing the predictive capability of ML models for the biological effects of CAP from OES data. To evaluate cross plasma-device transferability without information leakage, models were tested on two distinct CAP jet systems. While the models achieved high accuracy for the primary jet used in training, their performance degraded considerably when applied to data from the secondary jet. To assess whether the loss of cross-device predictability could in principle be restored through minimal domain calibration, we conducted a controlled fine-tuning ablation on the pre-trained models. Results show that while spectra cannot serve as a device-independent predictor for the acute (24 h MTT) outcome, they retain partial transferability for the delayed (48 h MTT) response when minimal calibration is performed and sufficiently expressive models are employed to extract the underlying transferable structure. Moreover, the analysis identified the plasma source frequency as the most influential operational predictor, followed by voltage and treatment time. In identifying the gas-phase free radicals with the highest impact on biological outcomes, spectral fingerprints show the most influential species contributed in cell viability.

  • User Preference Meets Pareto-Optimality in Multi-Objective Bayesian Optimization

    ArXiv.org · 2025-02-10

    preprint · Open access

    Incorporating user preferences into multi-objective Bayesian optimization (MOBO) allows for personalization of the optimization procedure. Preferences are often abstracted in the form of an unknown utility function, estimated through pairwise comparisons of potential outcomes. However, utility-driven MOBO methods can yield solutions that are dominated by nearby solutions, as non-dominance is not enforced. Additionally, classical MOBO commonly relies on estimating the entire Pareto-front to identify the Pareto-optimal solutions, which can be expensive and ignore user preferences. Here, we present a new method, termed preference-utility-balanced MOBO (PUB-MOBO), that allows users to disambiguate between near-Pareto candidate solutions. PUB-MOBO combines utility-based MOBO with local multi-gradient descent to refine user-preferred solutions to be near-Pareto-optimal. To this end, we propose a novel preference-dominated utility function that concurrently preserves user-preferences and dominance amongst candidate solutions. A key advantage of PUB-MOBO is that the local search is restricted to a (small) region of the Pareto-front directed by user preferences, alleviating the need to estimate the entire Pareto-front. PUB-MOBO is tested on three synthetic benchmark problems: DTLZ1, DTLZ2 and DH1, as well as on three real-world problems: Vehicle Safety, Conceptual Marine Design, and Car Side Impact. PUB-MOBO consistently outperforms state-of-the-art competitors in terms of proximity to the Pareto-front and utility regret across all the problems.

  • A view on learning robust goal-conditioned value functions: Interplay between RL and MPC

    Annual Reviews in Control · 2025-01-01 · 1 citation

    article · Senior author · Corresponding author

  • A neural master equation framework for multiscale modeling of molecular processes: application to atomic-scale plasma processes

    npj Computational Materials · 2025-07-15 · 2 citations

    article · Open access · Senior author

    Plasma-surface interactions (PSI) play a crucial role in microelectronics fabrication; however, their multiscale nature and array of complex, often unknown interactions make computational modeling of PSIs extremely difficult. To this end, we propose a general neural master equation (NME) framework that uses master equations to describe the dynamics of a molecular process, wherein neural networks learned from atomistic simulations represent unknown transitions between different system states. By leveraging the physics-based structure of master equations and data-driven state transitions, the NME framework promotes generalizability and physics interpretability, and can bridge disparate length and time scales. The framework is demonstrated for multiscale modeling of Si atomic layer etching and reactive ion etching, where the learned NME-based surface kinetic models exhibit good predictive and extrapolative capabilities for predicting experimentally relevant observables as a function of process parameters. The NME-based surface kinetic models obey physical constraints, which are violated in models based on neural ordinary differential equations. The proposed NME framework for multiscale modeling of molecular processes can pave the way for the discovery of new chemistries and materials in atomic-scale plasma processes.

  • Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents

    ArXiv.org · 2025-07-17

    preprint · Open access · Senior author

    Training sophisticated agents for optimal decision-making under uncertainty has been key to the rapid development of modern autonomous systems across fields. Notably, model-free reinforcement learning (RL) has enabled decision-making agents to improve their performance directly through system interactions, with minimal prior knowledge about the system. Yet, model-free RL has generally relied on agents equipped with deep neural network function approximators, appealing to the networks' expressivity to capture the agent's policy and value function for complex systems. However, neural networks amplify the issues of sample inefficiency, unsafe learning, and limited interpretability in model-free RL. To this end, this work introduces model-based agents as a compelling alternative for control policy approximation, leveraging adaptable models of system dynamics, cost, and constraints for safe policy learning. These models can encode prior system knowledge to inform, constrain, and aid in explaining the agent's decisions, while deficiencies due to model mismatch can be remedied with model-free RL. We outline the benefits and challenges of learning model-based agents -- exemplified by model predictive control -- and detail the primary learning approaches: Bayesian optimization, policy search RL, and offline strategies, along with their respective strengths. While model-free RL has long been established, its interplay with model-based agents remains largely unexplored, motivating our perspective on their combined potentials for sample-efficient learning of safe and interpretable decision-making agents.

  • Nitrogen Fixation Optimization Strategy Based on a Tree‐Structured Parzen Estimator‐Multilayer Perceptron (TPE‐MLP) Model

    Plasma Processes and Polymers · 2025-06-29

    article

    Atmospheric‐pressure non‐thermal plasma has shown great potential in the field of nitrogen fixation (NF). In this study, a Magnetic Field Stabilized Glow Discharge (MSGD) device is employed, utilizing Lorentz forces generated by an external magnetic field to stabilize the plasma channel. It enhances gas utilization and reduces energy cost compared to traditional gliding arc discharges. To address the challenge of experimentally optimizing NF energy cost across numerous plasma parameters, a Multilayer Perceptron (MLP) model is trained to predict NOx concentration and NF energy cost. Hyperparameters are optimized using the tree‐structured Parzen estimator (TPE), and gradient analysis is conducted to evaluate feature importance. The results demonstrate that this approach enables accurate prediction and offers an effective strategy for optimizing plasma‐based NF processes.
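
The multi-objective Bayesian optimization abstract above turns on Pareto dominance: a candidate is dominated when another candidate is at least as good on every objective and strictly better on at least one. The sketch below illustrates only that standard definition, assuming every objective is minimized. It is not the PUB-MOBO method, and the names `dominates`, `non_dominated`, and `candidates` are hypothetical, chosen for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b`, assuming all objectives are minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (a brute-force O(n^2) filter)."""
    # A point never dominates itself, so comparing against the full list is safe.
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective candidates (illustrative values, not from any paper).
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = non_dominated(candidates)
print(front)  # (3.0, 3.0) is dropped because (2.0, 2.0) dominates it
```

The brute-force filter is fine for a handful of candidates. Estimating a full Pareto front this way scales poorly, which is part of why the abstract describes restricting the search to a small, preference-directed region.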

Education

  • PhD in Systems and Control

    Delft University of Technology

  • Senior Postdoctoral Associate

    Massachusetts Institute of Technology
