
Marco Pavone
Associate Professor of Aeronautics and Astronautics, Senior Fellow at the Precourt Institute for Energy, and Associate Professor, by courtesy, of Electrical Engineering and of Computer Science
Stanford University · Aeronautics and Astronautics
Active 1965–2026
About
Marco Pavone is an Associate Professor of Aeronautics and Astronautics at Stanford University. He is also a Senior Fellow at the Precourt Institute for Energy and holds courtesy appointments in Electrical Engineering and Computer Science. His research focuses on autonomous systems, energy, and control.
Research topics
- Computer Science
- Business
- Artificial Intelligence
- Mathematics
- Process management
- Quantum mechanics
- Systems engineering
- Engineering
- Physics
- Risk analysis (engineering)
- Statistical physics
- Transport engineering
- Operations management
- Mathematical optimization
- Software engineering
Selected publications
Ocean Engineering · 2026-05-18
Article · Open access
The draft IMO MASS Code requires autonomous and remotely supervised maritime vessels to detect departures from their operational design domain, enter a predefined fallback that notifies the operator, permit immediate human override, and avoid changing the voyage plan without approval. Meeting these obligations in the alert-to-takeover gap calls for a short-horizon, human-overridable fallback maneuver. Classical maritime autonomy stacks struggle when the correct action depends on meaning (e.g., diver-down flag means people in the water, fire close by means hazard). We argue (i) that vision-language models (VLMs) provide semantic awareness for such out-of-distribution situations, and (ii) that a fast-slow anomaly pipeline with a short-horizon, human-overridable fallback maneuver makes this practical in the handover window. We introduce Semantic Lookout, a camera-only, candidate-constrained VLM fallback maneuver selector that selects one cautious action (or station-keeping) from water-valid, world-anchored trajectories under continuous human authority. On 40 harbor scenes we measure per-call scene understanding and latency, alignment with human consensus (model majority-of-three voting), short-horizon risk-relief on fire hazard scenes, and an on-water alert -> fallback maneuver -> operator handover. Sub-10 s models retain most of the awareness of slower state-of-the-art models. The fallback maneuver selector outperforms geometry-only baselines and increases standoff distance on fire scenes. A field run verifies end-to-end operation. These results support VLMs as semantic fallback maneuver selectors compatible with the draft IMO MASS Code, within practical latency budgets, and motivate future work on domain-adapted, hybrid autonomy that pairs foundation-model semantics with multi-sensor bird's-eye-view perception and short-horizon replanning. Website: kimachristensen.github.io/bridge_policy
Agile Tradespace Exploration for Space Rendezvous Mission Design via Transformers
2026-03-07
Article · Open access
Spacecraft rendezvous enables on-orbit servicing, debris removal, and crewed docking, forming the foundation for a scalable space economy. Designing such missions requires rapid exploration of the tradespace between control cost and flight time across multiple candidate targets. However, multi-objective optimization in this setting is challenging, as the underlying constraints are often nonconvex, and mission designers must balance accuracy (e.g., solving the full problem) with efficiency (e.g., convex relaxations), slowing iteration and limiting design agility. To address these challenges, this paper proposes an AI-powered framework that enables agile and generalized rendezvous mission design. Given the orbital information of the target spacecraft, boundary conditions of the servicer, and a range of flight times, a transformer model generates a set of near-Pareto optimal trajectories across varying flight times in a single parallelized inference step, thereby enabling rapid mission trade studies. The model is further extended to accommodate variable flight times and perturbed orbital dynamics, supporting realistic multi-objective trade-offs. Validation on chance-constrained rendezvous problems in Earth orbits with passive safety constraints demonstrates that the model generalizes across both flight times and dynamics, consistently providing high-quality initial guesses that converge to superior solutions in fewer iterations. Moreover, the framework efficiently approximates the Pareto front, achieving runtimes comparable to convex relaxation by exploiting parallelized inference. Together, these results position the proposed framework as a practical surrogate for nonconvex trajectory generation and mark an important step toward AI-driven trajectory design for accelerating preliminary mission planning in real-world rendezvous applications.
Elevating Variational Quantum Semidefinite Programs for Polynomial Objectives
Quantum · 2026-04-21
Preprint · Open access
Many practically important NP-hard optimization problems are inherently higher-order polynomial optimizations, which are typically addressed using approximation algorithms. Classical relaxations express polynomial objectives over a polynomial basis and solve the resulting quadratic objective as a semidefinite program, which can significantly inflate problem size and degrade approximation behavior. Variational quantum analogues to classical semidefinite programs (vQSDPs) are near-term formulations geared towards quadratic objectives. We introduce Product-State Lifting (PSL), a simple product-register encoding that upgrades any vQSDP with basis-state encoding to tackle k-degree polynomial optimization. This upgrade requires only a linear increase in resources with constraints constant in k. As a worked example, we pair PSL with the recently-proposed vQSDP with the Hadamard test and approximate amplitude constraints [Quantum 7, 1057 (2023)], and outline an application to Max-kSAT. PSL maintains the device-friendly structure of vQSDPs while making polynomial degree a linear resource parameter, offering a general path from quadratic to polynomial optimization without the constraint growth typical of classical relaxations.
Robo-taxi Fleet Coordination at Scale via Reinforcement Learning
IEEE Transactions on Control of Network Systems · 2026-01-01
Article · Open access · Senior author
Fleets of robo-taxis offering on-demand transportation services, commonly known as Autonomous Mobility-on-Demand (AMoD) systems, hold significant promise for societal benefits, such as reducing pollution, energy consumption, and urban congestion. However, orchestrating these systems at scale remains a critical challenge, with existing coordination algorithms often failing to exploit the systems' full potential. This work introduces a novel decision-making framework that unites mathematical modeling with data-driven techniques. In particular, we present the AMoD coordination problem through the lens of reinforcement learning and propose a graph network-based framework that exploits the main strengths of graph representation learning, reinforcement learning, and classical operations research tools. Extensive evaluations across diverse simulation fidelities and scenarios demonstrate the flexibility of our approach, achieving superior system performance, computational efficiency, and generalizability compared to prior methods. Finally, motivated by the need to democratize research efforts in this area, we release publicly available benchmarks, datasets, and simulators for network-level coordination alongside an open-source codebase designed to provide accessible simulation platforms and establish a standardized validation process for comparing methodologies.
Trends in motion prediction toward deployable and generalizable autonomy: a revisit and perspectives
Foundations and Trends in Robotics · 2026-04-21
Article · Open access
Motion prediction, recently popularized under the term world models, refers to anticipating the future states of agents or the future evolution of a scene, which is rooted in human cognition to bridge perception and decision-making, enabling us to anticipate, adapt, and act within an everchanging world. It lies at the core of intelligent autonomous systems, such as robotics and self-driving cars, to safely operate in dynamic and human-robot-mixed environments, and also informs broader time-series challenges. With advances in methods, representations, and datasets, the field has seen rapid progress, reflected in rapidly updated benchmark performance. However, when state-of-the-art methods are deployed in the real world, they are often found to struggle to generalize to open-world settings and fall short of deployment standards. This reveals a gap between reality and benchmarks, which are often idealized or ill-posed, and fail to capture real-world complexity. To address the pressing need for problem settings that better reflect real-world challenges and guide future research, this paper focuses on revisiting the generalization and applicability of motion prediction models, with an emphasis on robotics, autonomous driving, and human motion applications. We first provide a comprehensive taxonomy of motion prediction methods, covering representations, modelling methods, application domains, and evaluation protocols. We then revisit two fundamental problems: 1) how to push motion prediction models to be deployable to realistic deployment standards, where motion prediction does not act in a vacuum, but functions as one module of closed-loop autonomy stacks – it takes input from the localization and perception, and informs downstream planning and control. 2) how to generalize motion prediction models from limited seen scenarios/datasets to the open-world settings. We conclude by highlighting crucial challenges and open problems for future research. By doing so, we aim to recalibrate the community’s efforts, fostering progress that is not only measurable but also meaningful for real-world applications. A project webpage accompanies this paper.
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
IEEE Open Journal of Intelligent Transportation Systems · 2026-01-01 · 8 citations
Article · Open access
Ensuring the safety of autonomous vehicles in real-world environments requires handling a wide spectrum of diverse and rare driving scenarios. Scenario-based testing addresses this need by offering a scalable and controlled approach to develop and validate autonomous driving systems. However, traditional scenario generation methods relying on rule-based logic, knowledge-driven models, or data-driven synthesis often yield limited diversity and unrealistic cases. With the emergence of foundation models, which represent a new generation of pre-trained, general-purpose Artificial Intelligence (AI) models, developers can process heterogeneous inputs (e.g., natural language, sensor data, maps, and control actions), enabling the synthesis, interpretation, and analysis of complex driving scenarios. In this paper, we review the use of foundation models for scenario generation and scenario analysis in autonomous driving. Our survey presents a unified taxonomy that includes large language models, vision language models, multimodal large language models, diffusion models, and world models for the generation and analysis of autonomous driving scenarios, outlining their fundamental principles, applications, and corresponding evaluation metrics. In addition, we review the methodologies, open-source datasets, simulation platforms, and benchmark challenges. Finally, the survey concludes by highlighting the open challenges, research questions, and promising future directions in applying foundation models to scenario generation and analysis in autonomous driving. All reviewed papers are listed in a continuously maintained repository, which is publicly available and updated with new research: GitHub.com/TUM-AVS/FM-for-Scenario-Generation-Analysis.
The Case for Negative Data: From Crash Reports to Counterfactuals for Reasonable Driving
ArXiv.org · 2025-09-23
Preprint · Open access · Senior author
Learning-based autonomous driving systems are trained mostly on incident-free data, offering little guidance near safety-performance boundaries. Real crash reports contain precisely the contrastive evidence needed, but they are hard to use: narratives are unstructured, third-person, and poorly grounded to sensor views. We address these challenges by normalizing crash narratives to ego-centric language and converting both logs and crashes into a unified scene-action representation suitable for retrieval. At decision time, our system adjudicates proposed actions by retrieving relevant precedents from this unified index; an agentic counterfactual extension proposes plausible alternatives, retrieves for each, and reasons across outcomes before deciding. On a nuScenes benchmark, precedent retrieval substantially improves calibration, with recall on contextually preferred actions rising from 24% to 53%. The counterfactual variant preserves these gains while sharpening decisions near risk.
Multi-Timescale Model Predictive Control for Slow-Fast Systems
ArXiv.org · 2025-11-18
Preprint · Open access · Senior author
Model Predictive Control (MPC) has established itself as the primary methodology for constrained control, enabling autonomy across diverse applications. While model fidelity is crucial in MPC, solving the corresponding optimization problem in real time remains challenging when combining long horizons with high-fidelity models that capture both short-term dynamics and long-term behavior. Motivated by results on the Exponential Decay of Sensitivities (EDS), which imply that, under certain conditions, the influence of modeling inaccuracies decreases exponentially along the prediction horizon, this paper proposes a multi-timescale MPC scheme for fast-sampled control. Tailored to systems with both fast and slow dynamics, the proposed approach improves computational efficiency by i) switching to a reduced model that captures only the slow, dominant dynamics and ii) exponentially increasing integration step sizes to progressively reduce model detail along the horizon. We evaluate the method on three practically motivated robotic control problems in simulation and observe speed-ups of up to an order of magnitude.
ArXiv.org · 2025-10-30
Preprint · Open access · Senior author
End-to-end architectures trained via imitation learning have advanced autonomous driving by scaling model size and data, yet performance remains brittle in safety-critical long-tail scenarios where supervision is sparse and causal understanding is limited. We introduce Alpamayo-R1 (AR1), a vision-language-action model (VLA) that integrates Chain of Causation reasoning with trajectory planning for complex driving scenarios. Our approach features three key innovations: (1) the Chain of Causation (CoC) dataset, built through a hybrid auto-labeling and human-in-the-loop pipeline producing decision-grounded, causally linked reasoning traces aligned with driving behaviors; (2) a modular VLA architecture combining Cosmos-Reason, a vision-language model pre-trained for Physical AI, with a diffusion-based trajectory decoder that generates dynamically feasible trajectories in real time; (3) a multi-stage training strategy using supervised fine-tuning to elicit reasoning and reinforcement learning (RL) to enforce reasoning-action consistency and optimize reasoning quality. AR1 achieves up to a 12% improvement in planning accuracy on challenging cases compared to a trajectory-only baseline, with a 35% reduction in close encounter rate in closed-loop simulation. RL post-training improves reasoning quality by 45% and reasoning-action consistency by 37%. Model scaling from 0.5B to 7B parameters shows consistent improvements. On-vehicle road tests confirm real-time performance (99 ms latency) and successful urban deployment. By bridging interpretable reasoning with precise control, AR1 demonstrates a practical path towards Level 4 autonomous driving. Model weights are available at https://huggingface.co/nvidia/Alpamayo-R1-10B with inference code at https://github.com/NVlabs/alpamayo.
Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
2025-10-19
Preprint · Open access
Gender bias in vision-language foundation models (VLMs) raises concerns about their safe deployment and is typically evaluated using benchmarks with gender annotations on real-world images. However, as these benchmarks often contain spurious correlations between gender and non-gender features, such as objects and backgrounds, we identify a critical oversight in gender bias evaluation: Do spurious features distort gender bias evaluation? To address this question, we systematically perturb non-gender features across four widely used benchmarks (COCO-gender, FACET, MIAP, and PHASE) and various VLMs to quantify their impact on bias evaluation. Our findings reveal that even minimal perturbations, such as masking just 10% of objects or weakly blurring backgrounds, can dramatically alter bias scores, shifting metrics by up to 175% in generative VLMs and 43% in CLIP variants. This suggests that current bias evaluations often reflect model responses to spurious features rather than gender bias, undermining their reliability. Since creating spurious feature-free benchmarks is fundamentally challenging, we recommend reporting bias metrics alongside feature-sensitivity measurements to enable a more reliable bias assessment.
Recent grants
NSF · $350k · 2019–2022
NRI: INT: COLLAB: Synergetic Drone Delivery Network in Metropolis
NSF · $287k · 2018–2022
NSF · $500k · 2015–2021
NSF · $300k · 2019–2022
Frequent coauthors
- Edward Schmerling · 127 shared
- Boris Ivanovic · 122 shared
- Emilio Frazzoli · 91 shared
- Federico Rossi (Jet Propulsion Laboratory) · 71 shared
- Riccardo Bonalli · 71 shared
- Karen Leung · 66 shared
- Thomas Lew · 62 shared
- Lucas Janson (Harvard University) · 49 shared
Awards & honors
- Presidential Early Career Award for Scientists and Engineers…
- Office of Naval Research Young Investigator Award
- National Science Foundation Early Career (CAREER) Award
- NASA Early Career Faculty Award
- Early-Career Spotlight Award from the Robotics Science and S…