Jeff Shamma
Professor and Jerry S. Dobrovolny Chair in ISE · University of Illinois Urbana-Champaign · Industrial and Enterprise Systems Engineering
Active 1986–2026
About
Jeff S. Shamma is the Department Head of Industrial and Enterprise Systems Engineering and the Jerry S. Dobrovolny Chair at the University of Illinois at Urbana-Champaign. He received his PhD in Systems Science and Engineering from MIT in 1988. Before his current position, he held faculty appointments at the King Abdullah University of Science and Technology (KAUST) and at the Georgia Institute of Technology, where he was the Julian T. Hightower Chair in Systems and Controls.
Shamma is a Fellow of both IEEE and IFAC. His awards include the IFAC High Impact Paper Award, the AACC Donald P. Eckman Award, and the NSF Young Investigator Award. He has served as a Distinguished Lecturer of the IEEE Control Systems Society and was Editor-in-Chief of the IEEE Transactions on Control of Network Systems from 2020 to 2024. He has been a plenary or semi-plenary speaker at international conferences such as NeurIPS, the World Congress of the Game Theory Society, the IEEE Conference on Decision and Control, and the American Control Conference.
His research focuses on decision and control, game theory, and multi-agent systems. It spans theoretical and applied aspects of control systems, with particular emphasis on learning dynamics in games, distributed control, and multi-agent reinforcement learning. His contributions include control-theoretic approaches to networked and autonomous systems and algorithms for distributed learning and decision-making in complex environments.
Research topics
- Computer Science
- Artificial Intelligence
- Engineering
- Electrical engineering
- Control engineering
- Computer network
- Telecommunications
- Geography
Selected publications
Convergence of Payoff-Based Higher-Order Replicator Dynamics in Contractive Games
ArXiv.org · 2026-03-18
article · Open access · Senior author
We study the convergence properties of a payoff-based higher-order version of replicator dynamics, a widely studied model in evolutionary dynamics and game-theoretic learning, in contractive games. Recent work has introduced a control-theoretic perspective for analyzing the convergence of learning dynamics through passivity theory, leading to a classification of learning dynamics based on the passivity notion they satisfy, such as δ-passivity, equilibrium-independent passivity, and incremental passivity. We leverage this framework for the study of higher-order replicator dynamics for contractive games, which form the complement of passive learning dynamics. Standard replicator dynamics can be represented as a cascade interconnection between an integrator and the softmax mapping. Payoff-based higher-order replicator dynamics include a linear time-invariant (LTI) system in parallel with the existing integrator. First, we show that if this added system is strictly passive and asymptotically stable, then the resulting learning dynamics converge locally to the Nash equilibrium in contractive games. Second, we establish global convergence properties using incremental stability analysis for the special case of symmetric matrix contractive games.
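The cascade representation mentioned in this abstract can be illustrated with a short sketch. It is not the paper's code: it simulates only the standard (first-order) replicator dynamics, written as an integrator z' = p followed by x = softmax(z), and it omits the added LTI system of the higher-order variant. The single-population matrix game, the payoff matrix `A` (rock-paper-scissors where a win pays 2 and a loss costs 1), the initial scores, and the forward-Euler step size are all assumptions chosen for illustration.

```python
import math

def softmax(z):
    """Map a score vector to a mixed strategy on the simplex."""
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def payoff(A, x):
    """Payoff vector p = A x for a single-population matrix game."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def replicator_via_softmax(A, z0, dt=0.01, steps=20000):
    """Forward-Euler simulation of z' = A x with x = softmax(z)."""
    z = list(z0)
    for _ in range(steps):
        x = softmax(z)                               # softmax stage
        p = payoff(A, x)                             # game feedback
        z = [zi + dt * pi for zi, pi in zip(z, p)]   # integrator stage
    return softmax(z)

# Hypothetical example game: A + A^T is negative definite on the
# tangent space of the simplex, so the game is contractive.
A = [[0.0, -1.0, 2.0],
     [2.0, 0.0, -1.0],
     [-1.0, 2.0, 0.0]]

x_final = replicator_via_softmax(A, z0=[1.0, 0.0, -1.0])
print([round(v, 3) for v in x_final])
```

Differentiating x = softmax(z) along z' = p gives x_i' = x_i (p_i - x·p), which is the usual replicator equation, so the two-stage loop above is one way to simulate it. In this example the strategy should approach the uniform Nash equilibrium.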
Passivity, No-Regret, and Convergent Learning in Contractive Games
ArXiv.org · 2025-03-28
preprint · Open access · Senior author
We investigate the interplay between passivity, no-regret, and convergence in contractive games for various learning dynamic models and their higher-order variants. Our setting is continuous time. Building on prior work for replicator dynamics, we show that if learning dynamics satisfy a passivity condition between the payoff vector and the difference between their evolving strategy and any fixed strategy, then they achieve finite regret. We then establish that the passivity condition holds for various learning dynamics and their higher-order variants. Consequently, the higher-order variants can achieve convergence to Nash equilibrium in cases where their standard-order counterparts cannot, while maintaining a finite regret property. We provide numerical examples to illustrate the lack of finite regret of different evolutionary dynamic models that violate the passivity property. We also examine the fragility of the finite regret property in the case of perturbed learning dynamics. Continuing with passivity, we establish another connection between finite regret and passivity, but with the related equilibrium-independent passivity property. Finally, we present a passivity-based classification of dynamic models according to the various passivity notions they satisfy, namely, incremental passivity, δ-passivity, and equilibrium-independent passivity. This passivity-based classification provides a framework to analyze the convergence of learning dynamic models in contractive games.
Passivity, No-Regret, and Convergent Learning in Contractive Games
2025-12-09 · 1 citations
article · Senior author
Distributed Dynamics and Stable Outcomes in Coalitional Games and B-Matchings
Dynamic Games and Applications · 2025-10-10 · 1 citations
article · Open access · Senior author
Cooperative games model strategic settings in which agents coordinate to form partnerships and share payoffs to achieve mutually beneficial outcomes. These settings range from pairwise matchings to larger groups forming coalitions. In this paper, we address two setups of cooperative games: transferable utility coalitional games and bipartite B-matchings. For both settings, we propose distributed dynamics where agents form and break partnerships according to evolving internal aspiration levels that reflect self-interest. Our distributed dynamics require simple computations, limited memory, and minimal knowledge of the environment. We prove that these dynamics converge to stable outcomes analogous to the core, where no group of agents has an incentive to deviate from the proposed partnerships. We illustrate our dynamics through computational experiments on exchange networks. The simulations exhibit resilient behavior of the algorithms under message drops between the agents and dynamic entry and exit of agents.
2025-06-10
article · Senior author
We propose a multi-agent model framework for crowd dynamics in pedestrian environments. The model integrates physical and psychological interactions among agents and with the environment, accounting for physical and social forces based on inter-agent and obstacle distances. It improves the physical characteristics of an agent by modeling the agent’s active and reactive torso rotation. These features capture an agent’s rotational and lateral movement induced by torque interactions between the agents and enable phenomena such as shouldering and navigating narrow corridors. The physical layer also includes a feedback control force, specifically Proportional-Integral (PI) feedback control, that enables the agent to track a desired velocity profile, reflecting directional intention and determination. The authority of this control force is set by upper limits on the allowable force magnitude, which are determined by each agent’s competitive index. The psychological layer models agent competitiveness, incorporating inter-agent interactions and responses to environmental hazards. The physical and psychological layers are coupled by the inter-agent distances and the psychological state, determining the feedback control authority on each agent’s pushing force and motion. To illustrate the novel behavior enabled by this model, we present extensive simulation scenarios highlighting the model’s layers and how the different parameters influence the crowd’s behavior.
Optimism as Risk-Seeking in Multi-Agent Reinforcement Learning
IEEE Control Systems Letters · 2025-12-17
article
Risk sensitivity has become a central theme in reinforcement learning (RL), where convex risk measures and robust formulations provide principled ways to model preferences beyond expected return. Recent extensions to multi-agent RL (MARL) have largely emphasized the risk-averse setting, prioritizing robustness to uncertainty. In cooperative MARL, however, such conservatism often leads to suboptimal equilibria, and a parallel line of work has shown that optimism can promote cooperation. Existing optimistic methods, though effective in practice, are typically heuristic and lack theoretical grounding. Building on the dual representation for convex risk measures, we propose a principled framework that interprets risk-seeking objectives as optimism. We introduce optimistic value functions, which formalize optimism as divergence-penalized risk-seeking evaluations. Building on this foundation, we derive a policy-gradient theorem for optimistic value functions, including explicit formulas for the entropic risk/KL-penalty setting, and develop decentralized optimistic actor-critic algorithms that implement these updates. Empirical results on cooperative benchmarks demonstrate that risk-seeking optimism consistently improves coordination over both risk-neutral baselines and heuristic optimistic methods. Our framework thus unifies risk-sensitive learning and optimism, offering a theoretically grounded and practically effective approach to cooperation in MARL.
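The entropic risk/KL-penalty pairing named in this abstract rests on a standard variational identity: for β > 0, (1/β) log E_P[exp(βX)] equals the supremum over distributions Q of E_Q[X] − (1/β) KL(Q‖P), attained at Q ∝ P·exp(βX). The sketch below checks that identity numerically on a small discrete distribution. It is not the paper's algorithm or its optimistic value function; the function names, the three-outcome return distribution, and the value of `beta` are hypothetical and chosen only for illustration.

```python
import math

def entropic_value(probs, returns, beta):
    """Risk-seeking entropic evaluation (1/beta) * log E_P[exp(beta * X)], beta > 0."""
    m = max(returns)  # log-sum-exp shift for numerical stability
    s = sum(p * math.exp(beta * (x - m)) for p, x in zip(probs, returns))
    return m + math.log(s) / beta

def tilted_distribution(probs, returns, beta):
    """Exponentially tilted distribution Q proportional to P * exp(beta * X)."""
    m = max(returns)
    w = [p * math.exp(beta * (x - m)) for p, x in zip(probs, returns)]
    total = sum(w)
    return [v / total for v in w]

def kl_penalized_value(q, probs, returns, beta):
    """Divergence-penalized evaluation E_Q[X] - (1/beta) * KL(Q || P)."""
    mean_q = sum(qi * x for qi, x in zip(q, returns))
    kl = sum(qi * math.log(qi / pi) for qi, pi in zip(q, probs) if qi > 0)
    return mean_q - kl / beta

# Hypothetical return distribution under the nominal model P.
probs = [0.5, 0.3, 0.2]
returns = [0.0, 1.0, 4.0]
beta = 0.5

mean_value = sum(p * x for p, x in zip(probs, returns))
optimistic = entropic_value(probs, returns, beta)
q_star = tilted_distribution(probs, returns, beta)
dual = kl_penalized_value(q_star, probs, returns, beta)

print(mean_value, optimistic, dual)
```

The risk-seeking value is never below the plain expectation, which is the sense in which it can be read as optimism: the tilted distribution shifts probability mass toward high-return outcomes, and the KL term limits how far it can move from the nominal model.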
Grand challenges in industrial and systems engineering
International Journal of Production Research · 2025-01-17 · 31 citations
article · Open access
Contemporary society faces a growing set of complex issues representing significant socioeconomic, health and well-being, environmental, and sustainability challenges. The discipline of industrial and systems engineering (ISE) can play an important role in addressing these issues. This paper identifies and discusses eight grand challenges for ISE. These grand challenges are (1) Artificial Intelligence (AI) For Business and Personal Use: Decision-Making and System Design and Operations, (2) Cybersecurity and Resilience, (3) Sustainability: Environment, Energy and Infrastructure, (4) Health Issues, (5) Social Issues, (6) Logistics and Supply Chain, (7) System Integration and Operations: Humans, Automation, and AI, and (8) Industrial and Systems Engineering Education. The discussed grand challenges were derived by accomplished ISE professionals who are the authors of this paper. The implications of the ISE grand challenges for education, training, research, and implementation of ISE principles and methodologies for the benefit of global society are discussed.
Hybrid Gradient-Based Policy Optimization for Sample-Efficient Policy Learning in Autonomous Systems
2025-07-08
article
This paper introduces HyGIPO, a novel gradient-based iterative policy optimization technique designed for efficient policy learning in autonomous systems, especially in the presence of modeling errors. Performance of control algorithms for autonomous systems is often limited by mismatches between a simplified nominal model and a complex real system. To address this degradation, HyGIPO leverages a hybrid gradient optimization approach, combining gradients of dynamics from a nominal model with real-world data to optimize control policies. We apply this method to the quadcopter waypoint tracking problem, with the controller parameterized by a neural network, demonstrating its effectiveness in both simulation and hardware experiments. In simulation, HyGIPO rapidly learns the policy within a hundred samples, showing orders of magnitude higher sample efficiency compared to reinforcement learning methods. The hardware experiments further validate the method, achieving successful tracking results in just tens of samples.
Higher-Order Uncoupled Learning Dynamics and Nash Equilibrium
ArXiv.org · 2025-06-12
preprint · Open access · Senior author
We study learnability of mixed-strategy Nash Equilibrium (NE) in general finite games using higher-order replicator dynamics as well as classes of higher-order uncoupled heterogeneous dynamics. In higher-order uncoupled learning dynamics, players have no access to utilities of opponents (uncoupled) but are allowed to use auxiliary states to further process information (higher-order). We establish a link between uncoupled learning and feedback stabilization with decentralized control. Using this association, we show that for any finite game with an isolated completely mixed-strategy NE, there exist higher-order uncoupled learning dynamics that lead (locally) to that NE. We further establish the lack of universality of learning dynamics by linking learning to the control theoretic concept of simultaneous stabilization. We construct two games such that any higher-order dynamics that learn the completely mixed-strategy NE of one of these games can never learn the completely mixed-strategy NE of the other. Next, motivated by imposing natural restrictions on allowable learning dynamics, we introduce the Asymptotic Best Response (ABR) property. Dynamics with the ABR property asymptotically learn a best response in environments that are asymptotically stationary. We show that the ABR property relates to an internal stability condition on higher-order learning dynamics. We provide conditions under which NE are compatible with the ABR property. Finally, we address learnability of mixed-strategy NE in the bandit setting using a bandit version of higher-order replicator dynamics.
Recent grants
Distributed Adaptive Systems: Feedback Control with Evolutionary Games
NSF · $240k · 2005–2007
Distributed Adaptive Systems: Feedback Control with Evolutionary Games
NSF · $197k · 2007–2009
Frequent coauthors
- Magnus Egerstedt · 70 shared
- Hao Zhang (Northeast Electric Power University) · 64 shared
- Thomas Parisini · 59 shared
- João P. Hespanha · 52 shared
- Maria Elena Valcher · 49 shared
- Luca Zaccarian · 44 shared
- Andrea Serrani · 39 shared
- Jing Sun (Shaanxi University of Chinese Medicine) · 37 shared
Awards & honors
- IFAC High Impact Paper Award (2020)
- Fellow, IFAC (2016)
- Mohammed Dahleh Distinguished Lecture Award (2013)
- Fellow, IEEE (2006)
- American Automatic Control Council Donald P. Eckman Award (1…