
Sean Sinclair
Assistant Professor of Industrial Engineering and Management Sciences · Northwestern University · Chemical Engineering
Active 2017–2025
About
Sean Sinclair is an Assistant Professor of Industrial Engineering and Management Sciences at Northwestern University. His research focuses on developing algorithms for data-driven sequential decision-making with applications to societal systems. His work bridges algorithmic techniques in reinforcement learning with an operations management perspective, addressing challenges related to data uncertainty, model design, and multi-objective trade-offs. Sinclair's recent contributions include establishing instance-specific optimal regret guarantees for nonparametric reinforcement learning, designing Pareto-optimal fair resource allocation methods, and creating data-efficient algorithms for cloud compute allocation. He emphasizes empirical analysis by designing open-source tools to evaluate the multi-criteria performance of these algorithms, contributing to both theoretical advancements and practical applications in the field.
Research topics
- Computer Science
- Artificial Intelligence
- Mathematical optimization
- Mathematics
- Economics
- Machine Learning
- Microeconomics
- Social psychology
- Operations research
- Psychology
- Mathematical economics
- Algorithm
- Computer network
Selected publications
Network and Risk Analysis of Surety Bonds
ArXiv.org · 2025-11-07
preprint · Open access · Senior author
Surety bonds are financial agreements between a contractor (principal) and obligee (project owner) to complete a project. However, most large-scale projects involve multiple contractors, creating a network and introducing the possibility of incomplete obligations to propagate and result in project failures. Typical models for risk assessment assume independent failure probabilities within each contractor. However, we take a network approach, modeling the contractor network as a directed graph where nodes represent contractors and project owners and edges represent contractual obligations with associated financial records. To understand risk propagation throughout the contractor network, we extend the celebrated Friedkin-Johnsen model and introduce a stochastic process to simulate principal failures across the network. From a theoretical perspective, we show that under natural monotonicity conditions on the contractor network, incorporating network effects leads to increases in the average risk for the surety organization. We further use data from a partnering insurance company to validate our findings, estimating an approximately 2% higher exposure when accounting for network effects.
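For readers unfamiliar with the Friedkin-Johnsen model named in this abstract, the following is a minimal sketch of the standard update only, not the paper's extension or its stochastic failure process. Reading each node's value as a risk score is an illustrative assumption, and the names `friedkin_johnsen`, `weights`, `susceptibility`, and `innate` are hypothetical.

```python
def friedkin_johnsen(weights, susceptibility, innate, steps=200):
    """Iterate the standard Friedkin-Johnsen update.

    x_i <- lam_i * sum_j w_ij * x_j + (1 - lam_i) * s_i

    weights:        row-stochastic influence matrix w (list of lists)
    susceptibility: lam_i in [0, 1], how much node i listens to neighbours
    innate:         s_i, the node's own baseline value (assumed: a risk score)
    """
    n = len(innate)
    x = list(innate)
    for _ in range(steps):
        x = [
            susceptibility[i] * sum(weights[i][j] * x[j] for j in range(n))
            + (1.0 - susceptibility[i]) * innate[i]
            for i in range(n)
        ]
    return x


if __name__ == "__main__":
    # Two nodes that each put all neighbour weight on the other.
    print(friedkin_johnsen([[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5], [1.0, 0.0]))
```

In this two-node example the iteration settles where each node's value blends its own baseline with its neighbour's value, so a node with zero baseline still ends up with a positive score.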
Sequential Fair Allocation With Replenishments: A Little Envy Goes An Exponentially Long Way
ArXiv.org · 2025-08-29
preprint · Open access
We study the trade-off between envy and inefficiency in repeated resource allocation settings with stochastic replenishments, motivated by real-world systems such as food banks and medical supply chains. Specifically, we consider a model in which a decision-maker faced with stochastic demand and resource donations must trade off between an equitable and efficient allocation of resources over an infinite horizon. The decision-maker has access to storage with fixed capacity $M$, and incurs efficiency losses when storage is empty (stockouts) or full (overflows). We provide a nearly tight (up to constant factors) characterization of achievable envy-inefficiency pairs. Namely, we introduce a class of Bang-Bang control policies whose inefficiency exhibits a sharp phase transition, dropping from $Θ(1/M)$ when $Δ = 0$ to $e^{-Ω(ΔM)}$ when $Δ > 0$, where $Δ$ is used to denote the target envy of the policy. We complement this with matching lower bounds, demonstrating that the trade-off is driven by supply, as opposed to demand uncertainty. Our results demonstrate that envy-inefficiency trade-offs not only persist in settings with dynamic replenishment, but are shaped by the decision-maker's available capacity, and are therefore qualitatively different compared to previously studied settings with fixed supply.
Adaptivity, Structure, and Objectives in Sequential Decision-Making
ACM SIGMETRICS Performance Evaluation Review · 2024-01-03
article · 1st author · Corresponding
Sequential decision-making algorithms are ubiquitous in the design and optimization of large-scale systems due to their practical impact. The typical algorithmic paradigm ignores the sequential notion of these problems: use a historical dataset to predict future uncertainty and solve the resulting offline planning problem.
The Data-Driven Censored Newsvendor Problem
arXiv (Cornell University) · 2024-12-02
preprint · Open access · Senior author
We study a censored variant of the data-driven newsvendor problem, where the decision-maker must select an ordering quantity that minimizes expected overage and underage costs based only on offline censored sales data, rather than historical demand realizations. Our goal is to understand how the degree of historical demand censoring affects the performance of any learning algorithm for this problem. To isolate this impact, we adopt a distributionally robust optimization framework, evaluating policies according to their worst-case regret over an ambiguity set of distributions. This set is defined by the largest historical order quantity (the observable boundary of the dataset), and contains all distributions matching the true demand distribution up to this boundary, while allowing them to be arbitrary afterwards. We demonstrate a spectrum of achievability under demand censoring by deriving a natural necessary and sufficient condition under which vanishing regret is an achievable goal. In regimes in which it is not, we exactly characterize the information loss due to censoring: an insurmountable lower bound on the performance of any policy, even when the decision-maker has access to infinitely many demand samples. We then leverage these sharp characterizations to propose a natural robust algorithm that adapts to the historical level of demand censoring. We derive finite-sample guarantees for this algorithm across all possible censoring regimes and show its near-optimality with matching lower bounds (up to polylogarithmic factors). We moreover demonstrate its robust performance via extensive numerical experiments on both synthetic and real-world datasets.
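To make the censoring issue in this abstract concrete, here is a minimal sketch of the textbook uncensored baseline: sample average approximation with the critical-ratio quantile. It is not the paper's robust algorithm. The helper names (`newsvendor_cost`, `saa_order`, `censor`) and the demand numbers are hypothetical, and sales are assumed to equal the minimum of demand and the historical order quantity.

```python
import math


def newsvendor_cost(order, demand, underage, overage):
    """Cost of one period: lost sales below demand, leftover stock above it."""
    return underage * max(demand - order, 0.0) + overage * max(order - demand, 0.0)


def saa_order(samples, underage, overage):
    """Empirical quantile at the critical ratio underage / (underage + overage)."""
    ratio = underage / (underage + overage)
    ordered = sorted(samples)
    index = max(math.ceil(ratio * len(ordered)) - 1, 0)
    return ordered[index]


def censor(demands, past_order):
    """Observed sales when every historical order was past_order (assumption)."""
    return [min(d, past_order) for d in demands]


if __name__ == "__main__":
    demands = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
    print(saa_order(demands, underage=3, overage=1))              # 16
    print(saa_order(censor(demands, 12), underage=3, overage=1))  # 12
```

With full demand data the sketch orders 16 units, but with sales censored at a historical order of 12 it can never recommend more than 12: the data carry no information beyond the observable boundary.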
Multi-Objective LQR with Linear Scalarization
arXiv (Cornell University) · 2024-08-08
preprint · Open access · Senior author
The framework of decision-making, modeled as a Markov Decision Process (MDP), typically assumes a single objective. However, practical scenarios often involve tradeoffs between multiple objectives. We address this in the Linear Quadratic Regulator (LQR), a canonical continuous, infinite horizon MDP. First, we establish that the Pareto front for LQR is characterized by linear scalarization: a convex combination of objectives recovers all tradeoff points, making multi-objective LQR reducible to single-objective problems. This highlights an important instance where linear scalarization suffices for a non-convex problem. Second, we show the Pareto front is smooth, in that an $ε$ perturbation of a scalarization parameter yields an $ε$ approximation to the objective. These results inspire a simple algorithm to approximate the Pareto front via grid search over scalarization parameters, where each optimization problem retains the computational efficiency of single-objective LQR. Lastly, we extend the analysis to certainty equivalence, where unknown dynamics are replaced with estimates.
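The grid search over scalarization parameters described in this abstract can be sketched for a toy case. The sketch assumes a scalar, noise-free, discrete-time system x_{t+1} = a·x_t + b·u_t with two quadratic objectives (q_i·x² + r_i·u²); the function names and numbers are hypothetical and this is not the paper's implementation.

```python
def riccati_gain(a, b, q, r, iters=500):
    """Feedback gain k (policy u = -k x) from fixed-point Riccati iteration."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def policy_cost(a, b, q, r, k, x0=1.0):
    """Infinite-horizon cost of u = -k x started at x0 (noise-free dynamics)."""
    a_cl = a - b * k  # closed-loop multiplier; must be stable for a finite cost
    if abs(a_cl) >= 1.0:
        raise ValueError("closed loop is not stable")
    return (q + r * k * k) * x0 * x0 / (1.0 - a_cl * a_cl)


def pareto_front(a, b, obj1, obj2, grid):
    """For each weight w, solve the scalarized LQR and score both objectives."""
    front = []
    for w in grid:
        q = w * obj1[0] + (1.0 - w) * obj2[0]
        r = w * obj1[1] + (1.0 - w) * obj2[1]
        k = riccati_gain(a, b, q, r)
        front.append((w, policy_cost(a, b, *obj1, k), policy_cost(a, b, *obj2, k)))
    return front


if __name__ == "__main__":
    grid = [i / 10 for i in range(11)]
    # obj1 penalizes state heavily, obj2 penalizes control heavily (assumed values).
    for w, j1, j2 in pareto_front(0.9, 1.0, (1.0, 0.1), (0.1, 1.0), grid):
        print(f"w={w:.1f}  J1={j1:.4f}  J2={j2:.4f}")
```

Each grid point costs one single-objective Riccati solve, and sweeping the weight from 0 to 1 trades the second objective's cost for the first's.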
Online Fair Allocation of Perishable Resources
arXiv (Cornell University) · 2024-06-04
preprint · Open access · Senior author
We consider a practically motivated variant of the canonical online fair allocation problem: a decision-maker has a budget of perishable resources to allocate over a fixed number of rounds. Each round sees a random number of arrivals, and the decision-maker must commit to an allocation for these individuals before moving on to the next round. The goal is to construct a sequence of allocations that is envy-free and efficient. Our work makes two important contributions toward this problem: we first derive strong lower bounds on the optimal envy-efficiency trade-off, demonstrating that a decision-maker is fundamentally limited in what she can hope to achieve relative to the no-perishing setting; we then design an algorithm achieving these lower bounds which takes as input (i) a prediction of the perishing order, and (ii) a desired bound on envy. Given the remaining budget in each period, the algorithm uses forecasts of future demand and perishing to adaptively choose from one of two carefully constructed guardrail quantities. We demonstrate our algorithm's strong numerical performance, and state-of-the-art, perishing-agnostic algorithms' inefficacy, on simulations calibrated to a real-world dataset.
Exploiting Exogenous Structure for Sample-Efficient Reinforcement Learning
arXiv (Cornell University) · 2024-09-22
preprint · Open access
We study Exo-MDPs, a structured class of Markov Decision Processes (MDPs) where the state space is partitioned into exogenous and endogenous components. Exogenous states evolve stochastically, independent of the agent's actions, while endogenous states evolve deterministically based on both state components and actions. Exo-MDPs are useful for applications including inventory control, portfolio management, and ride-sharing. Our first result is structural, establishing a representational equivalence between the classes of discrete MDPs, Exo-MDPs, and discrete linear mixture MDPs. Specifically, any discrete MDP can be represented as an Exo-MDP, and the transition and reward dynamics can be written as linear functions of the exogenous state distribution, showing that Exo-MDPs are instances of linear mixture MDPs. For unobserved exogenous states, we prove a regret upper bound of $O(H^{3/2}d\sqrt{K})$ over $K$ trajectories of horizon $H$, with $d$ as the size of the exogenous state space, and establish nearly-matching lower bounds. Our findings demonstrate how Exo-MDPs decouple sample complexity from action and endogenous state sizes, and we validate our theoretical insights with experiments on inventory control.
Online Fair Allocation of Perishable Resources
2023-06-13 · 1 citation
article · Senior author
We consider a practically motivated variant of the canonical online fair allocation problem: a decision-maker has a budget of resources to allocate over a fixed number of rounds. Each round sees a random number of arrivals, and the decision-maker must commit to an allocation for these individuals before moving on to the next round. In contrast to prior work, we consider a setting in which resources are perishable and individuals' utilities are potentially non-linear (e.g., goods exhibit complementarities). The goal is to construct a sequence of allocations that is envy-free and efficient. We design an algorithm that takes as input (i) a prediction of the perishing order, and (ii) a desired bound on envy. Given the remaining budget in each period, the algorithm uses forecasts of future demand and perishing to adaptively choose one of two carefully constructed guardrail quantities. We characterize conditions under which our algorithm achieves the optimal envy-efficiency Pareto frontier. We moreover demonstrate its strong numerical performance using data from a partnering food bank.
Online Fair Allocation of Perishable Resources
ACM SIGMETRICS Performance Evaluation Review · 2023-06-26 · 7 citations
article · Senior author
We consider a practically motivated variant of the canonical online fair allocation problem: a decision-maker has a budget of resources to allocate over a fixed number of rounds. Each round sees a random number of arrivals, and the decision-maker must commit to an allocation for these individuals before moving on to the next round. In contrast to prior work, we consider a setting in which resources are perishable and individuals' utilities are potentially non-linear (e.g., goods exhibit complementarities). The goal is to construct a sequence of allocations that is envy-free and efficient. We design an algorithm that takes as input (i) a prediction of the perishing order, and (ii) a desired bound on envy. Given the remaining budget in each period, the algorithm uses forecasts of future demand and perishing to adaptively choose one of two carefully constructed guardrail quantities. We characterize conditions under which our algorithm achieves the optimal envy-efficiency Pareto frontier. We moreover demonstrate its strong numerical performance using data from a partnering food bank.
ACM SIGMETRICS Performance Evaluation Review · 2022-01-17 · 3 citations
article
Reinforcement learning (RL) has received widespread attention across multiple communities, but the experiments have focused primarily on large-scale game playing and robotics tasks. In this paper we introduce ORSuite, an open-source library containing environments, algorithms, and instrumentation for operational problems. Our package is designed to motivate researchers in the reinforcement learning community to develop and evaluate algorithms on operational tasks, and to consider the true multi-objective nature of these problems by considering metrics beyond cumulative reward.
Frequent coauthors
- Siddhartha Banerjee, Cornell University (19 shared)
- Christina Lee Yu (16 shared)
- Gauri Jain, Cornell University (6 shared)
- Chamsi Hssaine, University of Southern California (4 shared)
- Tianyu Wang (3 shared)
- Devavrat Shah (2 shared)
- Gabriel P. Langlois (2 shared)
- Morgan Craig, Université de Montréal (2 shared)
Education
- 2015: BSc. Honours Mathematics and Computer Science, Mathematics, McGill University