Aaron Roth
Professor · Verified · University of Pennsylvania · Computer and Information Science
Active 1999–2025
About
Aaron Roth is a professor whose research focuses on privacy, fairness, and security in computer science. He designs algorithms that protect user privacy while preserving the utility of the data, and he works on algorithmic fairness and security. His background is in theoretical computer science, with particular emphasis on the intersection of privacy and machine learning. His key contributions include advancing the theoretical foundations of differential privacy and exploring its applications in real-world systems, with the aim of practical solutions that balance data utility against privacy guarantees and address ethical concerns in algorithmic decision-making.
Research topics
- Computer Science
- Artificial Intelligence
- Machine Learning
- Sociology
- Algorithm
- Data Mining
- Political Science
- Operating system
- Pedagogy
- Business
- Psychology
- Economics
- Actuarial science
- Theoretical computer science
- Medicine
- Mathematics
- Demography
- Programming language
- Mathematics education
- Econometrics
- Database
Selected publications
Stronger Neyman Regret Guarantees for Adaptive Experimental Design
ArXiv.org · 2025-02-24
Preprint · Open access · Senior author
We study the design of adaptive, sequential experiments for unbiased average treatment effect (ATE) estimation in the design-based potential outcomes setting. Our goal is to develop adaptive designs offering sublinear Neyman regret, meaning their efficiency must approach that of the hindsight-optimal nonadaptive design. Recent work [Dai et al., 2023] introduced ClipOGD, the first method achieving $\widetilde{O}(\sqrt{T})$ expected Neyman regret under mild conditions. In this work, we propose adaptive designs with substantially stronger Neyman regret guarantees. In particular, we modify ClipOGD to obtain anytime $\widetilde{O}(\log T)$ Neyman regret under natural boundedness assumptions. Further, in the setting where experimental units have pre-treatment covariates, we introduce and study a class of contextual "multigroup" Neyman regret guarantees: Given any set of possibly overlapping groups based on the covariates, the adaptive design outperforms each group's best non-adaptive designs. In particular, we develop a contextual adaptive design with $\widetilde{O}(\sqrt{T})$ anytime multigroup Neyman regret. We empirically validate the proposed designs through an array of experiments.
Sample Efficient Omniprediction and Downstream Swap Regret for Non-Linear Losses
ArXiv.org · 2025-02-18
Preprint · Open access
We define "decision swap regret" which generalizes both prediction for downstream swap regret and omniprediction, and give algorithms for obtaining it for arbitrary multi-dimensional Lipschitz loss functions in online adversarial settings. We also give sample complexity bounds in the batch setting via an online-to-batch reduction. When applied to omniprediction, our algorithm gives the first polynomial sample-complexity bounds for Lipschitz loss functions; prior bounds either applied only to linear loss (or binary outcomes) or scaled exponentially with the error parameter even under the assumption that the loss functions were convex. When applied to prediction for downstream regret, we give the first algorithm capable of guaranteeing swap regret bounds for all downstream agents with non-linear loss functions over a multi-dimensional outcome space: prior work applied only to linear loss functions, modeling risk-neutral agents. Our general bounds scale exponentially with the dimension of the outcome space, but we give improved regret and sample complexity bounds for specific families of multidimensional functions of economic interest: constant elasticity of substitution (CES), Cobb-Douglas, and Leontief utility functions.
The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review
Journal of the American Statistical Association · 2025-06-02 · 2 citations
Article
Networked Information Aggregation via Machine Learning
ArXiv.org · 2025-07-13
Preprint · Open access
We study a distributed learning problem in which learning agents are embedded in a directed acyclic graph (DAG). There is a fixed and arbitrary distribution over feature/label pairs, and each agent or vertex in the graph is able to directly observe only a subset of the features, potentially a different subset for every agent. The agents learn sequentially in some order consistent with a topological sort of the DAG, committing to a model mapping observations to predictions of the real-valued label. Each agent observes the predictions of their parents in the DAG, and trains their model using both the features of the instance that they directly observe, and the predictions of their parents as additional features. We ask when this process is sufficient to achieve information aggregation, in the sense that some agent in the DAG is able to learn a model whose error is competitive with the best model that could have been learned (in some hypothesis class) with direct access to all features, despite the fact that no single agent in the network has such access. We give upper and lower bounds for this problem for both linear and general hypothesis classes. Our results identify the depth of the DAG as the key parameter: information aggregation can occur over sufficiently long paths in the DAG, assuming that all of the relevant features are well represented along the path, and there are distributions over which information aggregation cannot occur even in the linear case, and even in arbitrarily large DAGs that do not have sufficient depth (such as a hub-and-spokes topology in which the spoke vertices collectively see all the features). We complement our theoretical results with a comprehensive set of experiments.
ArXiv.org · 2025-02-04
Preprint · Open access
A fundamental question in data-driven decision making is how to quantify the uncertainty of predictions in ways that can usefully inform downstream action. This interface between prediction uncertainty and decision-making is especially important in risk-sensitive domains, such as medicine. In this paper, we develop decision-theoretic foundations that connect uncertainty quantification using prediction sets with risk-averse decision-making. Specifically, we answer three fundamental questions: (1) What is the correct notion of uncertainty quantification for risk-averse decision makers? We prove that prediction sets are optimal for decision makers who wish to optimize their value at risk. (2) What is the optimal policy that a risk-averse decision maker should use to map prediction sets to actions? We show that a simple max-min decision policy is optimal for risk-averse decision makers. Finally, (3) How can we derive prediction sets that are optimal for such decision makers? We provide an exact characterization in the population regime and a distribution-free finite-sample construction. Answering these questions naturally leads to an algorithm, Risk-Averse Calibration (RAC), which follows a provably optimal design for deriving action policies from predictions. RAC is designed to be both practical (capable of leveraging the quality of predictions in a black-box manner to enhance downstream utility) and safe (adhering to a user-defined risk threshold and optimizing the corresponding risk quantile of the user's downstream utility). Finally, we experimentally demonstrate the significant advantages of RAC in applications such as medical diagnosis and recommendation systems. Specifically, we show that RAC achieves a substantially improved trade-off between safety and utility, offering higher utility compared to existing methods while maintaining the safety guarantee.
Resolving the Reference Class Problem at Scale
Philosophy of Science · 2025-04-14 · 1 citation
Article · Open access · 1st author · Corresponding author
We draw a distinction between the traditional reference class problem, which describes an obstruction to estimating a single individual probability (and which we rename the individual reference class problem), and what we call the reference class problem at scale, which can result when using tools from statistics and machine learning to systematically make predictions about many individual probabilities simultaneously. We argue that scale actually helps to mitigate the reference class problem, and purely statistical tools can be used to efficiently minimize the reference class problem at scale, even though they cannot be used to solve the individual reference class problem.
Intersectional Fairness in Reinforcement Learning with Large State and Constraint Spaces
ArXiv.org · 2025-02-17
Preprint · Open access
In traditional reinforcement learning (RL), the learner aims to solve a single objective optimization problem: find the policy that maximizes expected reward. However, in many real-world settings, it is important to optimize over multiple objectives simultaneously. For example, when we are interested in fairness, states might have feature annotations corresponding to multiple (intersecting) demographic groups to whom reward accrues, and our goal might be to maximize the reward of the group receiving the minimal reward. In this work, we consider a multi-objective optimization problem in which each objective is defined by a state-based reweighting of a single scalar reward function. This generalizes the problem of maximizing the reward of the minimum reward group. We provide oracle-efficient algorithms to solve these multi-objective RL problems even when the number of objectives is exponentially large, for tabular MDPs as well as for large MDPs when the group functions have additional structure. Finally, we experimentally validate our theoretical results and demonstrate applications on a preferential attachment graph MDP.
The Value of Ambiguous Commitments in Multi-Follower Games
SSRN Electronic Journal · 2025-01-01
Preprint · Open access · Senior author
Replicable Reinforcement Learning with Linear Function Approximation
ArXiv.org · 2025-09-10
Preprint · Open access
Replication of experimental results has been a challenge faced by many scientific disciplines, including the field of machine learning. Recent work on the theory of machine learning has formalized replicability as the demand that an algorithm produce identical outcomes when executed twice on different samples from the same distribution. Provably replicable algorithms are especially interesting for reinforcement learning (RL), where algorithms are known to be unstable in practice. While replicable algorithms exist for tabular RL settings, extending these guarantees to more practical function approximation settings has remained an open problem. In this work, we make progress by developing replicable methods for linear function approximation in RL. We first introduce two efficient algorithms for replicable random design regression and uncentered covariance estimation, each of independent interest. We then leverage these tools to provide the first provably efficient replicable RL algorithms for linear Markov decision processes in both the generative model and episodic settings. Finally, we evaluate our algorithms experimentally and show how they can inspire more consistent neural policies.
Collaborative Prediction: Tractable Information Aggregation via Agreement
ArXiv.org · 2025-04-08
Preprint · Open access
We give efficient "collaboration protocols" through which two parties, who observe different features about the same instances, can interact to arrive at predictions that are more accurate than either could have obtained on their own. The parties only need to iteratively share and update their own label predictions, without either party ever having to share the actual features that they observe. Our protocols are efficient reductions to the problem of learning on each party's feature space alone, and so can be used even in settings in which each party's feature space is illegible to the other, which arises in models of human/AI interaction and in multi-modal learning. The communication requirements of our protocols are independent of the dimensionality of the data. In an online adversarial setting we show how to give regret bounds on the predictions that the parties arrive at with respect to a class of benchmark policies defined on the joint feature space of the two parties, despite the fact that neither party has access to this joint feature space. We also give simpler algorithms for the same task in the batch setting in which we assume that there is a fixed but unknown data distribution. We generalize our protocols to a decision-theoretic setting with high-dimensional outcome spaces, where parties communicate only "best response actions." Our theorems give a computationally and statistically tractable generalization of past work on information aggregation amongst Bayesians who share a common and correct prior, as part of a literature studying "agreement" in the style of Aumann's agreement theorem. Our results require no knowledge of (or even the existence of) a prior distribution and are computationally efficient. Nevertheless we show how to lift our theorems back to this classical Bayesian setting, and in doing so, give new information aggregation theorems for Bayesian agreement.
Recent grants
CAREER: Correctness-Performance Partitioned (CPP) Architectures
NSF · $400k · 2003–2009
FAI: Breaking the Tradeoff Barrier in Algorithmic Fairness
NSF · $393k · 2022–2025
CAREER: The Algorithmic Foundations of Data Privacy
NSF · $484k · 2013–2020
ICES: Large: Economic Foundations of Digital Privacy
NSF · $998k · 2011–2016
TWC: Medium: Distributed Differential Privacy
NSF · $1.2M · 2015–2021
Frequent coauthors
- Michael Kearns · 108 shared
- Zhiwei Steven Wu · 69 shared
- Seth Neel · 44 shared
- Jonathan Ullman · Northeastern University · 41 shared
- Katrina Ligett · Hebrew University of Jerusalem · 38 shared
- Mallesh M. Pai · 30 shared
- Jamie Morgenstern · University of Washington · 30 shared
- Sampath Kannan · 29 shared