Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Joshua S Agterberg

Assistant Professor · Verified

University of Illinois Urbana-Champaign · Statistics

Active 2019–2026

h-index: 5
Citations: 76
Papers: 30 (26 in the last 5 years)

About

Joshua S. Agterberg is an assistant professor in the Department of Statistics at the University of Illinois Urbana-Champaign. His research broadly focuses on the analysis of algorithms and statistical models for networks and structured matrix and tensor data. He completed his PhD in applied mathematics and statistics at Johns Hopkins University in February 2023 under the supervision of Professor Carey Priebe. Prior to his current position, he spent one year as a postdoctoral researcher at the University of Pennsylvania working with René Vidal and Yuxin Chen. Joshua graduated from the University of Wisconsin-Madison in 2017 with a Bachelor of Business Administration in actuarial science and mathematics, where he was advised by Margie Rosenberg. He is also a member of Math Alliance. Outside of academia, he enjoys trail and ultra running and lifting weights.

Research topics

  • Computer Science
  • Combinatorics
  • Mathematics
  • Pure mathematics
  • Applied mathematics
  • Statistics

Selected publications

  • Joshua Agterberg’s Discussion of ‘Statistical exploration of the manifold hypothesis’ by Whiteley et al.

    Journal of the Royal Statistical Society Series B (Statistical Methodology) · 2026-01-14

    article · 1st author · Corresponding
  • An Overview of Asymptotic Normality in Stochastic Blockmodels: Cluster Analysis and Inference

    Statistical Science · 2026-04-08 · 1 citation

    preprint · Open access · 1st author · Corresponding

    This paper provides a selective review of the statistical network analysis literature focused on clustering and inference problems for stochastic blockmodels and their variants. We survey asymptotic normality results for stochastic blockmodels as a means of thematically linking classical statistical concepts to contemporary research in network data analysis. Multiple different forms of asymptotically Gaussian behavior arise in stochastic blockmodels and are useful for different purposes, pertaining to estimation and testing, the characterization of cluster structure in community detection and understanding latent space geometry. This paper concludes with a discussion of open problems and ongoing research activities addressing asymptotic normality and its implications for statistical network modeling.

  • A High-Dimensional Statistical Theory for Convex and Nonconvex Matrix Sensing

    ArXiv.org · 2025-06-25

    preprint · Open access · 1st author · Corresponding

    The problem of matrix sensing, or trace regression, is a problem wherein one wishes to estimate a low-rank matrix from linear measurements perturbed with noise. A number of existing works have studied both convex and nonconvex approaches to this problem, establishing minimax error rates when the number of measurements is sufficiently large relative to the rank and dimension of the low-rank matrix, though a precise comparison of these procedures still remains unexplored. In this work we provide a high-dimensional statistical analysis for symmetric low-rank matrix sensing observed under Gaussian measurements and noise. Our main result describes a novel phenomenon: in this statistical model and in an appropriate asymptotic regime, the behavior of any local minimum of the nonconvex factorized approach (with known rank) is approximately equivalent to that of the matrix hard-thresholding of a corresponding matrix denoising problem, and the behavior of the convex nuclear-norm regularized least squares approach is approximately equivalent to that of matrix soft-thresholding of the same matrix denoising problem. Here "approximately equivalent" is understood in the sense of concentration of Lipschitz functions. As a consequence, the nonconvex procedure uniformly dominates the convex approach in mean squared error. Our arguments are based on a matrix operator generalization of the Convex Gaussian Min-Max Theorem (CGMT) together with studying the interplay between local minima of the convex and nonconvex formulations and their "debiased" counterparts, and several of these results may be of independent interest.

  • Joint Spectral Clustering in Multilayer Degree-Corrected Stochastic Blockmodels

    Journal of the American Statistical Association · 2025-06-09 · 5 citations

    article · 1st author
  • Concentration bounds on response-based vector embeddings of black-box generative models

    ArXiv.org · 2025-11-11

    preprint · Open access

    Generative models, such as large language models or text-to-image diffusion models, can generate relevant responses to user-given queries. Response-based vector embeddings of generative models facilitate statistical analysis and inference on a given collection of black-box generative models. The Data Kernel Perspective Space embedding is one particular method of obtaining response-based vector embeddings for a given set of generative models, already discussed in the literature. In this paper, under appropriate regularity conditions, we establish high probability concentration bounds on the sample vector embeddings for a given set of generative models, obtained through the method of Data Kernel Perspective Space embedding. Our results tell us the required number of sample responses needed in order to approximate the population-level vector embeddings with a desired level of accuracy. The algebraic tools used to establish our results can be used further for establishing concentration bounds on Classical Multidimensional Scaling embeddings in general, when the dissimilarities are observed with noise.

  • Nonconvex Linear System Identification with Minimal State Representation

    ArXiv.org · 2025-04-26

    preprint · Open access

    Low-order linear System IDentification (SysID) addresses the challenge of estimating the parameters of a linear dynamical system from finite samples of observations and control inputs with minimal state representation. Traditional approaches often utilize Hankel-rank minimization, which relies on convex relaxations that can require numerous, costly singular value decompositions (SVDs) to optimize. In this work, we propose two nonconvex reformulations to tackle low-order SysID: (i) Burer-Monteiro (BM) factorization of the Hankel matrix for efficient nuclear norm minimization, and (ii) optimizing directly over system parameters for real, diagonalizable systems with an atomic norm style decomposition. These reformulations circumvent the need for repeated heavy SVD computations, significantly improving computational efficiency. Moreover, we prove that optimizing directly over the system parameters yields lower statistical error rates, and lower sample complexities that do not scale linearly with trajectory length as in Hankel-nuclear norm minimization. Additionally, while our proposed formulations are nonconvex, we provide theoretical guarantees of achieving global optimality in polynomial time. Finally, we demonstrate algorithms that solve these nonconvex programs and validate our theoretical claims on synthetic data.

  • Statistical Inference for Low-Rank Tensors: Heteroskedasticity, Subgaussianity, and Applications

    arXiv (Cornell University) · 2024-10-08

    preprint · Open access · 1st author · Corresponding

    In this paper, we consider inference and uncertainty quantification for low Tucker rank tensors with additive noise in the high-dimensional regime. Focusing on the output of the higher-order orthogonal iteration (HOOI) algorithm, a commonly used algorithm for tensor singular value decomposition, we establish non-asymptotic distributional theory and study how to construct confidence regions and intervals for both the estimated singular vectors and the tensor entries in the presence of heteroskedastic subgaussian noise, which are further shown to be optimal for homoskedastic Gaussian noise. Furthermore, as a byproduct of our theoretical results, we establish the entrywise convergence of HOOI when initialized via diagonal deletion. To further illustrate the utility of our theoretical results, we then consider several concrete statistical inference tasks. First, in the tensor mixed-membership blockmodel, we consider a two-sample test for equality of membership profiles, and we propose a test statistic with consistency under local alternatives that exhibits a power improvement relative to the corresponding matrix test considered in several previous works. Next, we consider simultaneous inference for small collections of entries of the tensor, and we obtain consistent confidence regions. Finally, focusing on the particular case of testing whether entries of the tensor are equal, we propose a consistent test statistic that shows how index overlap results in different asymptotic standard deviations. All of our proposed procedures are fully data-driven, adaptive to noise distribution and signal strength, and do not rely on sample-splitting, and our main results highlight the effect of higher-order structures on estimation relative to the matrix setting. Our theoretical results are demonstrated through numerical simulations.

  • A Convex Relaxation Approach to Generalization Analysis for Parallel Positively Homogeneous Networks

    arXiv (Cornell University) · 2024-11-05

    preprint · Open access

    We propose a general framework for deriving generalization bounds for parallel positively homogeneous neural networks--a class of neural networks whose input-output map decomposes as the sum of positively homogeneous maps. Examples of such networks include matrix factorization and sensing, single-layer multi-head attention mechanisms, tensor factorization, deep linear and ReLU networks, and more. Our general framework is based on linking the non-convex empirical risk minimization (ERM) problem to a closely related convex optimization problem over prediction functions, which provides a global, achievable lower-bound to the ERM problem. We exploit this convex lower-bound to perform generalization analysis in the convex space while controlling the discrepancy between the convex model and its non-convex counterpart. We apply our general framework to a wide variety of models ranging from low-rank matrix sensing, to structured matrix sensing, two-layer linear networks, two-layer ReLU networks, and single-layer multi-head attention mechanisms, achieving generalization bounds with a sample complexity that scales almost linearly with the network width.

  • Estimating Higher-Order Mixed Memberships via the ℓ2,∞ Tensor Perturbation Bound

    Journal of the American Statistical Association · 2024-09-23 · 5 citations

    article · Open access · 1st author · Corresponding

    tensor perturbation bound for HOOI under independent, heteroskedastic, subgaussian noise that may be of independent interest. Our analysis uses a novel leave-one-out construction for the iterates, and our bounds depend only on spectral properties of the underlying low-rank tensor under nearly optimal signal-to-noise ratio conditions such that tensor SVD is computationally feasible. Finally, we apply our methodology to real and simulated data, demonstrating some effects not identifiable from the model with discrete community memberships.

  • Correcting a nonparametric two-sample graph hypothesis test for graphs with different numbers of vertices with applications to connectomics

    Applied Network Science · 2024-01-03 · 4 citations

    article · Open access

    Random graphs are statistical models that have many applications, ranging from neuroscience to social network analysis. Of particular interest in some applications is the problem of testing two random graphs for equality of generating distributions. Tang et al. (Bernoulli 23:1599–1630, 2017) propose a test for this setting. This test consists of embedding the graph into a low-dimensional space via the adjacency spectral embedding (ASE) and subsequently using a kernel two-sample test based on the maximum mean discrepancy. However, if the two graphs being compared have an unequal number of vertices, the test of Tang et al. (Bernoulli 23:1599–1630, 2017) may not be valid. We demonstrate the intuition behind this invalidity and propose a correction that makes any subsequent kernel- or distance-based test valid. Our method relies on sampling based on the asymptotic distribution for the ASE. We call these altered embeddings the corrected adjacency spectral embeddings (CASE). We also show that CASE remedies the exchangeability problem of the original test and demonstrate the validity and consistency of the test that uses CASE via a simulation study. Lastly, we apply our proposed test to the problem of determining equivalence of generating distributions in human connectomes extracted from diffusion magnetic resonance imaging at different scales.
