Adityanand Guntuboyina

Verified

University of California, Berkeley · Department of Statistics

Active 2008–2026

h-index: 17
Citations: 1.2k
Papers: 81 (21 in the last 5 years)
Funding: $658k

About

Adityanand Guntuboyina is a professor and Deputy Chair in the Department of Statistics at the University of California, Berkeley. His research covers nonparametric and high-dimensional statistics, shape-constrained estimation, empirical processes, and statistical information theory, with contributions to both the theoretical foundations and the applications of these areas.

Research topics

  • Mathematics
  • Applied mathematics
  • Combinatorics
  • Statistics
  • Mathematical optimization

Selected publications

  • What Functions Does XGBoost Learn?

    arXiv (Cornell University) · 2026-01-09

    preprint · Open access · Senior author

    This paper establishes a rigorous theoretical foundation for the function class implicitly learned by XGBoost, bridging the gap between its empirical success and our theoretical understanding. We introduce an infinite-dimensional function class $\mathcal{F}^{d, s}_{\infty-\text{ST}}$ that extends finite ensembles of bounded-depth regression trees, together with a complexity measure $V^{d, s}_{\infty-\text{XGB}}(\cdot)$ that generalizes the $L^1$ regularization penalty used in XGBoost. We show that every optimizer of the XGBoost objective is also an optimizer of an equivalent penalized regression problem over $\mathcal{F}^{d, s}_{\infty-\text{ST}}$ with penalty $V^{d, s}_{\infty-\text{XGB}}(\cdot)$, providing an interpretation of XGBoost as implicitly targeting a broader function class. We also develop a smoothness-based interpretation of $\mathcal{F}^{d, s}_{\infty-\text{ST}}$ and $V^{d, s}_{\infty-\text{XGB}}(\cdot)$ in terms of Hardy--Krause variation. We prove that the least squares estimator over $\{f \in \mathcal{F}^{d, s}_{\infty-\text{ST}}: V^{d, s}_{\infty-\text{XGB}}(f) \le V\}$ achieves a nearly minimax-optimal rate of convergence $n^{-2/3} (\log n)^{4(\min(s, d) - 1)/3}$, thereby avoiding the curse of dimensionality. Our results provide the first rigorous characterization of the function space underlying XGBoost, clarify its connection to classical notions of variation, and identify an important open problem: whether the XGBoost algorithm itself achieves minimax optimality over this class.

  • Totally Concave Regression

    Journal of the American Statistical Association · 2026-01-30

    article · Senior author · Corresponding

  • Totally Concave Regression

    Figshare · 2026-01-30

    dataset · Open access · Senior author

    Shape constraints in nonparametric regression provide a powerful framework for estimating regression functions under realistic assumptions without tuning parameters. However, most existing methods—except additive models—impose too weak restrictions, often leading to overfitting in high dimensions. Conversely, additive models can be too rigid, failing to capture covariate interactions. This article introduces a novel multivariate shape-constrained regression approach based on total concavity, originally studied by T. Popoviciu. Our method allows interactions while mitigating the curse of dimensionality, with convergence rates that depend only logarithmically on the number of covariates. We characterize and compute the least squares estimator over totally concave functions, derive theoretical guarantees, and demonstrate its practical effectiveness through empirical studies on real-world datasets. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.

  • Gaussian mixtures and non-parametric likelihoods through the lens of statistical mechanics

    ArXiv.org · 2026-03-24

    article · Open access

    In this work, we investigate Gaussian Mixture Models (GMM) and the related problem of nonparametric maximum likelihood estimation (NPMLE) from the perspective of statistical mechanics. In particular, we establish stability guarantees for the NPMLE procedure that extend well beyond the state of the art. Crucially, we obtain guarantees on the Kullback-Leibler divergence between NPMLE estimators and the ground truth, a type of result which has been known to be challenging in the literature on this problem. In particular, we provide high-probability upper bounds on the KL divergence between the NPMLE and the true density that are of the order of $\min\big\{\frac{(\log n)^{d+2}}{n} , \frac{\log n}{\sqrt n}\big\}$, which cover a wide range of scenarios for the comparative sizes of $n$ and $d$. We obtain similar guarantees for approximate solutions to the NPMLE problem, addressing realistic situations wherein optimization algorithms need to be stopped in finite time, allowing access only to approximations to the true NPMLE. A technical cornerstone of our approach is an analysis of the function class complexity of logarithms of Gaussian mixture densities, which is able to handle their unboundedness, and could be of wider interest. We also establish correspondences between stability phenomena in the NPMLE problem and concepts from chaos and multiple valleys in random energy landscapes of statistical mechanics models. We believe that these correspondences may be useful for a wide variety of random optimization problems in statistics and machine learning, especially the connections to the technical ingredients of concentration phenomena and Langevin dynamics for these models.

  • Totally Concave Regression

    arXiv (Cornell University) · 2025-01-08

    preprint · Open access · Senior author

    Shape constraints in nonparametric regression provide a powerful framework for estimating regression functions under realistic assumptions without tuning parameters. However, most existing methods–except additive models–impose too weak restrictions, often leading to overfitting in high dimensions. Conversely, additive models can be too rigid, failing to capture covariate interactions. This paper introduces a novel multivariate shape-constrained regression approach based on total concavity, originally studied by T. Popoviciu. Our method allows interactions while mitigating the curse of dimensionality, with convergence rates that depend only logarithmically on the number of covariates. We characterize and compute the least squares estimator over totally concave functions, derive theoretical guarantees, and demonstrate its practical effectiveness through empirical studies on real-world datasets.

  • Convergence rates for estimating multivariate scale mixtures of uniform densities

    Electronic Journal of Statistics · 2025-01-01

    article · Open access · Senior author

    The Grenander estimator is a well-studied procedure for univariate nonparametric density estimation. It is usually defined as the Maximum Likelihood Estimator (MLE) over the class of all non-increasing densities on the positive real line. It can also be seen as the MLE over the class of all scale mixtures of uniform densities. Using the latter viewpoint, Pavlides and Wellner [33] proposed a multivariate extension of the Grenander estimator as the nonparametric MLE over the class of all multivariate scale mixtures of uniform densities. We prove that this multivariate estimator achieves the univariate cube root rate of convergence with only a logarithmic multiplicative factor that depends on the dimension. The usual curse of dimensionality is therefore avoided to some extent for this multivariate estimator. This result positively resolves a conjecture of Pavlides and Wellner [33] under an additional lower bound assumption. Our proof proceeds via a general accuracy result for the Hellinger accuracy of MLEs over convex classes of densities. We also provide algorithms for computing the estimator, and illustrate performance on real and simulated datasets.

  • MARS via LASSO

    The Annals of Statistics · 2024-06-01 · 2 citations

    article · Senior author

    Multivariate adaptive regression splines (MARS) is a popular method for nonparametric regression introduced by Friedman in 1991. MARS fits simple nonlinear and non-additive functions to regression data. We propose and study a natural lasso variant of the MARS method. Our method is based on least squares estimation over a convex class of functions obtained by considering infinite-dimensional linear combinations of functions in the MARS basis and imposing a variation based complexity constraint. Our estimator can be computed via finite-dimensional convex optimization, although it is defined as a solution to an infinite-dimensional optimization problem. Under a few standard design assumptions, we prove that our estimator achieves a rate of convergence that depends only logarithmically on dimension and thus avoids the usual curse of dimensionality to some extent. We also show that our method is naturally connected to nonparametric estimation techniques based on smoothness constraints. We implement our method with a cross-validation scheme for the selection of the involved tuning parameter and compare it to the usual MARS method in various simulation and real data settings.
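
The Electronic Journal of Statistics abstract above describes the univariate Grenander estimator as the MLE over all non-increasing densities on the positive real line. A classical characterization is that this MLE equals the slope of the least concave majorant of the empirical CDF. The sketch below illustrates only that univariate case, not the multivariate estimator studied in the paper or the authors' own algorithms; the function name `grenander` and its return format are hypothetical choices made for this example.

```python
from typing import List, Tuple


def grenander(samples: List[float]) -> List[Tuple[float, float, float]]:
    """Univariate Grenander estimator (illustrative sketch).

    Returns a list of (left, right, density) pieces: the estimate is
    constant and equal to `density` on each interval (left, right].
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0 or xs[0] <= 0:
        raise ValueError("need at least one sample, all strictly positive")

    # Vertices of the empirical CDF, starting from the origin.
    points = [(0.0, 0.0)] + [(x, (i + 1) / n) for i, x in enumerate(xs)]

    # Upper convex hull (monotone chain) = least concave majorant.
    hull: List[Tuple[float, float]] = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # cross >= 0 means hull[-1] lies on or below the chord
            # from hull[-2] to the new point, so it is not a vertex.
            cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))

    # The density on each hull segment is the segment's slope.
    return [
        (a[0], b[0], (b[1] - a[1]) / (b[0] - a[0]))
        for a, b in zip(hull, hull[1:])
    ]


if __name__ == "__main__":
    for left, right, density in grenander([0.5, 0.1, 2.0, 0.3, 0.3, 1.2]):
        print(f"({left:.2f}, {right:.2f}]: {density:.4f}")
```

Because the hull is concave, the slopes come out non-increasing from left to right, and the pieces integrate to one since the hull runs from the origin to the largest sample at height one.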

Frequent coauthors

  • Bodhisattva Sen

    24 shared
  • Xi Chen

    Second Xiangya Hospital of Central South University

    11 shared
  • Arlene K. H. Kim

    Korea University

    11 shared
  • Yuchen Zhang

    Zhejiang University of Technology

    10 shared
  • Sujayam Saha

    Google (United States)

    8 shared
  • Sabyasachi Chatterjee

    8 shared
  • Billy Fang

    Google (United States)

    7 shared
  • Martin J. Wainwright

    Massachusetts Institute of Technology

    6 shared