Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Zhou Fan

Associate Professor of Statistics & Data Science · Verified

Yale University · Department of Statistics and Data Science

Active 2001–2025

h-index: 14
Citations: 567
Papers: 105 (66 in last 5y)
Funding: $583k (1 active)

About

Zhou Fan is an Associate Professor in the Department of Statistics and Data Science at Yale University. His research interests lie at the intersection of mathematical statistics, probability theory, and computational algorithms. He focuses on inferential problems that arise in scientific applications, particularly in statistical genetics and computational biology. His work involves developing theoretical and computational methods to address complex statistical challenges in these fields.

Research topics

  • Computer Science
  • Algorithm
  • Mathematics
  • Sociology
  • Statistics
  • Theoretical computer science
  • Cartography
  • Combinatorics
  • Mathematical optimization
  • Engineering
  • World Wide Web
  • Applied mathematics
  • Civil engineering
  • Geography
  • Transport engineering

Selected publications

  • When does Gaussian equivalence fail and how to fix it: Non-universal behavior of random features with quadratic scaling

    ArXiv.org · 2025-12-03

    preprint · Open access

    A major effort in modern high-dimensional statistics has been devoted to the analysis of linear predictors trained on nonlinear feature embeddings via empirical risk minimization (ERM). Gaussian equivalence theory (GET) has emerged as a powerful universality principle in this context: it states that the behavior of high-dimensional, complex features can be captured by Gaussian surrogates, which are more amenable to analysis. Despite its remarkable successes, numerical experiments show that this equivalence can fail even for simple embeddings -- such as polynomial maps -- under general scaling regimes. We investigate this breakdown in the setting of random feature (RF) models in the quadratic scaling regime, where both the number of features and the sample size grow quadratically with the data dimension. We show that when the target function depends on a low-dimensional projection of the data, such as generalized linear models, GET yields incorrect predictions. To capture the correct asymptotics, we introduce a Conditional Gaussian Equivalent (CGE) model, which can be viewed as appending a low-dimensional non-Gaussian component to an otherwise high-dimensional Gaussian model. This hybrid model retains the tractability of the Gaussian framework and accurately describes RF models in the quadratic scaling regime. We derive sharp asymptotics for the training and test errors in this setting, which continue to agree with numerical simulations even when GET fails. Our analysis combines general results on CLT for Wiener chaos expansions and a careful two-phase Lindeberg swapping argument. Beyond RF models and quadratic scaling, our work hints at a rich landscape of universality phenomena in high-dimensional ERM.

  • Asymptotic mutual information in quadratic estimation problems over compact groups

    Information and Inference: A Journal of the IMA · 2025-06-27 · 2 citations

    article · Senior author

    Motivated by applications to group synchronization and quadratic assignment on random data, we study a general problem of Bayesian inference of an unknown ‘signal’ belonging to a high-dimensional compact group, given noisy pairwise observations of a featurization of this signal. We establish a quantitative comparison between the signal-observation mutual information in any such problem with that in a simpler model with linear observations, using interpolation methods. For group synchronization, our result proves a replica formula for the asymptotic mutual information and Bayes-optimal mean-squared-error. Via analyses of this replica formula, we show that the conjectural phase transition threshold for computationally efficient weak recovery of the signal is determined by a classification of the real-irreducible components of the observed group representation(s), and we fully characterize the information-theoretic limits of estimation in the example of angular/phase synchronization over $\mathbb{SO}(2)$/$\mathbb{U}(1)$. For quadratic assignment, we study observations given by a kernel matrix of pairwise similarities and a randomly permutated and noisy counterpart, and we show in a bounded signal-to-noise regime that the asymptotic mutual information coincides with that in a Bayesian spiked model with i.i.d. signal prior.

  • On Universality of Non-Separable Approximate Message Passing Algorithms

    ArXiv.org · 2025-06-28

    preprint · Open access · Senior author

    Mean-field characterizations of first-order iterative algorithms -- including Approximate Message Passing (AMP), stochastic and proximal gradient descent, and Langevin diffusions -- have enabled a precise understanding of learning dynamics in many statistical applications. For algorithms whose non-linearities have a coordinate-separable form, it is known that such characterizations enjoy a degree of universality with respect to the underlying data distribution. However, mean-field characterizations of non-separable algorithm dynamics have largely remained restricted to i.i.d. Gaussian or rotationally-invariant data. In this work, we initiate a study of universality for non-separable AMP algorithms. We identify a general condition for AMP with polynomial non-linearities, in terms of a Bounded Composition Property (BCP) for their representing tensors, to admit a state evolution that holds universally for matrices with non-Gaussian entries. We then formalize a condition of BCP-approximability for Lipschitz AMP algorithms to enjoy a similar universal guarantee. We demonstrate that many common classes of non-separable non-linearities are BCP-approximable, including local denoisers, spectral denoisers for generic signals, and compositions of separable functions with generic linear maps, implying the universality of state evolution for AMP algorithms employing these non-linearities.

  • Approximate message passing for orthogonally invariant ensembles: multivariate non-linearities and spectral initialization

    Information and Inference: A Journal of the IMA · 2024-07-01 · 8 citations

    article · Open access · Senior author

    We study a class of Approximate Message Passing (AMP) algorithms for symmetric and rectangular spiked random matrix models with orthogonally invariant noise. The AMP iterates have fixed dimension $K \geq 1$, a multivariate non-linearity is applied in each AMP iteration, and the algorithm is spectrally initialized with $K$ super-critical sample eigenvectors. We derive the forms of the Onsager debiasing coefficients and corresponding AMP state evolution, which depend on the free cumulants of the noise spectral distribution. This extends previous results for such models with $K=1$ and an independent initialization. Applying this approach to Bayesian principal components analysis, we introduce a Bayes-OAMP algorithm that uses as its non-linearity the posterior mean conditional on all preceding AMP iterates. We describe a practical implementation of this algorithm, where all debiasing and state evolution parameters are estimated from the observed data, and we illustrate the accuracy and stability of this approach in simulations.

  • Rates of estimation for high-dimensional multireference alignment

    The Annals of Statistics · 2024-02-01 · 1 citation

    article · Open access

    We study the continuous multireference alignment model of estimating a periodic function on the circle from noisy and circularly-rotated observations. Motivated by analogous high-dimensional problems that arise in cryo-electron microscopy, we establish minimax rates for estimating generic signals that are explicit in the dimension $K$. In a high-noise regime with noise variance $\sigma^2 \gtrsim K$, for signals with Fourier coefficients of roughly uniform magnitude, the rate scales as $\sigma^6$ and has no further dependence on the dimension. This rate is achieved by a bispectrum inversion procedure, and our analyses provide new stability bounds for bispectrum inversion that may be of independent interest. In a low-noise regime where $\sigma^2 \lesssim K/\log K$, the rate scales instead as $K\sigma^2$, and we establish this rate by a sharp analysis of the maximum likelihood estimator that marginalizes over latent rotations. A complementary lower bound that interpolates between these two regimes is obtained using Assouad’s hypercube lemma. We extend these analyses also to signals whose Fourier coefficients have a slow power law decay.

  • Asymptotic mutual information in quadratic estimation problems over compact groups

    arXiv (Cornell University) · 2024-04-15

    preprint · Open access · Senior author

    Motivated by applications to group synchronization and quadratic assignment on random data, we study a general problem of Bayesian inference of an unknown "signal" belonging to a high-dimensional compact group, given noisy pairwise observations of a featurization of this signal. We establish a quantitative comparison between the signal-observation mutual information in any such problem with that in a simpler model with linear observations, using interpolation methods. For group synchronization, our result proves a replica formula for the asymptotic mutual information and Bayes-optimal mean-squared-error. Via analyses of this replica formula, we show that the conjectural phase transition threshold for computationally-efficient weak recovery of the signal is determined by a classification of the real-irreducible components of the observed group representation(s), and we fully characterize the information-theoretic limits of estimation in the example of angular/phase synchronization over $SO(2)$/$U(1)$. For quadratic assignment, we study observations given by a kernel matrix of pairwise similarities and a randomly permutated and noisy counterpart, and we show in a bounded signal-to-noise regime that the asymptotic mutual information coincides with that in a Bayesian spiked model with i.i.d. signal prior.

  • Maximum likelihood for high-noise group orbit estimation and single-particle cryo-EM

    The Annals of Statistics · 2024-02-01 · 5 citations

    article · Open access · 1st author · Corresponding

    Motivated by applications to single-particle cryo-electron microscopy (cryo-EM), we study several problems of function estimation in a high noise regime, where samples are observed after random rotation and possible linear projection of the function domain. We describe a stratification of the Fisher information eigenvalues according to transcendence degrees of graded pieces of the algebra of group invariants, and we relate critical points of the log-likelihood landscape to a sequence of moment optimization problems, extending previous results for a discrete rotation group without projections. We then compute the transcendence degrees and forms of these optimization problems for several examples of function estimation under SO(2) and SO(3) rotations, including a simplified model of cryo-EM as introduced by Bandeira, Blum-Smith, Kileel, Niles-Weed, Perry and Wein. We affirmatively resolve conjectures that third-order moments are sufficient to locally identify a generic signal up to its rotational orbit in these examples. For low-dimensional approximations of the electric potential maps of two small protein molecules, we empirically verify that the noise scalings of the Fisher information eigenvalues conform with our theoretical predictions over a range of SNR, in a model of SO(3) rotations without projections.

  • Universality of approximate message passing algorithms and tensor networks

    The Annals of Applied Probability · 2024-08-01 · 16 citations

    article · Senior author

    Approximate message passing (AMP) algorithms provide a valuable tool for studying mean-field approximations and dynamics in a variety of applications. Although these algorithms are often first derived for matrices having independent Gaussian entries or satisfying rotational invariance in law, their state evolution characterizations are expected to hold over larger universality classes of random matrix ensembles. We develop several new results on AMP universality. For AMP algorithms tailored to independent Gaussian entries, we show that their state evolutions hold over broadly defined generalized Wigner and white noise ensembles, including matrices with heavy-tailed entries and heterogeneous entrywise variances that may arise in data applications. For AMP algorithms tailored to rotational invariance in law, we show that their state evolutions hold over delocalized sign-and-permutation-invariant matrix ensembles that have a limit distribution over the diagonal, including sensing matrices composed of subsampled Hadamard or Fourier transforms and diagonal operators. We establish these results via a simplified moment-method proof, reducing AMP universality to the study of products of random matrices and diagonal tensors along a tensor network. As a by-product of our analyses, we show that the aforementioned matrix ensembles satisfy a notion of asymptotic freeness with respect to such tensor networks, which parallels usual definitions of freeness for traces of matrix products.

  • Improving fine-mapping by modeling infinitesimal effects

    Nature Genetics · 2023-11-30 · 65 citations

    article · Open access · Corresponding

  • Random linear estimation with rotationally-invariant designs: Asymptotics at high temperature

    2023-06-25 · 3 citations

    article

    We study estimation in the linear model $y = A\beta^\star + \epsilon$, in a Bayesian setting where $\beta^\star$ has an entrywise i.i.d. prior and the design $A$ is rotationally-invariant in law. In the large system limit as dimension and sample size increase proportionally, a set of related conjectures have been postulated for the asymptotic mutual information, Bayes-optimal mean squared error, and TAP mean-field equations that characterize the Bayes posterior mean of $\beta^\star$. In this work, we prove these conjectures for a general class of signal priors and for arbitrary rotationally-invariant designs $A$, under a "high-temperature" condition that restricts the range of eigenvalues of $A^\top A$. Our proof uses a conditional second-moment method argument, where we condition on the iterates of a version of the Vector AMP algorithm for solving the TAP mean-field equations.

Frequent coauthors

  • H B Wang

    Shandong Provincial Hospital

    48 shared
  • Lei Xu

    Southwest University

    28 shared
  • Haibo Wang

    Shandong Institute of Metrology

    24 shared
  • Jianfen Luo

    Shandong University

    20 shared
  • R J Wang

    Shandong Provincial Hospital

    20 shared
  • Xiuhua Chao

    Shandong Provincial Hospital

    18 shared
  • Mingming Wang

    Second Hospital of Shandong University

    14 shared
  • Yuechen Han

    13 shared

Education

  • Ph.D., Statistics

    Stanford University

    2018