Zhou Fan

Associate Professor of Statistics & Data Science · Verified

Yale University · Department of Statistics and Data Science

Active 2001–2025

h-index: 14

Citations: 567

Papers: 105 (66 in the last 5 years)

Funding: $583k (1 active)


About

Zhou Fan is an Associate Professor in the Department of Statistics and Data Science at Yale University. His research interests lie at the intersection of mathematical statistics, probability theory, and computational algorithms. He focuses on inferential problems that arise in scientific applications, particularly in statistical genetics and computational biology. His work involves developing theoretical and computational methods to address complex statistical challenges in these fields.

Research topics

Computer Science
Algorithm
Mathematics
Sociology
Statistics
Theoretical computer science
Cartography
Combinatorics
Mathematical optimization
Engineering
World Wide Web
Applied mathematics
Civil engineering
Geography
Transport engineering

Selected publications

When does Gaussian equivalence fail and how to fix it: Non-universal behavior of random features with quadratic scaling
ArXiv.org · 2025-12-03
preprint · Open access
A major effort in modern high-dimensional statistics has been devoted to the analysis of linear predictors trained on nonlinear feature embeddings via empirical risk minimization (ERM). Gaussian equivalence theory (GET) has emerged as a powerful universality principle in this context: it states that the behavior of high-dimensional, complex features can be captured by Gaussian surrogates, which are more amenable to analysis. Despite its remarkable successes, numerical experiments show that this equivalence can fail even for simple embeddings -- such as polynomial maps -- under general scaling regimes. We investigate this breakdown in the setting of random feature (RF) models in the quadratic scaling regime, where both the number of features and the sample size grow quadratically with the data dimension. We show that when the target function depends on a low-dimensional projection of the data, such as generalized linear models, GET yields incorrect predictions. To capture the correct asymptotics, we introduce a Conditional Gaussian Equivalent (CGE) model, which can be viewed as appending a low-dimensional non-Gaussian component to an otherwise high-dimensional Gaussian model. This hybrid model retains the tractability of the Gaussian framework and accurately describes RF models in the quadratic scaling regime. We derive sharp asymptotics for the training and test errors in this setting, which continue to agree with numerical simulations even when GET fails. Our analysis combines general results on CLT for Wiener chaos expansions and a careful two-phase Lindeberg swapping argument. Beyond RF models and quadratic scaling, our work hints at a rich landscape of universality phenomena in high-dimensional ERM.
Asymptotic mutual information in quadratic estimation problems over compact groups
Information and Inference: A Journal of the IMA · 2025-06-27 · 2 citations
article · Senior author
Motivated by applications to group synchronization and quadratic assignment on random data, we study a general problem of Bayesian inference of an unknown ‘signal’ belonging to a high-dimensional compact group, given noisy pairwise observations of a featurization of this signal. We establish a quantitative comparison between the signal-observation mutual information in any such problem with that in a simpler model with linear observations, using interpolation methods. For group synchronization, our result proves a replica formula for the asymptotic mutual information and Bayes-optimal mean-squared-error. Via analyses of this replica formula, we show that the conjectural phase transition threshold for computationally efficient weak recovery of the signal is determined by a classification of the real-irreducible components of the observed group representation(s), and we fully characterize the information-theoretic limits of estimation in the example of angular/phase synchronization over $\mathbb{SO}(2)$/$\mathbb{U}(1)$. For quadratic assignment, we study observations given by a kernel matrix of pairwise similarities and a randomly permutated and noisy counterpart, and we show in a bounded signal-to-noise regime that the asymptotic mutual information coincides with that in a Bayesian spiked model with i.i.d. signal prior.
On Universality of Non-Separable Approximate Message Passing Algorithms
ArXiv.org · 2025-06-28
preprint · Open access · Senior author
Mean-field characterizations of first-order iterative algorithms -- including Approximate Message Passing (AMP), stochastic and proximal gradient descent, and Langevin diffusions -- have enabled a precise understanding of learning dynamics in many statistical applications. For algorithms whose non-linearities have a coordinate-separable form, it is known that such characterizations enjoy a degree of universality with respect to the underlying data distribution. However, mean-field characterizations of non-separable algorithm dynamics have largely remained restricted to i.i.d. Gaussian or rotationally-invariant data. In this work, we initiate a study of universality for non-separable AMP algorithms. We identify a general condition for AMP with polynomial non-linearities, in terms of a Bounded Composition Property (BCP) for their representing tensors, to admit a state evolution that holds universally for matrices with non-Gaussian entries. We then formalize a condition of BCP-approximability for Lipschitz AMP algorithms to enjoy a similar universal guarantee. We demonstrate that many common classes of non-separable non-linearities are BCP-approximable, including local denoisers, spectral denoisers for generic signals, and compositions of separable functions with generic linear maps, implying the universality of state evolution for AMP algorithms employing these non-linearities.
Approximate message passing for orthogonally invariant ensembles: multivariate non-linearities and spectral initialization
Information and Inference: A Journal of the IMA · 2024-07-01 · 8 citations
article · Open access · Senior author
We study a class of Approximate Message Passing (AMP) algorithms for symmetric and rectangular spiked random matrix models with orthogonally invariant noise. The AMP iterates have fixed dimension $K \geq 1$, a multivariate non-linearity is applied in each AMP iteration, and the algorithm is spectrally initialized with $K$ super-critical sample eigenvectors. We derive the forms of the Onsager debiasing coefficients and corresponding AMP state evolution, which depend on the free cumulants of the noise spectral distribution. This extends previous results for such models with $K=1$ and an independent initialization. Applying this approach to Bayesian principal components analysis, we introduce a Bayes-OAMP algorithm that uses as its non-linearity the posterior mean conditional on all preceding AMP iterates. We describe a practical implementation of this algorithm, where all debiasing and state evolution parameters are estimated from the observed data, and we illustrate the accuracy and stability of this approach in simulations.
Rates of estimation for high-dimensional multireference alignment
The Annals of Statistics · 2024-02-01 · 1 citation
article · Open access
We study the continuous multireference alignment model of estimating a periodic function on the circle from noisy and circularly-rotated observations. Motivated by analogous high-dimensional problems that arise in cryo-electron microscopy, we establish minimax rates for estimating generic signals that are explicit in the dimension $K$. In a high-noise regime with noise variance $\sigma^2 \gtrsim K$, for signals with Fourier coefficients of roughly uniform magnitude, the rate scales as $\sigma^6$ and has no further dependence on the dimension. This rate is achieved by a bispectrum inversion procedure, and our analyses provide new stability bounds for bispectrum inversion that may be of independent interest. In a low-noise regime where $\sigma^2 \lesssim K/\log K$, the rate scales instead as $K\sigma^2$, and we establish this rate by a sharp analysis of the maximum likelihood estimator that marginalizes over latent rotations. A complementary lower bound that interpolates between these two regimes is obtained using Assouad’s hypercube lemma. We extend these analyses also to signals whose Fourier coefficients have a slow power law decay.
Asymptotic mutual information in quadratic estimation problems over compact groups
arXiv (Cornell University) · 2024-04-15
preprint · Open access · Senior author
Motivated by applications to group synchronization and quadratic assignment on random data, we study a general problem of Bayesian inference of an unknown “signal” belonging to a high-dimensional compact group, given noisy pairwise observations of a featurization of this signal. We establish a quantitative comparison between the signal-observation mutual information in any such problem with that in a simpler model with linear observations, using interpolation methods. For group synchronization, our result proves a replica formula for the asymptotic mutual information and Bayes-optimal mean-squared-error. Via analyses of this replica formula, we show that the conjectural phase transition threshold for computationally-efficient weak recovery of the signal is determined by a classification of the real-irreducible components of the observed group representation(s), and we fully characterize the information-theoretic limits of estimation in the example of angular/phase synchronization over $SO(2)$/$U(1)$. For quadratic assignment, we study observations given by a kernel matrix of pairwise similarities and a randomly permutated and noisy counterpart, and we show in a bounded signal-to-noise regime that the asymptotic mutual information coincides with that in a Bayesian spiked model with i.i.d. signal prior.
Maximum likelihood for high-noise group orbit estimation and single-particle cryo-EM
The Annals of Statistics · 2024-02-01 · 5 citations
article · Open access · 1st author · Corresponding
Motivated by applications to single-particle cryo-electron microscopy (cryo-EM), we study several problems of function estimation in a high noise regime, where samples are observed after random rotation and possible linear projection of the function domain. We describe a stratification of the Fisher information eigenvalues according to transcendence degrees of graded pieces of the algebra of group invariants, and we relate critical points of the log-likelihood landscape to a sequence of moment optimization problems, extending previous results for a discrete rotation group without projections. We then compute the transcendence degrees and forms of these optimization problems for several examples of function estimation under SO(2) and SO(3) rotations, including a simplified model of cryo-EM as introduced by Bandeira, Blum-Smith, Kileel, Niles-Weed, Perry and Wein. We affirmatively resolve conjectures that third-order moments are sufficient to locally identify a generic signal up to its rotational orbit in these examples. For low-dimensional approximations of the electric potential maps of two small protein molecules, we empirically verify that the noise scalings of the Fisher information eigenvalues conform with our theoretical predictions over a range of SNR, in a model of SO(3) rotations without projections.
Universality of approximate message passing algorithms and tensor networks
The Annals of Applied Probability · 2024-08-01 · 16 citations
article · Senior author
Approximate message passing (AMP) algorithms provide a valuable tool for studying mean-field approximations and dynamics in a variety of applications. Although these algorithms are often first derived for matrices having independent Gaussian entries or satisfying rotational invariance in law, their state evolution characterizations are expected to hold over larger universality classes of random matrix ensembles. We develop several new results on AMP universality. For AMP algorithms tailored to independent Gaussian entries, we show that their state evolutions hold over broadly defined generalized Wigner and white noise ensembles, including matrices with heavy-tailed entries and heterogeneous entrywise variances that may arise in data applications. For AMP algorithms tailored to rotational invariance in law, we show that their state evolutions hold over delocalized sign-and-permutation-invariant matrix ensembles that have a limit distribution over the diagonal, including sensing matrices composed of subsampled Hadamard or Fourier transforms and diagonal operators. We establish these results via a simplified moment-method proof, reducing AMP universality to the study of products of random matrices and diagonal tensors along a tensor network. As a by-product of our analyses, we show that the aforementioned matrix ensembles satisfy a notion of asymptotic freeness with respect to such tensor networks, which parallels usual definitions of freeness for traces of matrix products.
Improving fine-mapping by modeling infinitesimal effects
Nature Genetics · 2023-11-30 · 65 citations
article · Open access · Corresponding
Random linear estimation with rotationally-invariant designs: Asymptotics at high temperature
2023-06-25 · 3 citations
article
We study estimation in the linear model $y = A\beta^\star + \epsilon$, in a Bayesian setting where $\beta^\star$ has an entrywise i.i.d. prior and the design $A$ is rotationally-invariant in law. In the large system limit as dimension and sample size increase proportionally, a set of related conjectures have been postulated for the asymptotic mutual information, Bayes-optimal mean squared error, and TAP mean-field equations that characterize the Bayes posterior mean of $\beta^\star$. In this work, we prove these conjectures for a general class of signal priors and for arbitrary rotationally-invariant designs $A$, under a "high-temperature" condition that restricts the range of eigenvalues of $A^\top A$. Our proof uses a conditional second-moment method argument, where we condition on the iterates of a version of the Vector AMP algorithm for solving the TAP mean-field equations.

Recent grants

CAREER: High-dimensional inference and applications to modern biology
NSF · $400k · 2022–2027
Non-Convex Landscapes and High-Dimensional Latent Variable Models
NSF · $183k · 2019–2022

Frequent coauthors

H B Wang
Shandong Provincial Hospital
48 shared
Lei Xu
Southwest University
28 shared
Haibo Wang
Shandong Institute of Metrology
24 shared
Jianfen Luo
Shandong University
20 shared
R J Wang
Shandong Provincial Hospital
20 shared
Xiuhua Chao
Shandong Provincial Hospital
18 shared
Mingming Wang
Second Hospital of Shandong University
14 shared
Yuechen Han
13 shared

Education

Ph.D., Statistics
Stanford University
2018
