Yihong Wu

James A. Attwood Professor of Statistics and Data Science · Verified

Yale University · Department of Statistics and Data Science

Active 2008–2025

h-index: 37
Citations: 5.0k
Papers: 185 (66 in the last 5 years)
Funding: $1.3M

About

Yihong Wu is the James A. Attwood Professor of Statistics and Data Science and serves as the Chair of the Department of Statistics and Data Science at Yale University. His research broadly focuses on the theoretical and algorithmic aspects of high-dimensional statistics, information theory, and optimization. Wu's work addresses fundamental problems in these areas, contributing to the understanding and development of statistical methods and algorithms that are applicable to complex, high-dimensional data settings.

Research topics

  • Mathematics
  • Statistics
  • Combinatorics
  • Mathematical optimization
  • Algorithm
  • Mathematical analysis
  • Computer Science
  • Theoretical computer science
  • Discrete mathematics
  • Physics
  • Applied mathematics
  • Quantum mechanics

Selected publications

  • Optimal empirical Bayes estimation for the Poisson model via minimum-distance methods

    Information and Inference: A Journal of the IMA · 2025-10-06 · 1 citation

    article · Senior author

    The Robbins estimator is the most iconic and widely used procedure in the empirical Bayes literature for the Poisson model. On the one hand, this method has been recently shown to be minimax optimal in terms of the regret (excess risk over the Bayesian oracle that knows the true prior) for various non-parametric classes of priors. On the other hand, it has been long recognized in practice that the Robbins estimator lacks the desired smoothness and monotonicity of Bayes estimators and can be easily derailed by those data points that were rarely observed before. Based on the minimum-distance method, we propose a suite of empirical Bayes estimators, including the classical non-parametric maximum likelihood, that outperform the Robbins method in a variety of synthetic and real data sets and retain its optimality in terms of minimax regret.

  • Random Graph Matching at Otter’s Threshold via Counting Chandeliers

    Operations Research · 2025-04-18

    article

    Network alignment or graph matching—figuring out how vertices across different networks correspond to each other—is a key challenge in many fields, from protecting online privacy to mapping biological data, improving computer vision, and even understanding languages. However, this problem falls into the class of notoriously difficult quadratic assignment problems, which are NP-hard to solve or approximate. Despite these challenges, researchers Mao, Wu, Xu, and Yu have made a major breakthrough. In their paper, “Random Graph Matching at Otter’s Threshold via Counting Chandeliers,” they introduce an innovative algorithm that can successfully match two random networks whenever the square of their edge correlation exceeds Otter’s constant (≈0.338). Their key innovation lies in counting chandeliers—specially designed tree-like structures—to identify corresponding vertices across the networks. The algorithm correctly matches nearly all vertices with high probability and even achieves perfect matching whenever the data allows. This is the first-ever polynomial-time algorithm capable of achieving perfect and near-perfect matching with an explicit constant correlation for both dense and sparse networks, bridging a long-standing gap between statistical limits and algorithmic performance.

  • Redundant Trees in Bipartite Graphs

    Mathematics · 2025-03-19 · 1 citation

    article · Open access

    It has been conjectured that for each positive integer k and each tree T with bipartition (Z1,Z2), every k-connected bipartite graph G with δ(G)≥k+max{|Z1|,|Z2|} admits a subgraph T′≅T such that G−V(T′) is still k-connected. In this paper, we generalize the ear decompositions of 2-connected graphs into a (k,ak)-extensible system for a general k-connected graph. As a result, we confirm the conjecture for k≤3 by proving a slightly stronger version of it.

  • The broken sample problem revisited: Proof of a conjecture by Bai-Hsing and high-dimensional extensions

    ArXiv.org · 2025-03-18

    preprint · Open access

    We revisit the classical broken sample problem: Two samples of i.i.d. data points $\mathbf{X}=\{X_1,\cdots, X_n\}$ and $\mathbf{Y}=\{Y_1,\cdots,Y_m\}$ are observed without correspondence with $m\leq n$. Under the null hypothesis, $\mathbf{X}$ and $\mathbf{Y}$ are independent. Under the alternative hypothesis, $\mathbf{Y}$ is correlated with a random subsample of $\mathbf{X}$, in the sense that $(X_{π(i)},Y_i)$'s are drawn independently from some bivariate distribution for some latent injection $π:[m] \to [n]$. Originally introduced by DeGroot, Feder, and Goel (1971) to model matching records in census data, this problem has recently gained renewed interest due to its applications in data de-anonymization, data integration, and target tracking. Despite extensive research over the past decades, determining the precise detection threshold has remained an open problem even for equal sample sizes ($m=n$). Assuming $m$ and $n$ grow proportionally, we show that the sharp threshold is given by a spectral and an $L_2$ condition of the likelihood ratio operator, resolving a conjecture of Bai and Hsing (2005) in the positive. These results are extended to high dimensions and settle the sharp detection thresholds for Gaussian and Bernoulli models.

  • On the Best Approximation by Finite Gaussian Mixtures

    IEEE Transactions on Information Theory · 2025-04-08 · 1 citation

    article

    We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy is determined within constant factors for the family of mixing distributions with compact support or appropriate assumptions on the tail probability including subgaussian and subexponential. While the upper bound is achieved using the technique of local moment matching, the lower bound is established by relating the best approximation error to the low-rank approximation of certain trigonometric moment matrices, followed by a refined spectral analysis of their minimum eigenvalue. In the case of Gaussian mixing distributions, this result corrects a previous lower bound in [Allerton Conference 48 (2010) 620-628].

  • The cardiac electrophysiology-inspired patches for repairing myocardial infarction: A review

    Smart Materials in Medicine · 2025-01-05 · 5 citations

    review · Open access

    Myocardial infarction has been a serious threat to human health due to its high morbidity and mortality all over the world. The major problem is the loss of cardiomyocytes with limited regenerative capacity and the occurrence of an inflammatory response, leading to the formation of non-contractile and non-conducting fibrotic scar tissue. This disrupts the mechano-electric coupling system of the heart and impairs heart function. Recently, conductive cardiac patches, which can reconstruct electrical propagation, have been extensively applied for cardiac repair. This review gives a detailed overview of recent progress in cardiac electrophysiology-inspired patches for cardiac repair in three parts: the construction and functionality of mechano-electric coupling cardiac patches, the construction and functionality of the microstructure of cardiac patches, and real-time detection based on mechano-electric transformation. Finally, the achievements and future prospects of conductive cardiac patches are discussed from the aspects of biosafety, further exploration of the factors affecting mechano-electric coupling in cardiac patches, and regulation of detection. The review aims to help researchers understand the functional components and development of conductive cardiac patches for cardiac repair, and to inspire them to synthesize novel cardiac patches that promote clinical translation.

  • Numerical simulation of the effects of particle density and size on particle distribution in a laboratory-scale curved open-channel flow

    Advances in Water Resources · 2025-08-18

    article

  • Improving Distributed Network Resilience with Energy Storage: An Optimal Planning Strategy Based on Subjective and Objective Weight Method

    Distributed Generation & Alternative Energy Journal · 2024-12-24

    article · Open access

    The integration of large-scale distributed photovoltaics (PVs) has improved the conventional resilience of distribution networks to a certain extent, but it has also made the power quality problems of distribution networks more prominent under steady-state operation. At the same time, the increase in the proportion of sensitive loads has also made the impact of voltage sag events increasingly serious, resulting in equipment damage and significant economic losses on the user side due to power quality problems even when the conventional resilience assessment results of distribution networks are high. Based on this, this paper proposes an optimal planning strategy for improving the resilience of distributed networks based on a subjective and objective weight method. Firstly, for the proposed resilience assessment indicators, the improved Analytic Hierarchy Process (AHP) is used to calculate the subjective weights of the indicators, and the entropy weight method is used to calculate the objective weights of the indicators. The optimal weight combining subjectivity and objectivity is then obtained. Secondly, by combining the proposed resilience and power quality indicators, a comprehensive resilience indicator objective function is established. Based on the second-order cone linearization method, a multi-objective energy storage (ES) optimization configuration model with the lowest daily operation cost and the optimal comprehensive resilience of the distribution network is established. Finally, based on IEEE 33 node simulation, the comparison of calculation examples shows that the proposed energy storage optimization configuration model can effectively reduce system economic costs, while improving the resilience and power quality level of the distribution network.

  • On the best approximation by finite Gaussian mixtures

    arXiv (Cornell University) · 2024-04-13

    preprint · Open access

    We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of mixing distributions with compact support or appropriate assumptions on the tail probability including subgaussian and subexponential. While the upper bound is achieved using the technique of local moment matching, the lower bound is established by relating the best approximation error to the low-rank approximation of certain trigonometric moment matrices, followed by a refined spectral analysis of their minimum eigenvalue. In the case of Gaussian mixing distributions, this result corrects a previous lower bound in [Allerton Conference 48 (2010) 620-628].

  • Information Theory

    Cambridge University Press eBooks · 2024-12-31 · 63 citations

    book · Senior author

    This enthusiastic introduction to the fundamentals of information theory builds from classical Shannon theory through to modern applications in statistical learning, equipping students with a uniquely well-rounded and rigorous foundation for further study. Introduces core topics such as data compression, channel coding, and rate-distortion theory using a unique finite block-length approach. With over 210 end-of-part exercises and numerous examples, students are introduced to contemporary applications in statistics, machine learning and modern communication theory. This textbook presents information-theoretic methods with applications in statistical learning and computer science, such as f-divergences, PAC Bayes and variational principle, Kolmogorov's metric entropy, strong data processing inequalities, and entropic upper bounds for statistical estimation. Accompanied by a solutions manual for instructors, and additional standalone chapters on more specialized topics in information theory, this is the ideal introductory textbook for senior undergraduate and graduate students in electrical engineering, statistics, and computer science.
