
Ilias Zadik
Assistant Professor of Statistics and Data Science · Yale University · Department of Statistics and Data Science
Active 2012–2026
Research topics
- Artificial Intelligence
- Physics
- Combinatorics
- Computer Science
- Mathematics
- Algorithm
- Pure mathematics
- Statistics
- Quantum mechanics
- Geometry
Selected publications
Node-Private Community Detection in Stochastic Block Models
ArXiv.org · 2026-04-10
article · Open access · Senior author
We study community detection in stochastic block models under pure node-level differential privacy, a stringent notion that protects the participation of an individual together with all of their incident edges. This setting is substantially more challenging than edge-private community detection, since modifying a single node can affect linearly many observations. On the algorithmic side, we analyze a node-private estimator based on the exponential mechanism combined with an extension lemma, and show that exact recovery remains achievable. In the standard sparse regime with logarithmic average degree and a fixed number of communities, our results imply that a logarithmic privacy budget suffices to obtain nontrivial recovery guarantees. On the lower bound side, we show that this logarithmic scaling is in fact unavoidable: any pure node-private method must fail to achieve polynomially small exact-recovery error, or polynomially small expected mismatch, unless the privacy budget is at least of this order. Moreover, in the regime of super-logarithmic privacy budgets, our upper and lower bounds yield a matching two-term characterization of the minimax risk, with one term governed by the non-private statistical signal and the other by the privacy budget; these match up to universal constants in the exponents. Taken together, our results identify an inherent logarithmic privacy cost in node-private community detection, absent under edge differential privacy, and provide a precise rate-level characterization of the tradeoff between node privacy and SBM recovery.
Node-Private Community Detection in Stochastic Block Models
HAL (Le Centre pour la Communication Scientifique Directe) · 2026-04-10
preprint · Open access · Senior author
Stable Algorithms Lower Bounds for Estimation
ArXiv.org · 2026-03-23
article · Open access · Senior author
In this work, we show that for all statistical estimation problems, a natural MMSE instability (discontinuity) condition implies the failure of stable algorithms, serving as a version of OGP for estimation tasks. Using this criterion, we establish separations between stable and polynomial-time algorithms for the following MMSE-unstable tasks: (i) Planted Shortest Path, where Dijkstra's algorithm succeeds, (ii) random Parity Codes, where Gaussian elimination succeeds, and (iii) Gaussian Subset Sum, where lattice-based methods succeed. For all three, we further show that all low-degree polynomials are stable, yielding separations against low-degree methods and a new method to bound the low-degree MMSE. In particular, our technique highlights that MMSE instability is a common feature of Shortest Path and the noiseless Parity Codes and Gaussian Subset Sum. Last, we highlight that our work places rigorous algorithmic footing on the long-standing physics belief that first-order phase transitions, which in this setting translate to MMSE instability, impose fundamental limits on classes of efficient algorithms.
The monotonicity of the Franz-Parisi potential is equivalent with Low-degree MMSE lower bounds
arXiv (Cornell University) · 2026-03-20
preprint · Open access · Senior author
Over the last decades, two distinct approaches have been instrumental to our understanding of the computational complexity of statistical estimation. The statistical physics literature predicts algorithmic hardness through local stability and monotonicity properties of the Franz--Parisi (FP) potential \cite{franz1995recipes,franz1997phase}, while the mathematically rigorous literature characterizes hardness via the limitations of restricted algorithmic classes, most notably low-degree polynomial estimators \cite{hopkins2017efficient}. For many inference models, these two perspectives yield strikingly consistent predictions, giving rise to a long-standing open problem of establishing a precise mathematical relationship between them. In this work, we show that for estimation problems the power of low-degree polynomials is equivalent to the monotonicity of the annealed FP potential for a broad family of Gaussian additive models (GAMs) with signal-to-noise ratio $λ$. In particular, subject to a low-degree conjecture for GAMs, our results imply that the polynomial-time limits of these models are directly implied by the monotonicity of the annealed FP potential, in conceptual agreement with predictions from the physics literature dating back to the 1990s.
Almost-Optimal Local-Search Methods for Sparse Tensor PCA
ArXiv.org · 2025-06-11
preprint · Open access · Senior author
Local-search methods are widely employed in statistical applications, yet interestingly, their theoretical foundations remain rather underexplored, compared to other classes of estimators such as low-degree polynomials and spectral methods. Of note, among the few existing results, recent studies have revealed a significant "local-computational" gap in the context of the well-studied sparse tensor principal component analysis (PCA) model, where a broad class of local Markov chain methods exhibits a notable underperformance relative to other polynomial-time algorithms. In this work, we propose a series of local-search methods that provably "close" this gap to the best known polynomial-time procedures in multiple regimes of the model, including and going beyond the previously studied regimes in which the broad family of local Markov chain methods underperforms. Our framework includes: (1) standard greedy and randomized greedy algorithms applied to the (regularized) posterior of the model; and (2) novel random-threshold variants, in which the randomized greedy algorithm accepts a proposed transition if and only if the corresponding change in the Hamiltonian exceeds a random Gaussian threshold, rather than if and only if it is positive, as is customary. The introduction of the random thresholds enables a tight mathematical analysis of the randomized greedy algorithm's trajectory by crucially breaking the dependencies between the iterations, and could be of independent interest to the community.
Counting stars is constant-degree optimal for detecting any planted subgraph
Mathematical Statistics and Learning · 2025-08-14
article · Open access
We study the computational limits of the following general hypothesis testing problem. Let H=H_{n} be an arbitrary undirected graph. We study the detection task between a “null” Erdős–Rényi random graph G(n,p) and a “planted” random graph which is the union of G(n,p) together with a random copy of H=H_{n}. Our notion of planted model is a generalization of a plethora of recently studied models initiated with the study of the planted clique model (Jerrum, 1992), which corresponds to the special case where H is a k-clique and p=1/2. Over the last decade, several papers have studied the power of low-degree polynomials for limited choices of H in the above task. In this work, we adopt a unifying perspective and characterize the power of constant-degree polynomials for the detection task, when H=H_{n} is an arbitrary graph and for any p=\Omega(1). Perhaps surprisingly, we prove that an optimal constant-degree polynomial is always given by simply counting stars in the input random graph. As a direct corollary, we conclude that the class of constant-degree polynomials is only able to “sense” the degree distribution of the planted graph H, and no other graph-theoretic property of it.
Sharp thresholds in inference of planted subgraphs
The Annals of Applied Probability · 2025-02-01
article · Senior author
Almost‐Linear Planted Cliques Elude the Metropolis Process
Random Structures and Algorithms · 2025-02-21
article · Senior author
A seminal work of Jerrum (1992) showed that large cliques elude the Metropolis process. More specifically, Jerrum showed that the Metropolis algorithm cannot find a clique of size for , which is planted in the Erdős‐Rényi , in polynomial‐time. Information theoretically, it is possible to find such planted cliques when . Since the work of Jerrum, the computational problem of finding a planted clique in was studied extensively, and many polynomial‐time algorithms were shown to find the planted clique if it is of size , while no polynomial‐time algorithm is known to work when . The planted clique problem for is now widely considered a foundational problem in the study of computational‐statistical gaps. Notably, the first evidence of the problem's algorithmic hardness is commonly attributed to Jerrum (1992). In this paper, we revisit the original Metropolis algorithm suggested by Jerrum. Interestingly, we find that the Metropolis algorithm actually fails to recover a planted clique of size for any constant , unlike many other efficient algorithms that succeed when . Moreover, like many results in the MCMC literature, the result of Jerrum shows that there exists a starting state for which the Metropolis algorithm fails. For a wide range of temperatures, we show that the algorithm fails when started at the most natural initial state, which is the empty clique. This answers an open problem from Jerrum (1992).
An Optimized Franz-Parisi Criterion and its Equivalence with SQ Lower Bounds
ArXiv.org · 2025-06-06
preprint · Open access
Bandeira et al. (2022) introduced the Franz-Parisi (FP) criterion for characterizing the computationally hard phases in statistical detection problems. The FP criterion, based on an annealed version of the celebrated Franz-Parisi potential from statistical physics, was shown to be equivalent to low-degree polynomial (LDP) lower bounds for Gaussian additive models, thereby connecting two distinct approaches to understanding the computational hardness in statistical inference. In this paper, we propose a refined FP criterion that aims to better capture the geometric "overlap" structure of statistical models. Our main result establishes that this optimized FP criterion is equivalent to Statistical Query (SQ) lower bounds -- another foundational framework in computational complexity of statistical inference. Crucially, this equivalence holds under a mild, verifiable assumption satisfied by a broad class of statistical models, including Gaussian additive models, planted sparse models, as well as non-Gaussian component analysis (NGCA), single-index (SI) models, and convex truncation detection settings. For instance, in the case of convex truncation tasks, the assumption is equivalent to the Gaussian correlation inequality (Royen, 2014) from convex geometry. In addition to the above, our equivalence not only unifies and simplifies the derivation of several known SQ lower bounds -- such as for the NGCA model (Diakonikolas et al., 2017) and the SI model (Damian et al., 2024) -- but also yields new SQ lower bounds of independent interest, including for the computational gaps in mixed sparse linear regression (Arpino et al., 2023) and convex truncation (De et al., 2023).
The Fundamental Limits of Recovering Planted Subgraphs
ArXiv.org · 2025-03-19
preprint · Open access · Senior author
Given an arbitrary subgraph $H=H_n$ and $p=p_n \in (0,1)$, the planted subgraph model is defined as follows. A statistician observes the union of a random copy $H^*$ of $H$, together with random noise in the form of an instance of an Erdős-Rényi graph $G(n,p)$. Their goal is to recover the planted $H^*$ from the observed graph. Our focus in this work is to understand the minimum mean squared error (MMSE) for sufficiently large $n$. A recent paper [MNSSZ23] characterizes the graphs for which the limiting MMSE curve undergoes a sharp phase transition from $0$ to $1$ as $p$ increases, a behavior known as the all-or-nothing phenomenon, up to a mild density assumption on $H$. In this paper, we provide a formula for the limiting MMSE curve for any graph $H=H_n$, up to the same mild density assumption. This curve is expressed in terms of a variational formula over pairs of subgraphs of $H$, and is inspired by the celebrated subgraph expectation thresholds from the probabilistic combinatorics literature [KK07]. Furthermore, we give a polynomial-time description of the optimizers of this variational problem. This allows one to efficiently approximately compute the MMSE curve for any dense graph $H$ when $n$ is large enough. The proof relies on a novel graph decomposition of $H$ as well as a new minimax theorem which may be of independent interest. Our results generalize to the setting of minimax rates of recovering arbitrary monotone boolean properties planted in random noise, where the statistician observes the union of a planted minimal element $A \subseteq [N]$ of a monotone property and a random $Ber(p)^{\otimes N}$ vector. In this setting, we provide a variational formula inspired by the so-called "fractional" expectation threshold [Tal10], again describing the MMSE curve (in this case up to a multiplicative constant) for large enough $n$.
Frequent coauthors
- David Gamarnik · 20 shared
- Juan Pablo Vielma · Google (United States) · 16 shared
- Miles Lubin · Google (United States) · 16 shared
- Eren C. Kızıldağ · Columbia University · 12 shared
- Jonathan Niles‐Weed · 9 shared
- Jiaming Xu · 7 shared
- Alexander S. Wein · University of California, Davis · 7 shared
- Galen Reeves · 7 shared