William Bialek

John Archibald Wheeler/Battelle Professor in Theoretical Physics

Princeton University · Physics

Active 1938–2025

h-index: 80

Citations: 33.3k

Papers: 393 (69 in the last 5 years)

Funding: $26.0M

About

William Bialek is a Professor of Physics at Princeton University and a co-Director of the Center for the Physics of Biological Function (CPBF), an NSF Physics Frontier Center. He is also a faculty member of the Lewis-Sigler Institute at Princeton and a Visiting Professor of Physics at The Graduate Center, CUNY. His research applies concepts from physics to complex biological phenomena, seeking the fundamental principles that underlie biological function.

Research topics

Computer science
Statistical physics
Physics
Artificial intelligence
Mathematics

Selected publications

Foreword
Princeton University Press eBooks · 2025-12-04
book-chapter · 1st author · Corresponding
Neural subspaces, minimax entropy, and mean-field theory for networks of neurons
ArXiv.org · 2025-08-04
preprint · Open access · Senior author
Recent advances in experimental techniques enable the simultaneous recording of activity from thousands of neurons in the brain, presenting both an opportunity and a challenge: to build meaningful, scalable models of large neural populations. Correlations in the brain are typically weak but widespread, suggesting that a mean-field approach might be effective in describing real neural populations, and we explore a hierarchy of maximum entropy models guided by this idea. We begin with models that match only the mean and variance of the total population activity, and extend to models that match the experimentally observed mean and variance of activity along multiple projections of the neural state. Confronted by data from several different brain regions, these models are driven toward a first-order phase transition, characterized by the presence of two nearly degenerate minima in the energy landscape, and this leads to predictions in qualitative disagreement with other features of the data. To resolve this problem we introduce a novel class of models that constrain the full probability distribution of activity along selected projections. We develop the mean-field theory for this class of models and apply it to recordings from 1000+ neurons in the mouse hippocampus. This 'distributional mean-field' model provides an accurate and consistent description of the data, offering a scalable and principled approach to modeling complex neural population dynamics.
Context dependent adaptation in a neural computation
ArXiv.org · 2025-09-01
preprint · Open access
Brains adapt to the statistical structure of their input. In the visual system, local light intensities change rapidly, the variance of the intensity changes more slowly, and the dynamic range of contrast itself changes more slowly still. We use a motion-sensitive neuron in the fly visual system to probe this hierarchy of adaptation phenomena, delivering naturalistic stimuli that have been simplified to have a clear separation of time scales. We show that the neural response to visual motion depends on contrast, and this dependence itself varies with context. Using the spike-triggered average velocity trajectory as a response measure, we find that context dependence is confined to a low-dimensional space, with a single dominant dimension. Across a wide range of conditions this adaptation serves to match the integration time to the mean interval between spikes, reducing redundancy.
Maximum entropy models for patterns of gene expression
Physical review. E · 2025-06-24 · 2 citations
article · Senior author
New experimental methods make it possible to measure the expression levels of many genes, simultaneously, in snapshots from thousands or even millions of individual cells. Current approaches to analyze these experiments involve clustering or low-dimensional projections, and often start with the assumption that distinct cell types exist. Here we use the principle of maximum entropy to obtain a probabilistic description that captures the observed presence or absence of mRNAs from hundreds of genes in cells from the mammalian brain. We construct the Ising model compatible with experimental means and pairwise correlations, and validate it by showing that it gives good predictions for higher-order statistics. We find that the probability distribution of cell states has many local maxima. Grouping cells according to these maxima (or energy minima) gives a classification in good agreement with currently assigned cell types. We show that when assignments disagree our model is dividing cell types into subtypes with clearly distinguishable expression patterns. These results make concrete the intuition that types or classes of cells are emergent behaviors.
Optimization and variability can coexist.
PubMed · 2025-05-29
preprint · Open access
that we should observe widely varying parameters, and we make this precise: the entropy in parameter space can be extensive even if performance on average is very close to optimal. This removes a major objection to optimization as a general principle, and rationalizes the observed variability.
When many noisy genes optimize information flow
ArXiv.org · 2025-12-16
preprint · Open access · Senior author
It often is emphasized that gene expression is noisy. A seemingly contradictory view is that control mechanisms have been optimized to squeeze as much information as possible out of a limited number of molecules. Here we revisit these issues in a simple model where a single transcription factor (TF) controls a large number of target genes. We include only the physically required noise sources: random arrival of TFs at their targets and counting noise in the synthesis and degradation of mRNA. If the cell has a limited total number of mRNA molecules, then the capacity to transmit information about TF concentration is maximized when these resources are distributed across the largest possible number of target genes. To realize this capacity the distribution of TF concentrations must be biased toward smaller values. Thus, in some limits, information transmission is optimized when individual expression levels are noisy. In addition, the dependence of information transmission on the parameters of this multi-gene system has a "sloppy" spectrum, so that optimal performance can co-exist with substantial variability.
Foreword
Princeton University Press eBooks · 2025-10-28
book-chapter · 1st author · Corresponding
Exactly solvable statistical physics models for large neuronal populations
Physical Review Research · 2025-05-19 · 6 citations
preprint · Open access
Maximum-entropy methods provide a principled path connecting measurements of neural activity directly to statistical physics models, and this approach has been successful for populations of $N\sim 100$ neurons. As $N$ increases in new experiments, we enter an undersampled regime where we have to choose which observables should be constrained in the maximum-entropy construction. The best choice is the one that provides the greatest reduction in entropy, defining a “minimax entropy” principle. This principle becomes tractable if we restrict attention to correlations among pairs of neurons that link together into a tree; we can find the best tree efficiently, and the underlying statistical physics models are exactly solved. We use this approach to analyze experiments on $N\sim 1500$ neurons in the mouse hippocampus, and we find that the resulting model captures key features of collective activity in the network.
Deriving a genetic regulatory network from an optimization principle
Proceedings of the National Academy of Sciences · 2025-01-03 · 18 citations
article · Open access · Corresponding
Many biological systems operate near the physical limits to their performance, suggesting that aspects of their behavior and underlying mechanisms could be derived from optimization principles. However, such principles have often been applied only in simplified models. Here, we explore a detailed mechanistic model of the gap gene network in the Drosophila embryo, optimizing its 50+ parameters to maximize the information that gene expression levels provide about nuclear positions. This optimization is conducted under realistic constraints, such as limits on the number of available molecules. Remarkably, the optimal networks we derive closely match the architecture and spatial gene expression profiles observed in the real organism. Our framework quantifies the tradeoffs involved in maximizing functional performance and allows for the exploration of alternative network configurations, addressing the question of which features are necessary and which are contingent. Our results suggest that multiple solutions to the optimization problem might exist across closely related organisms, offering insights into the evolution of gene regulatory networks.
Large language models and the entropy of English
arXiv (Cornell University) · 2025-12-31
preprint · Open access · Senior author
We use large language models (LLMs) to uncover long-ranged structure in English texts from a variety of sources. The conditional entropy or code length in many cases continues to decrease with context length at least to $N\sim 10^4$ characters, implying that there are direct dependencies or interactions across these distances. A corollary is that there are small but significant correlations between characters at these separations, as we show from the data independent of models. The distribution of code lengths reveals an emergent certainty about an increasing fraction of characters at large $N$. Over the course of model training, we observe different dynamics at long and short context lengths, suggesting that long-ranged structure is learned only gradually. Our results constrain efforts to build statistical physics models of LLMs or language itself.
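
The tree-based "minimax entropy" idea in the abstract of "Exactly solvable statistical physics models for large neuronal populations" can be illustrated with a small sketch. For a maximum-entropy model that matches pairwise statistics only along the edges of a tree, the entropy reduction relative to the independent model is the sum of the mutual informations on those edges, so the best tree is a maximum spanning tree over pairwise mutual information (the classic Chow-Liu construction). The Python below is an illustrative sketch under that assumption, not the authors' code; the function names `mutual_information` and `best_tree` and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written in terms of raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def best_tree(samples):
    """Return the edges of a maximum spanning tree under mutual-information weights.

    samples: list of state vectors, one per time bin (hypothetical layout).
    Uses a brute-force version of Prim's algorithm, fine for small toy inputs.
    """
    n_units = len(samples[0])
    columns = [[s[i] for s in samples] for i in range(n_units)]
    mi = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(n_units), 2)
    }
    in_tree = {0}
    edges = []
    while len(in_tree) < n_units:
        # Pick the highest-MI edge that connects the tree to a new unit.
        crossing = ((i, j) for (i, j) in mi if (i in in_tree) != (j in in_tree))
        best = max(crossing, key=mi.get)
        edges.append(best)
        in_tree.update(best)
    return edges


# Toy data: units 0 and 1 are identical, units 2 and 3 are identical,
# and the two pairs are statistically independent of each other.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
samples = [(a[t], a[t], b[t], b[t]) for t in range(8)]
edges = best_tree(samples)
print(edges)
```

In this toy example the strongly dependent pairs (0, 1) and (2, 3) are selected, and one zero-information edge is added only to connect the tree. For populations of the size discussed in the abstract, a more efficient spanning-tree routine would be needed than this brute-force loop.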

Recent grants

Mechanisms of neural circuit dynamics in working memory
NIH · $3.1M · 2014–2018
NIH Grant R01GM077599
NIH · $1.2M · 2012
A new paradigm for quantifying animal behavior in a model genetic system
NIH · $1.6M · 2011–2016
Coarse-graining approaches to networks, learning, and behavior
NIH · $707k · 2018–2022
Coarse-graining approaches to networks, learning, and behavior
NIH · $387k · 2018–2021

Frequent coauthors

Thomas Gregor
Centre National de la Recherche Scientifique
72 shared
Eric Wieschaus
Princeton University
43 shared
Gašper Tkačik
Institute of Science and Technology Austria
38 shared
Christopher W. Lynn
The Graduate Center, CUNY
30 shared
Rob R. de Ruyter van Steveninck
Indiana University Bloomington
28 shared
David W. Tank
Princeton University
26 shared
Mariela D. Petkova
Harvard University
23 shared
Olivier Marre
Sorbonne Université
23 shared

Labs

Center for the Physics of Biological Function · PI

Education

Postdoctoral, Theoretical Physics
University of California, Santa Barbara
1986
Postdoctoral, Physics
Rijksuniversiteit Groningen
1984
PhD, Biophysics
University of California, Berkeley
1983
AB, Biophysics
University of California, Berkeley
1979
