Carey Priebe · Professor

Johns Hopkins University · Applied Mathematics and Statistics

Active 1988–2026

h-index: 45

Citations: 9.4k

Papers: 597 (179 in the last 5 years)

Funding—

About

Carey Priebe is a professor in the Department of Applied Mathematics and Statistics at Johns Hopkins University, where he also serves as the Director of the Mathematical Institute for Data Science. His research focuses primarily on computational statistics, kernel and mixture estimates, statistical pattern recognition, statistical image analysis, dimensionality reduction, model selection, and statistical inference for high-dimensional and graph data. He is a member of the Data Science and AI Institute. Priebe received a BS degree in mathematics from Purdue University in 1984, an MS degree in computer science from San Diego State University in 1988, and a Ph.D. in information technology (computational statistics) from George Mason University in 1993. He has been a professor at the Whiting School of Engineering since 1994. He is a senior member of IEEE, a lifetime member of the International Statistical Institute, and a Fellow of both the American Statistical Association and the Institute of Mathematical Statistics. His awards and honors include the 2010 American Statistical Association Distinguished Achievement Award and the 2011 McDonald Award for Excellence in Mentoring and Advising. In 2008 he was named one of six inaugural Vannevar Bush National Security Science and Engineering Faculty Fellows.

Research topics

Computer Science
Biology
Statistics
Sociology
Artificial Intelligence
Mathematics
Political Science
Combinatorics
Neuroscience
Evolutionary biology
Geography
Physical geography
Medicine
Economic geography
Computational biology
Remote sensing
Applied mathematics
Ecology
Pure mathematics
Demography
Archaeology
Discrete mathematics

Selected publications

Data Kernel Perspective Space Performance Guarantees for Synthetic Data from Transformer Models
Open MIND · 2026-02-04
preprint
Scarcity of labeled training data remains the long pole in the tent for building performant language technology and generative AI models. Transformer models -- particularly LLMs -- are increasingly being used to mitigate the data scarcity problem via synthetic data generation. However, because the models are black boxes, the properties of the synthetic data are difficult to predict. In practice it is common for language technology engineers to 'fiddle' with the LLM temperature setting and hope that what comes out the other end improves the downstream model. Faced with this uncertainty, here we propose Data Kernel Perspective Space (DKPS) to provide the foundation for mathematical analysis yielding concrete statistical guarantees for the quality of the outputs of transformer models. We first show the mathematical derivation of DKPS and how it provides performance guarantees. Next we show how DKPS performance guarantees can elucidate performance of a downstream task, such as neural machine translation models or LLMs trained using Contrastive Preference Optimization (CPO). Limitations of the current work and future research are also discussed.
Data Kernel Perspective Space Performance Guarantees for Synthetic Data from Transformer Models
arXiv (Cornell University) · 2026-02-04
article · Open access
Optimizing the Induced Correlation in Omnibus Joint Graph Embeddings
Figshare · 2026-04-02
article · Open access
Theoretical and empirical evidence suggests that joint graph embedding algorithms induce correlation across networks in the embedding space. In the Omnibus joint graph embedding framework, previous results delineated the dual effects of algorithm-induced and model-inherent correlations on the total correlation across embedded networks. Accounting for the algorithm-induced correlation is practically important, as suboptimal Omnibus constructions can lead to inferential losses. This work presents the first efforts to automate the Omnibus construction in order to address two key questions: the correlation–to–Omni problem and the flat correlation problem. In the flat correlation problem, we seek the minimum algorithm-induced flat correlation (i.e., the same across all graph pairs) produced via an Omnibus embedding, as minimal flat correlation best preserves individual graph structure in the embedding space. Working in a subspace of the fully general Omnibus matrices, we prove both a lower bound for this flat correlation and that the classical Omnibus construction induces maximal flat correlation. In the correlation–to–Omni problem, we present the corr2Omni algorithm to construct Omnibus embeddings that best preserve a given matrix of estimated pairwise graph correlations in the embedding space. In simulated and real data settings, we demonstrate the increased effectiveness of corr2Omni versus the classical Omnibus construction.
Optimizing the Induced Correlation in Omnibus Joint Graph Embeddings
Journal of Computational and Graphical Statistics · 2026-04-02
article
A mathematical framework for parameter recovery in large language models via a joint Euclidean mirror
arXiv (Cornell University) · 2026-04-08
article · Open access
Understanding the behavior of black-box large language models and determining effective means of comparing their performance is a key task in modern machine learning. We consider how large language models respond to a specific query by analyzing how the distributions of responses vary over different values of tuning parameters. We frame this problem in a general mathematical setting, treating the mapping from model parameters to response distributions as a structured family of probability measures, endowed with a geometry via a dissimilarity measure. We show how dissimilarities between response distributions can be represented in low-dimensional Euclidean space through a joint Euclidean mirror surface encoding the underlying geometry, which permits both qualitative and quantitative analysis of large language models and provides insight into predicting response distributions for different values of tuning parameters. We propose an estimation procedure for the underlying joint Euclidean mirror based on observed samples from the response distributions, and we prove its asymptotic properties. Additionally, we propose a statistically consistent procedure to infer the value of an unknown model parameter based on samples from the corresponding response distribution and the estimated joint Euclidean mirror. In an experimental setting with large language models, we find that changes in different tuning parameter values correspond to distinct directions in the embedding space, making it possible to estimate the tuning parameters that were used to generate a given response.
Vertex misalignment and changepoint localization in network time series
arXiv (Cornell University) · 2026-04-22
preprint · Open access
Inference for time series of networks often relies on accurate vertex correspondence between network realizations at different times. In practice, however, such vertex alignments can be misspecified or unknown. We study the impact of vertex alignment on changepoint localization for dynamic networks through two illustrative models, each with a similar changepoint, with the key distinction being whether changepoint information is contained in marginal or joint distributions of the time-varying latent positions. We compare localization techniques ranging from the simple network statistic of average degree to the modern procedure of Euclidean mirrors. In one model, vertex misalignment causes little error, and in the other, it impairs localization in ways that cannot be corrected through graph matching or optimal transport, which we show are closely related in this setting. Our results demonstrate that robust network inference necessitates reckoning with the subtle interplay of marginal and joint information in the observed network time series.
SIGMA: Scalable Spectral Insights for LLM Model Collapse
arXiv (Cornell University) · 2026-01-06
article · Open access
The rapid adoption of synthetic data for training Large Language Models (LLMs) has introduced the technical challenge of "model collapse": a degenerative process where recursive training on model-generated content leads to a contraction of distributional variance and representational quality. While the phenomenology of collapse is increasingly evident, rigorous methods to quantify and predict its onset in high-dimensional spaces remain elusive. In this paper, we introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework that benchmarks model collapse through the spectral lens of the embedding Gram matrix. By deriving and utilizing deterministic and stochastic bounds on the matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. Crucially, our stochastic formulation enables scalable estimation of these bounds, making the framework applicable to large-scale foundation models where full eigendecomposition is intractable. We demonstrate that SIGMA effectively captures the transition towards degenerate states, offering both theoretical insights into the mechanics of collapse and a practical, scalable tool for monitoring the health of recursive training pipelines.
Graph Neural Networks Powered by Encoder Embedding for Improved Node Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence · 2026-01-01
article · Senior author
Graph neural networks (GNNs) have emerged as a powerful framework for a wide range of node-level graph learning tasks. However, their performance typically depends on random or minimally informed initial feature representations, where poor initialization can lead to slower convergence and increased training instability. In this paper, we address this limitation by leveraging a statistically grounded one-hot graph encoder embedding (GEE) as a high-quality, structure-aware initialization for node features. Integrating GEE into standard GNNs yields the GEE-powered GNN (GG) framework. Across extensive simulations and real-world benchmarks, GG provides consistent and substantial performance gains in both unsupervised and supervised settings. For node classification, we further introduce GG-C, which concatenates the outputs of GG and GEE and outperforms competing methods, achieving roughly 10-50% accuracy improvements across most datasets. These results demonstrate the importance of principled, structure-aware initialization for improving the efficiency, stability, and overall performance of graph neural network architecture, enabling models to better exploit graph topology from the outset.

Frequent coauthors

Joshua T Vogelstein
Johns Hopkins University
165 shared
Youngser Park
141 shared
Minh Tang
108 shared
Vince Lyzinski
106 shared
Cencheng Shen
81 shared
Jonathan Larson
68 shared
Daniel L. Sussman
Boston University
59 shared
Avanti Athreya
54 shared

Awards & honors

2010 American Statistical Association Distinguished Achievem…
2011 McDonald Award for Excellence in Mentoring and Advising
Vannevar Bush National Security Science and Engineering Facu…
