Lingzhou Xue

Professor · Verified

Pennsylvania State University · Statistics

Active 2007–2026

h-index: 23
Citations: 2.7k
Papers: 128 (64 in the last 5 years)
Funding: $1.7M (1 active)

About

Lingzhou Xue is a Professor of Statistics at Penn State. He received his B.Sc. in Statistics from Peking University in 2008 and his Ph.D. in Statistics from the University of Minnesota in 2012. He was a postdoctoral research associate at Princeton University from 2012 to 2013. His research interests include high-dimensional statistics, nonparametric statistics, statistical and machine learning, large-scale optimization, and statistical modeling in biomedical, environmental, and social sciences. His recent research focuses on causal inference, federated learning, graphical models, high-dimensional inference, optimal transport, random objects, and reinforcement learning. He is a dedicated mentor to Ph.D. students and postdoctoral researchers, with five of his former advisees becoming tenure-track faculty members in statistics.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Mathematics
  • Statistics
  • Biology
  • Algorithm
  • Geography
  • Waste management
  • Cartography
  • Computational biology
  • Engineering
  • Environmental science
  • Mathematical optimization
  • Bioinformatics
  • Applied mathematics
  • Petroleum engineering

Selected publications

  • Gap-Dependent Bounds for Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation

    Open MIND; arXiv (Cornell University) · 2026-02-23

    preprint · Open access · Senior author

    We study gap-dependent performance guarantees for nearly minimax-optimal algorithms in reinforcement learning with linear function approximation. While prior works have established gap-dependent regret bounds in this setting, existing analyses do not apply to algorithms that achieve the nearly minimax-optimal worst-case regret bound $\tilde{O}(d\sqrt{H^3K})$, where $d$ is the feature dimension, $H$ is the horizon length, and $K$ is the number of episodes. We bridge this gap by providing the first gap-dependent regret bound for the nearly minimax-optimal algorithm LSVI-UCB++ (He et al., 2023). Our analysis yields improved dependencies on both $d$ and $H$ compared to previous gap-dependent results. Moreover, leveraging the low policy-switching property of LSVI-UCB++, we introduce a concurrent variant that enables efficient parallel exploration across multiple agents and establish the first gap-dependent sample complexity upper bound for online multi-agent RL with linear function approximation, achieving linear speedup with respect to the number of agents.

  • Preference-Based Self-Distillation: Beyond KL Matching via Reward Regularization

    arXiv (Cornell University) · 2026-05-06

    preprint · Open access

    On-policy distillation is an efficient alternative to reinforcement learning, offering dense token-level training signals. However, its reliance on a stronger external teacher has driven recent work on on-policy self-distillation, where the same model serves as both teacher and student under different prompt contexts. Yet, existing self-distillation methods largely reduce learning to KL matching toward the context-augmented teacher model. This approach often suffers from training instability and can degrade reasoning performance over time. Moreover, self-distillation from the same model with prompt augmentation lacks the exploratory diversity provided by a genuine external teacher. To address these limitations, we move beyond fixed-teacher KL matching and propose Preference-Based Self-Distillation (PBSD), which revisits on-policy self-distillation through a reward-regularized perspective. Instead of directly matching the teacher distribution, we derive a reward-regularized objective whose analytic optimum is a reward-reweighted teacher distribution, yielding a target policy provably superior to the original teacher under this objective. Practically, PBSD optimizes preference gaps between teacher and student samples while maintaining on-policy student sampling. We support this framework with a statistical analysis of the induced preference-learning problem, formally establishing when on-policy self-distillation is preferable to learning from an external teacher in our setting. Experiments on mathematical reasoning and tool-use benchmarks across multiple model scales demonstrate that PBSD consistently achieves the strongest average performance among comparable baselines, showing improved training stability over prior self-distillation baselines while preserving token efficiency.

  • A Unified Framework for Nonlinear Mediation Analysis of Random Objects

    arXiv (Cornell University) · 2026-03-30

    article · Open access · Senior author

    Mediation analysis for complex, non-Euclidean data, such as probability distributions, compositions, images, and networks, presents significant methodological challenges due to the inherent nonlinearity and geometric constraints of such spaces. Existing approaches are often restricted to Euclidean settings or specific data types. We propose Random Object Mediation Analysis (ROMA), a unified framework that simultaneously accommodates object-valued exposures, mediators, and outcomes, enabling the analysis of nonlinear causal pathways in general metric spaces. ROMA leverages an additive Reproducing Kernel Hilbert Space (RKHS) operator model to rigorously disentangle direct and indirect causal pathways, which is a significant advancement over existing single-predictor or purely predictive additive frameworks. Theoretically, we establish the nonparametric identification of causal effects and derive global asymptotic normality for our estimators. Crucially, this theoretical foundation enables the construction of simultaneous confidence bands and global test statistics without the need for computationally intensive resampling. We demonstrate the practical utility of ROMA through simulations and real-world applications involving compositional mediators and distributional outcomes, extending the scope of mediation analysis.

  • EXACT: Explicit Attribute-Guided Decoding-Time Personalization

    Open MIND; arXiv (Cornell University) · 2026-02-06

    preprint · Open access · Senior author

    Achieving personalized alignment requires adapting large language models to each user's evolving context. While decoding-time personalization offers a scalable alternative to training-time methods, existing methods largely rely on implicit, less interpretable preference representations and impose a rigid, context-agnostic user representation, failing to account for how preferences shift across prompts. We introduce EXACT, a new decoding-time personalization that aligns generation with limited pairwise preference feedback using a predefined set of interpretable attributes. EXACT first identifies user-specific attribute subsets by maximizing the likelihood of preferred responses in the offline stage. Then, for online inference, EXACT retrieves the most semantically relevant attributes for an incoming prompt and injects them into the context to steer generation. We establish theoretical approximation guarantees for the proposed algorithm under mild assumptions, and provably show that our similarity-based retrieval mechanism effectively mitigates contextual preference shifts, adapting to disparate tasks without pooling conflicting preferences. Extensive experiments on human-annotated preference datasets demonstrate that EXACT consistently outperforms strong baselines, including preference modeling accuracy and personalized generation quality.

  • Statistical inference for high-dimensional robust linear regression models via recursive online-score estimation

    Science China Mathematics · 2026-02-10

    article · Senior author · Corresponding
  • OBE-Oriented Teaching Reform and Practice of the “Industry-Academia-Research-Competition-Innovation” Integrated Model in Signals and Systems Course—A Case Study of a “Brain-Computer Interface Project-Based Course”

    Advances in Education · 2026-01-01

    article · 1st author · Corresponding

Labs

  • Department of Statistics · PI

Education

  • Ph.D., Statistics

    University of Minnesota

    2012

Awards & honors

  • Penn State Schreyer Honors College (SHC) Excellence in Advis…
  • Institute of Mathematical Statistics (IMS) Fellow, 2024
  • Penn State Huck Institutes Leadership Fellow, 2024
  • American Statistical Association (ASA) Fellow, 2023
  • National Institute of Statistical Sciences (NISS) Distinguis…