Quanquan Gu

Professor · Verified

University of California, Los Angeles · Computer Science

Active 1991–2026

h-index: 53
Citations: 10.7k
Papers: 414 (228 in last 5 years)
Funding: $3.9M
About

Quanquan Gu is an Associate Professor in the Department of Computer Science at UCLA Samueli School of Engineering. His research interests include machine learning, data mining, optimization algorithms, and computational genomics. He has contributed to the development of theoretical and practical methods for high-dimensional nonparametric graphical models, distributed estimation, deep neural network training, and stochastic optimization techniques. Gu's work often focuses on understanding the convergence, robustness, and generalization properties of algorithms used in nonconvex and over-parameterized settings, with applications ranging from neural networks to genomic data analysis.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Medicine
  • Business
  • Political Science
  • Engineering
  • Geography
  • Econometrics
  • Mathematics
  • Economics
  • Sociology
  • Computer Security
  • Operations research
  • Mathematical optimization
  • Actuarial science
  • Data Mining
  • World Wide Web
  • Virology
  • Meteorology
  • Psychology
  • Environmental health
  • Statistics
  • Data science

Selected publications

  • EIS-GPT: Transformer-based generation of equivalent circuit for electrochemical impedance spectroscopy

    Zenodo (CERN European Organization for Nuclear Research) · 2026-05-03

    article · Open access

    Electrochemical impedance spectroscopy (EIS) is widely used to probe electrochemical systems, with impedance spectra commonly interpreted through regression to equivalent circuit models (ECMs). However, ECM identification remains challenging due to structural non-uniqueness and the reliance on expert-driven model selection. Here, we present EIS-GPT, a transformer-based generative framework for ECM prediction that represents circuits as directed acyclic graphs and constructs candidate topologies autoregressively from impedance data. Reinforcement-learning-based fine-tuning is adapted to guide topology generation through reward functions that favor accurate spectral description while accounting for physical plausibility and model complexity. The framework produces candidate circuits that provide competitive spectral explanations while avoiding overfitting. By formulating ECM discovery as structure-aware generative optimization rather than categorical selection, EIS-GPT provides a scalable and transparent strategy for automated impedance model discovery across diverse electrochemical systems.

  • EIS-GPT: Transformer-based generation of equivalent circuit for electrochemical impedance spectroscopy

    ChemRxiv · 2026-05-10

    article · Open access

    Electrochemical impedance spectroscopy (EIS) is widely used to probe electrochemical systems, with impedance spectra commonly interpreted through regression to equivalent circuit models (ECMs). However, ECM identification remains challenging due to structural non-uniqueness and the reliance on expert-driven model selection. Here, we present EIS-GPT, a transformer-based generative framework for ECM prediction that represents circuits as directed acyclic graphs and constructs candidate topologies autoregressively from impedance data. Reinforcement-learning-based fine-tuning is adapted to guide topology generation through reward functions that favor accurate spectral description while accounting for physical plausibility and model complexity. The framework produces candidate circuits that provide competitive spectral explanations while avoiding overfitting. By formulating ECM discovery as structure-aware generative optimization rather than categorical selection, EIS-GPT provides a scalable and transparent strategy for automated impedance model discovery across diverse electrochemical systems.

  • Intelligent design and application of molecular recognition hydrogels in tissue engineering

    Materials Today Bio · 2026-03-19

    article · Open access

    Molecular recognition hydrogels are functional materials that form three-dimensional network structures through molecular-specific binding, such as antigen-antibody and ligand-receptor interactions. They exhibit excellent biocompatibility, tunable mechanical properties, and dynamic responsiveness. Furthermore, these hydrogels can simulate the microenvironment of the natural extracellular matrix and precisely regulate cell behavior, thereby broadening their application in tissue engineering. Herein, we review the definition of molecular recognition hydrogels; available crosslinking strategies, including physical, chemical, and biological crosslinking; and examples of representative systems, such as DNA hydrogels and heparin-binding protein hydrogels. We also summarize the latest research on molecular recognition hydrogels for drug delivery, biosensing, and tissue repair and regeneration (including bone, cartilage, skin, heart, and neural tissues). In addition, we discuss the challenges (such as stability and targeting) and future development directions in this field, providing new ideas for precision and personalized treatments in tissue engineering.

  • Biodegradable natural hydrogels: Design, crosslinking, and medical applications

    Materials Today Communications · 2025-05-26 · 5 citations

    article

  • On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

    arXiv (Cornell University) · 2025-05-23

    preprint · Open access

    Policy gradient algorithms have been successfully applied to enhance the reasoning capabilities of large language models (LLMs). KL regularization is ubiquitous, yet the design surface, choice of KL direction (forward vs. reverse), normalization (normalized vs. unnormalized), and estimator ($k_1/k_2/k_3$), is scattered across the literature and often intertwined with off-policy estimation. We ask a focused question: under the off-policy setting, what weighting is required for each KL variant so that the surrogate we optimize yields the exact gradient of the intended KL-regularized objective? We answer this with a compact, unified derivation we call the Regularized Policy Gradient (RPG) view. RPG (i) unifies normalized and unnormalized KL variants and shows that the widely-used $k_3$ penalty is exactly the unnormalized KL; (ii) specifies conditions under which REINFORCE-style losses with stop-gradient are gradient-equivalent to fully differentiable surrogates; (iii) identifies and corrects an off-policy importance-weighting mismatch in GRPO's KL term; and (iv) introduces RPG-Style Clip, a clipped-importance-sampling step within RPG-REINFORCE that enables stable, off-policy policy-gradient training at scale. On mathematical reasoning benchmarks (AIME24, AIME25), RPG-REINFORCE with RPG-Style Clip improves accuracy by up to $+6$ absolute percentage points over DAPO. We extend our experiments to 8K context length, and RPG-REINFORCE with RPG-Style Clip achieves 52% accuracy on AIME25, surpassing the official Qwen3-4B-Instruct model (47%). Notably, RPG is a stable and scalable RL algorithm for LLM reasoning, realized via (a) a KL-correct objective, (b) clipped importance sampling, and (c) an iterative reference-policy update scheme. Project Page: https://github.com/complex-reasoning/RPG.

  • Robust Layerwise Scaling Rules by Proper Weight Decay Tuning

    ArXiv.org · 2025-10-17

    preprint · Open access · Senior author

    Empirical scaling laws prescribe how to allocate parameters, data, and compute, while maximal-update parameterization ($μ$P) enables learning-rate transfer across widths by equalizing early-time update magnitudes. However, in modern scale-invariant architectures, training quickly enters an optimizer-governed steady state where normalization layers create backward scale sensitivity and the effective learning rate becomes width dependent, degrading $μ$P transfer. We address this by introducing a weight-decay scaling rule for AdamW that preserves sublayer gain across widths. Empirically, the singular-value spectrum of each matrix parameter scales in norm as $\sqrt{η/λ}$ with an approximately invariant shape; under width scaling $d$, we observe that the top singular value scales approximately as $\sqrt{η/λ}\cdot d^{0.75}$. Combining this observation with the $μ$P learning-rate rule $η_2\propto d^{-1}$ for matrix-like parameters implies an empirical weight-decay scaling rule $λ_2\propto \sqrt{d}$ that approximately keeps sublayer gains width invariant. Together with vector-like parameters trained at $η_1=Θ_d(1)$ and $λ_1=0$, this yields zero-shot transfer of both learning rate and weight decay from proxy to target widths, removing per-width sweeps. We validate the rule on LLaMA-style Transformers and in a minimal synthetic setting, and we provide a simple diagnostic, matching top singular values, to check sublayer-gain invariance. Our results extend $μ$P beyond the near-init regime by explicitly controlling steady-state scales set by the optimizer, offering a practical recipe for width-robust hyperparameter transfer under AdamW.

  • Impairments in Functional Connectivity and Glymphatic System in Breast Cancer Patients Undergoing Treatment

    Proceedings of the International Society for Magnetic Resonance in Medicine, Scientific Meeting and Exhibition · 2025-09-16

    article · 1st author · Corresponding

    Motivation: Cancer patients may have declined glymphatic function and functional connectivity due to treatment effects. Goal(s): To explore the association between glymphatic dysfunction and DMN-based FC pattern in treated breast cancer (BC) patients. Approach: We combined seed-based analysis of DMN and DTI data-derived index of diffusivity along the perivascular space (ALPS-index) to explore alterations in FC and glymphatic function. Pearson's correlation was performed to examine the association between altered DMN-based FC and ALPS-index in the BC group. Results: Our study demonstrated altered DMN-based FC and declined ALPS-index in BC patients. Lower right-hemisphere ALPS-index in BC patients was linked to higher posterior cingulate cortex-precuneus FC. Impact: Altered DMN-based functional connectivity and bilateral ALPS-index, as well as the negative correlation between elevated posterior cingulate cortex-precuneus connectivity and ALPS-index in BC, indicate a potential mechanism overcoming glymphatic dysfunction by enhancing functional interactions.

  • Gut microbiome predicts personalized responses to dietary fiber in prediabetes: a randomized, open-label trial

    Nature Communications · 2025-12-13 · 4 citations

    article · Open access

    Gut microbiota contributes to prediabetes progression; however, whether microbiota features can guide targeted prevention and treatment for diabetes requires validation through large-scale clinical trials. Here, in a randomized, open-label trial, we randomly assigned 802 prediabetic subjects to a usual care control group (patient education and dietary recommendations, n = 393) or a dietary fiber intervention group (n = 409) for 6 months. The primary outcome was the percentage change in whole-blood HbA1c, and secondary outcomes were the changes in other glucose, insulin, lipid, liver and kidney function, and anthropometric parameters. There were no statistically significant differences in the primary and secondary outcomes between groups. In post-hoc analysis, we reclassified subjects into four clusters using a multivariate clustering model based on age, BMI, HbA1c, HOMA2-IR and HOMA2-B. These clusters differed in metabolic status, risks of diabetes and its complications, gut microbiome and serum metabolites. Notably, dietary fiber improved glycemic control in Clusters 3 and 4, but not in Clusters 1 and 2, consistent with observed gut microbiota alleviations. By using a LightGBM machine learning model, we calculated a microbiome-based clinical decision score to predict personalized fiber intervention responses and identified individuals who can get glycemic benefits. In conclusion, our study suggests that the gut microbiota response influences the effectiveness of dietary fiber intervention and provides a clinically applicable model to guide microbiome-targeted personalized medicine for prediabetes. Clinical Trial Registry: ChiCTR1900027663.

  • Higher-order Linear Attention

    ArXiv.org · 2025-10-31

    preprint · Open access

    The quadratic cost of scaled dot-product attention is a central obstacle to scaling autoregressive language models to long contexts. Linear-time attention and State Space Models (SSMs) provide scalable alternatives but are typically restricted to first-order or kernel-based approximations, which can limit expressivity. We introduce Higher-order Linear Attention (HLA), a causal, streaming mechanism that realizes higher interactions via compact prefix sufficient statistics. In the second-order case, HLA maintains a constant-size state and computes per-token outputs in linear time without materializing any $n \times n$ matrices. We give closed-form streaming identities, a strictly causal masked variant using two additional summaries, and a chunk-parallel training scheme based on associative scans that reproduces the activations of a serial recurrence exactly. We further outline extensions to third and higher orders. Collectively, these results position HLA as a principled, scalable building block that combines attention-like, data-dependent mixing with the efficiency of modern recurrent architectures.

Frequent coauthors

  • Dongruo Zhou

    90 shared
  • Pan Xu

    61 shared
  • Difan Zou

    59 shared
  • Jinghui Chen

    55 shared
  • Lingxiao Wang

    34 shared
  • Weitong Zhang

    University of Colorado System

    34 shared
  • Ziyan Yang

    Northeastern University

    28 shared
  • Yuan Cao

    Zhejiang University

    27 shared