Guanhua Chen

Associate Professor · Verified

University of Wisconsin-Madison · Biostatistics and Medical Informatics

Active 1991–2025

h-index: 40

Citations: 5.8k

Papers: 269 (166 in the last 5 years)

Funding: $600k


About

Guanhua Chen is an Associate Professor at the University of Wisconsin–Madison, affiliated with the Department of Biostatistics and Medical Informatics. His expertise lies in the development of statistical and machine learning methods for clinical and biomedical research. He focuses on uncovering intricate patterns in high-dimensional data from various sources, including genomics and electronic health records, to promote precision medicine. His technical interests include causal inference, reinforcement learning, and empirical process theory. Chen is also an Affiliate Faculty member in the Department of Statistics, the College of Letters and Science, and the School of Computer, Data & Information Sciences at UW–Madison.

Research topics

Artificial Intelligence
Computer Science
Medicine
Econometrics
Virology
Telecommunications
Mathematics
Internal medicine
Genetics
Environmental health
Bioinformatics
Physics
Statistics
Biology
Cardiology
Geography

Selected publications

Multi-agent Cooperative Encirclement Method Based on Reinforcement Learning
2025-08-22
article · 1st author · Corresponding
This paper introduces a reinforcement learning-based multi-agent cooperative encirclement method for target tracking and encirclement in dynamic stochastic environments. The Soft Actor-Critic (SAC) algorithm serves as the backbone for multi-agent policy learning. Spatial consistency, task completion, and cooperative constraints are considered jointly with rewards for obstacle avoidance, formation angles, pursuit, and stability, which not only alleviates the sparse reward problem but also strengthens robustness against perception uncertainty. Simulation results show that agents can form a plausible encircling trajectory in 10 seconds (±0.1 m distance fluctuation, <10° angle deviation) and recover stability in 12 seconds. A semi-physical test validates the robustness of the method, demonstrating its effectiveness in terms of environmental adaptability and real-time operation.
An Information-Theoretic Perspective on Multi-LLM Uncertainty Estimation
medRxiv · 2025-07-10
preprint · Open access
Large language models (LLMs) often behave inconsistently across inputs, indicating uncertainty and motivating the need for its quantification in high-stakes settings. Prior work on calibration and uncertainty quantification often focuses on individual models, overlooking the potential of model diversity. We hypothesize that LLMs make complementary predictions due to differences in training and the Zipfian nature of language, and that aggregating their outputs leads to more reliable uncertainty estimates. To leverage this, we propose MUSE (Multi-LLM Uncertainty via Subset Ensembles), a simple information-theoretic method that uses Jensen-Shannon Divergence to identify and aggregate well-calibrated subsets of LLMs. Experiments on binary prediction tasks demonstrate improved calibration and predictive performance compared to single-model and naïve ensemble baselines.
The neuronal and synaptic representations of spatial release from masking in the rat auditory cortex
Frontiers in Neuroscience · 2025-05-14
article · Open access · 1st author
In complex acoustic environments, both humans and animals are frequently exposed to sounds from multiple sources. The detection threshold for a target sound (or probe) can be elevated by interference sounds (masker) originating from various locations. This masking effect is reduced when the probe and masker are spatially separated compared to when they are colocalized, thereby improving the perception of the probe. This phenomenon is known as spatial release from masking. Currently, the neuronal and synaptic mechanisms underlying spatial release from masking in the auditory cortex are not fully understood. Here we employed single-unit recording and in vivo whole-cell patch-clamp recording techniques to examine how maskers from different spatial locations influence the detection thresholds of rat primary auditory cortex (A1) neurons in response to probe stimuli. At the cortical neuronal level, the masked detection thresholds of most A1 neurons in response to probes were significantly decreased when maskers were displaced from azimuths colocalized with the probe to other separated azimuths ipsilateral to the recording site. Similarly, at the cortical synaptic level, the masked detection thresholds of A1 neurons, as determined from the amplitude of evoked excitatory postsynaptic currents in response to probes presented at azimuth locations within the contralateral hemifield, were also decreased when maskers were shifted from azimuth locations in the contralateral hemifield to those in the ipsilateral hemifield. This study provides neuronal and synaptic evidence for spatial release from masking in the auditory cortex, advancing our understanding of the mechanisms involved in auditory signal processing in noisy environments.
Not All LoRA Parameters Are Essential: Insights on Inference Necessity
ArXiv.org · 2025-03-30
preprint · Open access · 1st author · Corresponding
Current research on LoRA primarily focuses on minimizing the number of fine-tuned parameters or optimizing its architecture. However, the necessity of all fine-tuned LoRA layers during inference remains underexplored. In this paper, we investigate the contribution of each LoRA layer to the model's ability to predict the ground truth and hypothesize that lower-layer LoRA modules play a more critical role in model reasoning and understanding. To address this, we propose a simple yet effective method to enhance the performance of large language models (LLMs) fine-tuned with LoRA. Specifically, we identify a "boundary layer" that distinguishes essential LoRA layers by analyzing a small set of validation samples. During inference, we drop all LoRA layers beyond this boundary. We evaluate our approach on three strong baselines across four widely-used text generation datasets. Our results demonstrate consistent and significant improvements, underscoring the effectiveness of selectively retaining critical LoRA layers during inference.
Standardization and interpretable analysis of geological database using retrieval-augmented large language model
Geodata and AI · 2025-11-30 · 1 citation
article · Open access
• The first expert-validated geological database in Macao is established.
• A domain-adapted RAG-based LLM framework is developed for database standardization.
• Interpretability analysis connects retrievals to model decisions.
• Interpretability guides knowledge inputs for human-AI interaction.
Urban geological characterization requires standardizing heterogeneous borehole data subject to interpretive variability from engineering practices. Current language models struggle with dynamic yet limited geological datasets and lack transparent interpretation. This study develops a novel framework that incorporates large language models (LLMs) for intelligent formation-type standardization for urban geological databases. The Macao geological database, comprising 100 boreholes from 21 construction projects, has been established by engineering geologists. Input strategies, model uncertainty, and prediction states are analyzed to reveal performance-semantic relationships, optimizing expert-computer interaction for enhanced performance. Overall, the key contributions are as follows: (1) An expert-validated geological database across Macao is developed for the first time for benchmarking; (2) A domain-adapted retrieval-augmented LLM framework is proposed for geological standardization; (3) Interpretability analysis is performed to link retrievals to model behaviors; (4) Model interpretability guides further knowledge inputs to enhance performance. The study addresses critical gaps in enhancing transparent LLMs, supporting reliable human-AI collaboration for advanced geotechnical applications.
Growth performance, bone mineralization, and mineral transporter gene expression in broiler chickens fed varying non-phytate phosphorus levels at a fixed calcium to non-phytate phosphorus ratio
Poultry Science · 2025-10-25 · 1 citation
article · Open access
This study investigated the effects of dietary non-phytate phosphorus (NPP) levels on growth performance, bone mineralization, and mineral transporting gene expression in broilers (1-21 d), under a 2:1 calcium (Ca)-to-NPP ratio. The six Ca:NPP levels were 0.90 %:0.45 %, 0.80 %:0.40 %, 0.70 %:0.35 %, 0.60 %:0.30 %, 0.50 %:0.25 %, and 0.40 %:0.20 %, respectively. A total of 360 male broilers (day 1) were randomly assigned to 6 groups (5 cages per group, 12 birds per cage). Feed intake and weight gain (WG) were not affected by reductions in NPP from 0.45 % to 0.25 % (at a 2:1 Ca-to-NPP ratio) (P > 0.05), but WG was reduced in birds receiving 0.20 % NPP relative to those receiving 0.45 % NPP (P < 0.05). Femur and tibia quality were not altered by decreasing NPP from 0.45 % to 0.40 %, but further reduction to 0.20 % resulted in declines in bone weight and phosphorus (P) content (P < 0.05). Transcription of duodenal P transporter genes (NaPi-IIb and PiT-1), Ca transporter genes (NCX1, PMCA1b, and CaBP-D28k), as well as renal NaPi-IIa and CaBP-D28k, were upregulated in response to reduced dietary NPP and Ca levels (P < 0.05). In the duodenum, PiT-1 and NaPi-IIb transcription levels were elevated in birds receiving 0.20 %-0.40 % NPP relative to those receiving 0.45 % NPP (P < 0.05). Similarly, NCX1, PMCA1b, and CaBP-D28k transcription levels were increased in birds fed 0.40 %-0.50 % Ca relative to those fed 0.90 % Ca (P < 0.05). In the kidney, NaPi-IIa transcription levels were higher in birds fed 0.20 %-0.25 % NPP than in those receiving 0.45 % NPP (P < 0.05), whereas CaBP-D28k expression increased in birds receiving 0.40 %-0.60 % Ca relative to those receiving 0.90 % Ca (P < 0.05). These findings suggest that, under a constant Ca:NPP ratio of 2:1, moderate reductions in Ca and NPP do not impair growth performance of broilers (1-21 d) and may enhance mineral absorption via transcriptional up-regulation of transporter genes in the intestine and kidney.
Automating Evaluation of AI Text Generation in Healthcare with a Large Language Model (LLM)-as-a-Judge
medRxiv · 2025-04-22 · 16 citations
preprint · Open access
Electronic Health Records (EHRs) store vast amounts of clinical information that healthcare providers find difficult to summarize and synthesize into details relevant to their practice. To reduce this cognitive load, generative AI with Large Language Models (LLMs) has emerged to automatically summarize patient records into clear, actionable insights. However, LLM summaries need to be precise and free from errors, making evaluation of summary quality necessary. While human experts are the gold standard for evaluations, their involvement is time-consuming and costly. Therefore, we introduce and validate an automated method for evaluating real-world EHR multi-document summaries using an LLM as the evaluator, referred to as LLM-as-a-Judge. Benchmarking against the validated Provider Documentation Summarization Quality Instrument (PDSQI)-9 for human evaluation, our LLM-as-a-Judge framework demonstrated strong inter-rater reliability with human evaluators. GPT-o3-mini achieved the highest intraclass correlation coefficient of 0.818 (95% CI 0.772, 0.854), with a median score difference of 0 from human evaluators, and completed evaluations in just 22 seconds. Overall, the reasoning models excelled in inter-rater reliability, particularly in evaluations that require advanced reasoning and domain expertise, outperforming non-reasoning models, those trained on the task, and multi-agent workflows. Cross-task validation on the Problem Summarization task similarly confirmed high reliability. By automating high-quality evaluations, medical LLM-as-a-Judge offers a scalable, efficient solution to rapidly identify accurate and safe AI-generated summaries in healthcare settings.
Triple adjuvant therapy with transarterial chemoembolization, lenvatinib, and programmed death-1 inhibitors improves short-term recurrence control in high-risk patients with resected intermediate-stage hepatocellular carcinoma
International Journal of Surgery · 2025-11-20
article · Open access · 1st author · Corresponding
BACKGROUND: Intermediate-stage hepatocellular carcinoma (HCC) following curative liver resection (LR) is associated with high recurrence rates and poor survival outcomes. Current studies have found that transarterial chemoembolization can improve overall survival rates and disease-free survival in patients with intermediate-stage HCC after surgery. However, the benefits of this treatment are limited. This study aimed to evaluate the benefit of triple adjuvant therapy, namely transarterial chemoembolization (TACE) combined with antiangiogenic therapy (lenvatinib) plus programmed death-1 inhibitors (TAP), as an adjuvant treatment for resected intermediate-stage HCC, compared to TACE alone, and to identify patient subgroups most likely to benefit from the TAP regimen. MATERIALS AND METHODS: We collected data of patients with intermediate-stage HCC who underwent LR from December 2019 to December 2022. Disease-free survival (DFS) was compared between patients receiving TACE and those receiving TAP using propensity score matching. The 2-year recurrence rate in the entire cohort was predicted based on the TACE group, and the association between the predicted and observed recurrences was tested. OUTCOMES: A total of 571 patients were included in our study, with 102 receiving TAP and 469 receiving TACE. Compared with TACE alone, TAP showed better DFS (HR: 0.74; 95% CI: 0.56-0.98; P = 0.037), with median: 22.0 months (95% CI: 19.0-24.0) vs 25.6 months (95% CI: 24.0-40.0). The lines for TACE and TAP intersected at 37%, indicating that patients with a predicted 2-year recurrence risk >37% would significantly benefit from TAP. TAP therapy demonstrated a manageable adverse event (AE) profile, with an overall AE rate of 79.4%: grade 1-2 accounted for 65.7%, grade 3 for 11.8%, and grade 4 for 1.9%.
CONCLUSIONS: TAP therapy demonstrated significant potential as an adjuvant treatment for intermediate-stage HCC following curative resection, offering superior recurrence control and survival benefits compared to TACE alone. Patients with a predicted recurrence risk >37% showed improved DFS outcomes with TAP therapy, suggesting that recurrence risk thresholds could guide tailored treatment decisions in clinical practice. The manageable safety profile of TAP further supports its feasibility in the postoperative setting. Our findings represent a substantial advancement in the field of adjuvant therapy for resected intermediate-stage HCC.
MoMA: a mixture-of-multimodal-agents architecture for enhancing clinical prediction modelling
npj Digital Medicine · 2025-12-09 · 1 citation
article · Open access
Multimodal electronic health record (EHR) data provide richer, complementary insights into patient health compared to single-modality data. However, effectively integrating diverse data modalities for clinical prediction modeling remains challenging due to the substantial data requirements. We introduce a novel architecture, Mixture-of-Multimodal-Agents (MoMA), designed to leverage multiple large language model (LLM) agents for clinical prediction tasks using multimodal EHR data. MoMA employs specialized LLM agents ("specialist agents") to convert non-textual modalities, such as medical images and laboratory results, into structured textual summaries. These summaries, together with clinical notes, are combined by another LLM ("aggregator agent") to generate a unified multimodal summary, which is then used by a third LLM ("predictor agent") to produce clinical predictions. Evaluating MoMA with different modality combinations and prediction settings, MoMA outperforms existing methods on three prediction tasks using private datasets, highlighting its enhanced accuracy and flexibility across various tasks.
A novel prognostic nomogram to predict survival of patients with esophageal squamous cell carcinoma after definitive chemoradiotherapy
Journal of Thoracic Disease · 2025-08-01
article · Open access · 1st author · Corresponding
Background: Against the backdrop of esophageal cancer's high global incidence, the dominant role of esophageal squamous cell carcinoma (ESCC) with poor prognosis, limited surgical opportunities, and the American Joint Committee on Cancer (AJCC) staging system's insufficiency, there is an urgent need to develop a prognostic nomogram for ESCC patients undergoing chemoradiotherapy. The purpose of this study was to establish a clinical nomogram for effectively predicting overall survival (OS) for patients with non-operated ESCC after definitive chemoradiotherapy. Methods: A total of 869 patients diagnosed with ESCC from 2010 to 2015 were retrieved from the Surveillance, Epidemiology, and End Results database. The nomogram was developed based on independent predictors determined by multivariate Cox regression analyses. Additional external validation was conducted on 318 ESCC patients enrolled from The First Affiliated Hospital of Nanjing Medical University. The receiver operating characteristic curve analysis and calibration plot were utilized to assess the predictive discriminative ability and reliability of the nomogram in both the training cohort and external validation cohort. The clinical practicability was evaluated by decision curve analysis and by further comparing the novel model with the eighth edition of the AJCC staging system. Results: The multivariate analysis of the training cohort suggested that age, sex, tumor site, tumor size, clinical T stage, and clinical N stage were significantly associated with OS and were all incorporated into the nomogram. The results suggested that the novel nomogram performed well with good discrimination and agreement and exhibited greater clinical benefit than the AJCC 8th edition staging system. Meanwhile, an online web server based on the new nomogram was developed for convenient clinical practice.
Conclusions: The prognostic nomogram developed in this study demonstrates favorable predictive performance for survival outcomes in ESCC patients receiving definitive chemoradiotherapy. Its discriminative ability, consistency, and clinical benefits surpass those of the AJCC 8th Edition staging system. Additionally, a convenient online tool for the nomogram has been developed. This model can objectively quantify patients' survival risks, providing critical reference for the formulation of individualized treatment strategies and possessing clinical application value.
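The multi-LLM uncertainty abstract above names Jensen-Shannon Divergence as the quantity used to compare models on binary prediction tasks. The sketch below shows only how that divergence can be computed between two models' predicted probabilities for the positive class. It is not the MUSE procedure itself, which the abstract does not spell out, and the function name `bernoulli_jsd` is a hypothetical choice made for illustration.

```python
import math


def bernoulli_jsd(p: float, q: float) -> float:
    """Jensen-Shannon divergence, in bits, between Bernoulli(p) and Bernoulli(q).

    p and q are two models' predicted probabilities of the positive class
    (illustrative inputs, not taken from the paper).
    """

    def kl(a: float, b: float) -> float:
        # KL divergence between Bernoulli(a) and Bernoulli(b),
        # using the convention 0 * log(0) = 0.
        total = 0.0
        for x, y in ((a, b), (1.0 - a, 1.0 - b)):
            if x > 0.0:
                total += x * math.log2(x / y)
        return total

    # The mixture m is nonzero wherever p or q is nonzero,
    # so the divisions inside kl() are always well defined.
    m = (p + q) / 2.0
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

With base-2 logarithms the value lies between 0 (identical predictions) and 1 (one model predicts 0, the other predicts 1), and it is symmetric in its two arguments, which is what makes it convenient for measuring disagreement between a pair of models.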

Recent grants

DMS/NIGMS 2: Unraveling the Role of the Human Microbiome to Advance Precision Medicine
NSF · $600k · 2021–2025

Frequent coauthors

Jifan Gao
University of Wisconsin–Madison
61 shared
Thomas Yu
51 shared
Ivan Brugere
51 shared
Sean D. Mooney
National Institutes of Health
51 shared
Yonghwa Choi
Korea University
50 shared
Jaewoo Kang
50 shared
Łukasz Charzewski
University of Warsaw
50 shared
Renata Retkutė
University of Cambridge
50 shared

Education

Ph.D., Biostatistics
University of Wisconsin-Madison
2004
M.S., Biostatistics
University of Wisconsin-Madison
2001
B.S., Mathematics
University of Science and Technology of China
1998
