
Guanhua Chen
Associate Professor · University of Wisconsin-Madison · Biostatistics and Medical Informatics
Active 1991–2025
About
Guanhua Chen is an Associate Professor at the University of Wisconsin–Madison, affiliated with the Department of Biostatistics and Medical Informatics. His expertise lies in developing statistical and machine learning methods for clinical and biomedical research. He focuses on uncovering intricate patterns in high-dimensional data from sources such as genomics and electronic health records to promote precision medicine. His technical interests include causal inference, reinforcement learning, and empirical processes. Chen is also an Affiliate Faculty member in the Department of Statistics, the College of Letters and Science, and the School of Computer, Data & Information Sciences at UW–Madison.
Research topics
- Artificial Intelligence
- Computer Science
- Medicine
- Econometrics
- Virology
- Telecommunications
- Mathematics
- Internal medicine
- Genetics
- Environmental health
- Bioinformatics
- Physics
- Statistics
- Biology
- Cardiology
- Geography
Selected publications
Multi-agent Cooperative Encirclement Method Based on Reinforcement Learning
2025-08-22
article · 1st author · Corresponding
This paper introduces a reinforcement learning-based multi-agent cooperative encirclement method to tackle target tracking and encirclement in dynamic stochastic environments. The Soft Actor-Critic (SAC) algorithm is employed as the backbone algorithm for multi-agent policy learning, wherein spatial consistency, task completion, and cooperative constraints, alongside rewards for obstacle avoidance, formation angles, pursuit, and stability, are considered jointly, which not only alleviates the sparse reward problem but also strengthens robustness against perception uncertainty. Simulation results show that agents are capable of forming a plausible encircling trajectory in 10 seconds (±0.1 m distance fluctuation, <10° angle deviation) and recovering stability in 12 seconds. A semi-physical test is carried out to validate the robustness of this method, demonstrating its effectiveness in terms of environmental adaptability and real-time operation.
An Information-Theoretic Perspective on Multi-LLM Uncertainty Estimation
medRxiv · 2025-07-10
preprint · Open access
Large language models (LLMs) often behave inconsistently across inputs, indicating uncertainty and motivating the need for its quantification in high-stakes settings. Prior work on calibration and uncertainty quantification often focuses on individual models, overlooking the potential of model diversity. We hypothesize that LLMs make complementary predictions due to differences in training and the Zipfian nature of language, and that aggregating their outputs leads to more reliable uncertainty estimates. To leverage this, we propose MUSE (Multi-LLM Uncertainty via Subset Ensembles), a simple information-theoretic method that uses Jensen-Shannon Divergence to identify and aggregate well-calibrated subsets of LLMs. Experiments on binary prediction tasks demonstrate improved calibration and predictive performance compared to single-model and naïve ensemble baselines.
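As a rough illustration of the quantity the abstract names, the sketch below computes the Jensen-Shannon divergence between two models' predicted probabilities on a binary task and averages a chosen subset. It is not the MUSE selection procedure, which the abstract does not spell out; the model names, probabilities, and the `mean_probability` helper are hypothetical.

```python
import math
from typing import Sequence


def bernoulli_jsd(p: float, q: float) -> float:
    """Jensen-Shannon divergence (in bits) between Bernoulli(p) and Bernoulli(q)."""

    def kl(a: float, b: float) -> float:
        # KL divergence between Bernoulli(a) and Bernoulli(b); 0 * log(0) is treated as 0.
        total = 0.0
        for x, y in ((a, b), (1.0 - a, 1.0 - b)):
            if x > 0.0:
                total += x * math.log2(x / y)
        return total

    m = 0.5 * (p + q)  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_probability(probs: Sequence[float]) -> float:
    """Naive aggregation: average the positive-class probabilities of a subset."""
    return sum(probs) / len(probs)


# Hypothetical positive-class probabilities from three models on one input.
preds = {"model_a": 0.80, "model_b": 0.75, "model_c": 0.20}

print(bernoulli_jsd(preds["model_a"], preds["model_b"]))  # small: the models agree
print(bernoulli_jsd(preds["model_a"], preds["model_c"]))  # larger: the models disagree
print(mean_probability([preds["model_a"], preds["model_b"]]))
```

With base-2 logarithms the divergence lies in [0, 1], so pairwise values are directly comparable across inputs.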
The neuronal and synaptic representations of spatial release from masking in the rat auditory cortex
Frontiers in Neuroscience · 2025-05-14
article · Open access · 1st author
In complex acoustic environments, both humans and animals are frequently exposed to sounds from multiple sources. The detection threshold for a target sound (or probe) can be elevated by interfering sounds (maskers) originating from various locations. This masking effect is reduced when the probe and masker are spatially separated compared to when they are colocalized, thereby improving the perception of the probe. This phenomenon is known as spatial release from masking. Currently, the neuronal and synaptic mechanisms underlying spatial release from masking in the auditory cortex are not fully understood. Here we employed single-unit recording and in vivo whole-cell patch-clamp recording techniques to examine how maskers from different spatial locations influence the detection thresholds of rat primary auditory cortex (A1) neurons in response to probe stimuli. At the cortical neuronal level, the masked detection thresholds of most A1 neurons in response to probes were significantly decreased when maskers were displaced from azimuths colocalized with the probe to other separated azimuths ipsilateral to the recording site. Similarly, at the cortical synaptic level, the masked detection thresholds of A1 neurons, as determined from the amplitude of evoked excitatory postsynaptic currents in response to probes presented at azimuth locations within the contralateral hemifield, were also decreased when maskers were shifted from azimuth locations in the contralateral hemifield to those in the ipsilateral hemifield. This study provides neuronal and synaptic evidence for spatial release from masking in the auditory cortex, advancing our understanding of the mechanisms involved in auditory signal processing in noisy environments.
Not All LoRA Parameters Are Essential: Insights on Inference Necessity
ArXiv.org · 2025-03-30
preprint · Open access · 1st author · Corresponding
Current research on LoRA primarily focuses on minimizing the number of fine-tuned parameters or optimizing its architecture. However, the necessity of all fine-tuned LoRA layers during inference remains underexplored. In this paper, we investigate the contribution of each LoRA layer to the model's ability to predict the ground truth and hypothesize that lower-layer LoRA modules play a more critical role in model reasoning and understanding. To address this, we propose a simple yet effective method to enhance the performance of large language models (LLMs) fine-tuned with LoRA. Specifically, we identify a "boundary layer" that distinguishes essential LoRA layers by analyzing a small set of validation samples. During inference, we drop all LoRA layers beyond this boundary. We evaluate our approach on three strong baselines across four widely-used text generation datasets. Our results demonstrate consistent and significant improvements, underscoring the effectiveness of selectively retaining critical LoRA layers during inference.
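The idea of keeping LoRA updates only below a boundary layer can be sketched as follows. The abstract does not state how the boundary is chosen, so the criterion here (pick the boundary with the best score on a small validation set) is an assumption, and `validation_score`, `pick_boundary_layer`, `lora_mask`, and the toy scores are all hypothetical.

```python
from typing import Callable, List


def pick_boundary_layer(num_layers: int, validation_score: Callable[[int], float]) -> int:
    """Return the boundary k (1..num_layers) with the best validation score.

    validation_score(k) is assumed to evaluate the model with LoRA updates
    kept in layers 0..k-1 and dropped in layers k..num_layers-1.
    """
    # Start from the fully adapted model so ties favor keeping every LoRA layer.
    best_k, best_score = num_layers, validation_score(num_layers)
    for k in range(1, num_layers):
        score = validation_score(k)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def lora_mask(num_layers: int, boundary: int) -> List[bool]:
    """True where the LoRA update is kept at inference, False where it is dropped."""
    return [layer < boundary for layer in range(num_layers)]


# Toy validation scores for a 4-layer model (made-up numbers, for illustration only).
toy_scores = {1: 0.60, 2: 0.71, 3: 0.74, 4: 0.70}
boundary = pick_boundary_layer(4, lambda k: toy_scores[k])
print(boundary, lora_mask(4, boundary))
```

In a real model the mask would decide, per transformer layer, whether the low-rank update is added to the frozen weights during the forward pass.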
Geodata and AI · 2025-11-30 · 1 citation
article · Open access
- The first expert-validated geological database in Macao is established.
- A domain-adapted RAG-based LLM framework is developed for database standardization.
- Interpretability analysis connects retrievals to model decisions.
- Interpretability guides knowledge inputs for human-AI interaction.
Urban geological characterization requires standardizing heterogeneous borehole data subject to interpretive variability from engineering practices. Current linguistic models struggle with dynamic yet limited geological datasets and lack transparent interpretation. This study develops a novel framework that incorporates large language models (LLMs) for intelligent formation-type standardization for urban geological databases. The Macao geological database, comprising 100 boreholes from 21 construction projects, has been established by engineering geologists. Input strategies, model uncertainty, and prediction states are analyzed to reveal performance-semantic relationships, optimizing expert-computer interaction for enhanced performance. Overall, the key contributions are as follows: (1) An expert-validated geological database across Macao is developed for the first time for benchmarking; (2) A domain-adapted retrieval-augmented LLM framework is proposed for geological standardization; (3) Interpretability analysis is performed to link retrievals to model behaviors; (4) Model interpretability guides further knowledge inputs to enhance performance. The study addresses critical gaps in enhancing transparent LLMs, supporting reliable human-AI collaboration for advanced geotechnical applications.
Poultry Science · 2025-10-25 · 1 citation
article · Open access
This study investigated the effects of dietary non-phytate phosphorus (NPP) levels on growth performance, bone mineralization, and mineral transporting gene expression in broilers (1-21 d), under a 2:1 calcium (Ca)-to-NPP ratio. The six Ca:NPP levels were 0.90 %:0.45 %, 0.80 %:0.40 %, 0.70 %:0.35 %, 0.60 %:0.30 %, 0.50 %:0.25 %, and 0.40 %:0.20 %, respectively. A total of 360 male broilers (day 1) were randomly assigned to 6 groups (5 cages per group, 12 birds per cage). Feed intake and weight gain (WG) were not affected by reductions in NPP from 0.45 % to 0.25 % (at a 2:1 Ca-to-NPP ratio) (P > 0.05), but WG was reduced in birds receiving 0.20 % NPP relative to those receiving 0.45 % NPP (P < 0.05). Femur and tibia quality were not altered by decreasing NPP from 0.45 % to 0.40 %, but further reduction to 0.20 % resulted in declines in bone weight and phosphorus (P) content (P < 0.05). Transcription of duodenal P transporter genes (NaPi-IIb and PiT-1), Ca transporter genes (NCX1, PMCA1b, and CaBP-D28k), as well as renal NaPi-IIa and CaBP-D28k, were upregulated in response to reduced dietary NPP and Ca levels (P < 0.05). In the duodenum, PiT-1 and NaPi-IIb transcription levels were elevated in birds receiving 0.20 %-0.40 % NPP relative to those receiving 0.45 % NPP (P < 0.05). Similarly, NCX1, PMCA1b, and CaBP-D28k transcription levels were increased in birds fed 0.40 %-0.50 % Ca relative to those fed 0.90 % Ca (P < 0.05). In the kidney, NaPi-IIa transcription levels were higher in birds fed 0.20 %-0.25 % NPP than in those receiving 0.45 % NPP (P < 0.05), whereas CaBP-D28k expression increased in birds receiving 0.40 %-0.60 % Ca relative to those receiving 0.90 % Ca (P < 0.05). These findings suggest that, under a constant Ca:NPP ratio of 2:1, moderate reductions in Ca and NPP do not impair growth performance of broilers (1-21 d) and may enhance mineral absorption via transcriptional up-regulation of transporter genes in the intestine and kidney.
medRxiv · 2025-04-22 · 16 citations
preprint · Open access
Electronic Health Records (EHRs) store vast amounts of clinical information, from which healthcare providers find it difficult to summarize and synthesize the details relevant to their practice. To reduce the cognitive load on providers, generative AI based on large language models (LLMs) has emerged to automatically summarize patient records into clear, actionable insights. However, LLM summaries need to be precise and free from errors, making evaluations of summary quality necessary. While human experts are the gold standard for evaluations, their involvement is time-consuming and costly. Therefore, we introduce and validate an automated method for evaluating real-world EHR multi-document summaries using an LLM as the evaluator, referred to as LLM-as-a-Judge. Benchmarking against the validated Provider Documentation Summarization Quality Instrument (PDSQI)-9 for human evaluation, our LLM-as-a-Judge framework demonstrated strong inter-rater reliability with human evaluators. GPT-o3-mini achieved the highest intraclass correlation coefficient of 0.818 (95% CI 0.772, 0.854), with a median score difference of 0 from human evaluators, and completed evaluations in just 22 seconds. Overall, the reasoning models excelled in inter-rater reliability, particularly in evaluations that require advanced reasoning and domain expertise, outperforming non-reasoning models, those trained on the task, and multi-agent workflows. Cross-task validation on the Problem Summarization task similarly confirmed high reliability. By automating high-quality evaluations, medical LLM-as-a-Judge offers a scalable, efficient solution to rapidly identify accurate and safe AI-generated summaries in healthcare settings.
International Journal of Surgery · 2025-11-20
article · Open access · 1st author · Corresponding
BACKGROUND: Intermediate-stage hepatocellular carcinoma (HCC) following curative liver resection (LR) is associated with high recurrence rates and poor survival outcomes. Current studies have found that transarterial chemoembolization can improve overall survival rates and disease-free survival in patients with intermediate-stage HCC after surgery. However, the benefits of this treatment are limited. This study aimed to evaluate the benefit of triple adjuvant therapy, namely transarterial chemoembolization (TACE) combined with antiangiogenic therapy (lenvatinib) plus programmed death-1 inhibitors (TAP), as an adjuvant treatment for resected intermediate-stage HCC, compared to TACE alone, and to identify patient subgroups most likely to benefit from the TAP regimen.
MATERIALS AND METHODS: We collected data on patients with intermediate-stage HCC who underwent LR from December 2019 to December 2022. Disease-free survival (DFS) was compared between patients receiving TACE and those receiving TAP using propensity score matching. The 2-year recurrence rate in the entire cohort was predicted based on the TACE group, and the association between the predicted and observed recurrences was tested.
OUTCOMES: A total of 571 patients were included in our study, with 102 receiving TAP and 469 receiving TACE. Compared with TACE alone, TAP showed better DFS (HR: 0.74; 95% CI: 0.56-0.98; P = 0.037), with median: 22.0 months (95% CI: 19.0-24.0) vs 25.6 months (95% CI: 24.0-40.0). The lines for TACE and TAP intersected at 37%, indicating that patients with a predicted 2-year recurrence risk >37% would significantly benefit from TAP. TAP therapy demonstrated a manageable adverse event (AE) profile, with an overall AE rate of 79.4%, with grade 1-2 accounting for 65.7%, grade 3 for 11.8%, and grade 4 for 1.9%.
CONCLUSIONS: TAP therapy demonstrated significant potential as an adjuvant treatment for intermediate-stage HCC following curative resection, offering superior recurrence control and survival benefits compared to TACE alone. Patients with a predicted recurrence risk >37% showed improved DFS outcomes with TAP therapy, suggesting that recurrence risk thresholds could guide tailored treatment decisions in clinical practice. The manageable safety profile of TAP further supports its feasibility in the postoperative setting. Our findings represent a substantial advancement in the field of adjuvant therapy for resected intermediate-stage HCC.
MoMA: a mixture-of-multimodal-agents architecture for enhancing clinical prediction modelling
npj Digital Medicine · 2025-12-09 · 1 citations
article · Open access
Multimodal electronic health record (EHR) data provide richer, complementary insights into patient health compared to single-modality data. However, effectively integrating diverse data modalities for clinical prediction modeling remains challenging due to the substantial data requirements. We introduce a novel architecture, Mixture-of-Multimodal-Agents (MoMA), designed to leverage multiple large language model (LLM) agents for clinical prediction tasks using multimodal EHR data. MoMA employs specialized LLM agents ("specialist agents") to convert non-textual modalities, such as medical images and laboratory results, into structured textual summaries. These summaries, together with clinical notes, are combined by another LLM ("aggregator agent") to generate a unified multimodal summary, which is then used by a third LLM ("predictor agent") to produce clinical predictions. Evaluated with different modality combinations and prediction settings, MoMA outperforms existing methods on three prediction tasks using private datasets, highlighting its enhanced accuracy and flexibility across various tasks.
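The three-stage flow described in the abstract (specialist agents, then an aggregator, then a predictor) can be sketched as a plain function pipeline. This is only a structural illustration under stated assumptions: the agents below are stub Python functions rather than LLM calls, and `moma_predict`, the `Agent` type, and all inputs are hypothetical names not taken from the paper.

```python
from typing import Callable, Dict

# An "agent" is modeled here as any function from text to text (an assumption).
Agent = Callable[[str], str]


def moma_predict(
    non_text_inputs: Dict[str, str],
    clinical_notes: str,
    specialists: Dict[str, Agent],
    aggregator: Agent,
    predictor: Agent,
) -> str:
    """Run specialist -> aggregator -> predictor, mirroring the abstract's description."""
    # 1. Each specialist turns one non-textual modality into a textual summary.
    summaries = [
        specialists[modality](payload) for modality, payload in non_text_inputs.items()
    ]
    # 2. The aggregator combines the summaries with the clinical notes.
    unified_summary = aggregator("\n".join(summaries + [clinical_notes]))
    # 3. The predictor produces the final prediction from the unified summary.
    return predictor(unified_summary)


# Stub agents standing in for LLM calls (illustrative only).
stub_specialists = {"lab": lambda payload: f"lab summary: {payload}"}
stub_aggregator = lambda text: text.upper()
stub_predictor = lambda summary: "positive" if "ELEVATED" in summary else "negative"

result = moma_predict(
    {"lab": "troponin elevated"},
    "patient reports chest pain",
    stub_specialists,
    stub_aggregator,
    stub_predictor,
)
print(result)
```

Keeping each stage behind the same text-to-text interface is one way to let modalities be added or removed without changing the aggregator or predictor.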
Journal of Thoracic Disease · 2025-08-01
article · Open access · 1st author · Corresponding
Background: Against the backdrop of esophageal cancer's high global incidence, the dominant role of esophageal squamous cell carcinoma (ESCC) with poor prognosis, limited surgical opportunities, and the American Joint Committee on Cancer (AJCC) staging system's insufficiency, there is an urgent need to develop a prognostic nomogram for ESCC patients undergoing chemoradiotherapy. The purpose of this study was to establish a clinical nomogram for effectively predicting overall survival (OS) for patients with non-operated ESCC after definitive chemoradiotherapy.
Methods: A total of 869 patients diagnosed with ESCC from 2010 to 2015 were retrieved from the Surveillance, Epidemiology, and End Results database. The nomogram was developed based on independent predictors determined by multivariate Cox regression analyses. Additional external validation was conducted on 318 ESCC patients enrolled from The First Affiliated Hospital of Nanjing Medical University. Receiver operating characteristic curve analysis and calibration plots were used to assess the discriminative ability and reliability of the nomogram in both the training cohort and the external validation cohort. Clinical practicability was evaluated by decision curve analysis and by comparing the novel model with the eighth edition of the AJCC staging system.
Results: The multivariate analysis of the training cohort suggested that age, sex, tumor site, tumor size, clinical T stage, and clinical N stage were significantly associated with OS, and all were incorporated into the nomogram. The results suggested that the novel nomogram performed well, with good discrimination and agreement, and exhibited greater clinical benefit than the AJCC 8th edition staging system. Meanwhile, an online web server based on the new nomogram was developed for convenient clinical practice.
Conclusions: The prognostic nomogram developed in this study demonstrates favorable predictive performance for survival outcomes in ESCC patients receiving definitive chemoradiotherapy. Its discriminative ability, consistency, and clinical benefits surpass those of the AJCC 8th edition staging system. Additionally, a convenient online tool for the nomogram has been developed. This model can objectively quantify patients' survival risks, providing a critical reference for the formulation of individualized treatment strategies and possessing clinical application value.
Recent grants
DMS/NIGMS 2: Unraveling the Role of the Human Microbiome to Advance Precision Medicine
NSF · $600k · 2021–2025
Frequent coauthors
- 61 shared
Jifan Gao
University of Wisconsin–Madison
- 51 shared
Thomas Yu
- 51 shared
Ivan Brugere
- 51 shared
Sean D. Mooney
National Institutes of Health
- 50 shared
Yonghwa Choi
Korea University
- 50 shared
Jaewoo Kang
- 50 shared
Łukasz Charzewski
University of Warsaw
- 50 shared
Renata Retkutė
University of Cambridge
Education
- 2004
Ph.D., Biostatistics
University of Wisconsin-Madison
- 2001
M.S., Biostatistics
University of Wisconsin-Madison
- 1998
B.S., Mathematics
University of Science and Technology of China