About
Yuxin Chen is a Professor of Statistics and Data Science and of Electrical and Systems Engineering at the University of Pennsylvania. His research focuses on machine learning theory, particularly diffusion models, large language models (LLMs), and reinforcement learning (RL), as well as statistics and optimization. He has a strong background in electrical engineering and has previously held positions at Princeton University and Stanford University.
Research topics
- Medicine
- Internal medicine
- Virology
- Computer Science
- Political Science
- Emergency medicine
- Mathematics
- Oncology
- Medical emergency
- Pediatrics
- Gastroenterology
- Demography
- Statistics
- Intensive care medicine
Selected publications
2026-03-12
peer-review · Senior author
International Journal for Equity in Health · 2026-04-17
article · Open access · Corresponding
This study analyzes changes in the equity of health resources and service utilization (HRSU) for infectious diseases and assesses the impact of COVID-19 and its associated socioeconomic changes on this equity. It uses 14-year provincial panel data from 31 provinces in mainland China, sourced from the China Health Statistics Yearbook (2011–2024). Equity in infectious disease HRSU was assessed using four key indicators: health resource indicators were the number of CDC physicians and infectious disease beds, while service utilization indicators were the infectious disease outpatient rate and inpatient rate. A GDP-weighted Theil index was applied to analyze equity before and after the COVID-19 pandemic, and an Interrupted Time Series Analysis (ITSA) was conducted to quantify the pandemic’s impact on equity. Post-COVID-19, the Theil index decreased for CDC physicians, infectious disease beds, utilization of outpatient service (UOS) and utilization of inpatient service (UIS). ITSA showed that the pandemic had a statistically significant impact on the Theil index of infectious disease beds and UIS, with differences between the pre-pandemic and post-pandemic slopes of -0.0132 and -0.0077, respectively (p < 0.001). Conversely, the Theil index of UOS decreased immediately after the pandemic (β2 = -0.0106, p < 0.001), but there was no sustained effect. Epidemic shocks have catalyzed policy responses, driving sustained gains in equity for infectious disease beds and inpatient service access, but only transient improvements in equity of UOS. The epidemic’s socio-economic impacts and inherent regional-economic disparities exert long-term effects on healthcare equity. A Health Priority Development Strategy could contribute to lowering infectious disease incidence and promoting sustained equity in health service utilization.
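To make the equity measure in the abstract above concrete, here is a minimal sketch of one common share-based form of a weighted Theil index. The abstract does not give the exact formula the authors used, so the function name `weighted_theil`, the choice of formula, and the sample numbers are all assumptions for illustration only.

```python
import math

def weighted_theil(resources, weights):
    """Share-based Theil index (assumed form, not taken from the paper).

    resources: resource counts per region (e.g. infectious disease beds).
    weights:   the weighting variable per region (e.g. GDP).
    Returns sum_i s_i * ln(s_i / q_i), where s_i is region i's share of
    resources and q_i its share of the weighting variable. The value is 0
    when resources are distributed exactly in proportion to the weights.
    """
    total_r = sum(resources)
    total_w = sum(weights)
    index = 0.0
    for r, w in zip(resources, weights):
        s = r / total_r
        q = w / total_w
        if s > 0:  # a region with no resources contributes 0 in the limit
            index += s * math.log(s / q)
    return index

# Hypothetical data: two regions with equal beds but unequal GDP.
proportional = weighted_theil([10, 30], [1, 3])
unequal = weighted_theil([10, 10], [1, 3])
```

Under this form, a falling index over time would indicate that the resource distribution is moving closer to the distribution of the weighting variable.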
Nature Communications · 2026-04-17
article · Open access · Senior author
The effectiveness of COVID-19 vaccination in children and adolescents with prior SARS-CoV-2 infection remains unclear, particularly for Omicron subvariants. We evaluate vaccine effectiveness against reinfection with Omicron BA.1/BA.2, BA.4/BA.5, XBB, and later subvariants among 5- to 17-year-olds using data from the RECOVER initiative, a national electronic health record database covering 37 U.S. children’s hospitals and health institutions. We emulate target trials by age group and variant period, comparing previously infected participants between January 2022 and August 2023. During the BA.1/BA.2 period, vaccination reduces the risk of reinfection, with effectiveness of 62% in children and 65% in adolescents. During the BA.4/BA.5 period, effectiveness in children is 57%, whereas no statistically significant protection is observed in adolescents. During the XBB and later period, no significant protection is observed in either group. In summary, COVID-19 vaccination provides protection against reinfection during the early and mid-Omicron periods in previously infected pediatric populations, but effectiveness declines for later variants. This study found COVID-19 vaccination offers added protection against reinfection in children and adolescents with prior infection but variable effectiveness across variants. Findings support vaccination benefits beyond infection-acquired immunity.
Frontiers in Oncology · 2026-04-15 · 1 citation
article · Open access · Senior author
Concurrent development of therapy-related acute myeloid leukemia (t-AML) and lymph node tuberculosis (LNTB) following comprehensive anti-tumor therapy for locally advanced lung squamous cell carcinoma (LSCC) is extremely rare in clinical practice. This is not a new biological concept but represents a rarely documented clinical scenario in the setting of neoadjuvant chemoimmunotherapy, surgery, and anti-PD-1 maintenance therapy. Herein, we systematically summarize the clinical features, pathogenesis and individualized therapeutic strategies of these two concurrent rare complications based on a single rare case and the latest relevant literature, to provide a reference for clinical diagnosis and treatment. A 54-year-old male patient was initially diagnosed with locally advanced LSCC. After four cycles of neoadjuvant therapy with carboplatin, albumin-bound paclitaxel and pembrolizumab, the tumor lesion regressed markedly. Thoracoscopic right upper lobectomy was then performed, followed by maintenance immunotherapy with single-agent pembrolizumab postoperatively. Four months after maintenance therapy, the patient developed abnormalities on routine blood work, low-grade fever, fatigue and superficial lymphadenopathy. t-AML was confirmed by bone marrow aspiration, immunophenotyping, gene mutation and cytogenetic examinations, accompanied by breast cancer susceptibility gene 2 (BRCA2), DNA (cytosine-5)-methyltransferase 3 alpha (DNMT3A), and isocitrate dehydrogenase 2 (IDH2) mutations, and positivity for lysine (K)-specific methyltransferase 2A partial tandem duplication (KMT2A-PTD). Meanwhile, LNTB was diagnosed by lymph node aspiration pathology combined with tuberculosis-specific assays. The patient was treated with an optimized quadruple anti-tuberculosis regimen (HZEM), and induction chemotherapy for AML with VA regimen (venetoclax plus azacitidine) plus revumenib, and supportive therapy.
Subsequently, the patient achieved partial remission of leukemia, with no uncontrollable severe adverse events. In this case, LSCC was managed with neoadjuvant therapy, thoracoscopic right upper lobectomy and postoperative maintenance therapy with single-agent pembrolizumab. The development of t-AML is primarily driven by cytotoxic DNA damage induced by chemotherapeutic agents, whereas the potential contribution of immune checkpoint inhibitors via immune microenvironmental disturbance remains largely speculative and insufficiently documented by current clinical evidence. The impaired immune function caused by comprehensive anti-tumor therapy may further elevate the risk of LNTB. The overlapping clinical manifestations of the two concurrent diseases substantially increase diagnostic difficulty. Timely and thorough bone marrow examination, lymph node pathological biopsy and tuberculosis-specific screening are the keys to early and accurate diagnosis.
Cancer Biotherapy and Radiopharmaceuticals · 2026-05-04
article · Senior author · Corresponding
Background: In this single-center retrospective study, the authors evaluated whether real-time ultrasound-guided positioning of an implantable venous access port catheter tip at the superior vena cava-right atrial junction (SVC-RAJ) reduces the risk of catheter-related thrombosis (CRT) in adult patients with cancer and developed a multivariable risk prediction model to support individualized prevention. Methods: Clinical data from 600 consecutive patients who underwent port implantation at Zhongshan People’s Hospital were analyzed. Patients were grouped according to final catheter tip position (SVC-RAJ versus non-SVC-RAJ), and CRT incidence was compared between groups. Results: The overall incidence of CRT was 30.33% (182/600) and was significantly lower in the SVC-RAJ group than in the non-SVC-RAJ group (22.42% vs. 38.73%, p < 0.001). In multivariable analysis, catheter tip positioning at the SVC-RAJ remained an independent protective factor (odds ratio = 0.517, 95% confidence interval [CI]: 0.353–0.756). Age, body mass index (BMI), tumor stage, neutrophil-to-lymphocyte ratio, D-dimer level, catheterization duration, and prophylactic anticoagulation status were also independently associated with CRT. A nomogram integrating these variables demonstrated good discrimination (area under the curve = 0.866, 95% CI: 0.837–0.895), with a sensitivity of 70.33% and a specificity of 85.89%. Performance across specific age or BMI strata was not separately evaluated in this study, and further stratified validation in larger datasets is needed to assess model consistency across demographic subgroups. Conclusions: These findings support ultrasound-guided SVC-RAJ positioning as a clinically relevant strategy for reducing CRT risk and maintaining reliable venous access in contemporary oncology care pathways.
Meta-Analysis and Federated Learning over Decentralized Distributed Research Networks
Annual Review of Biomedical Data Science · 2025-08-11 · 2 citations
review · Open access · Senior author
Distributed research networks have transformed modern clinical research by enabling large-scale, multi-institutional collaborations while maintaining patient privacy. Two prominent methodologies within these frameworks, meta-analysis and federated learning, address the challenges of synthesizing evidence from decentralized data. Meta-analysis aggregates study-level results to provide robust, interpretable estimates, making it a cornerstone of evidence synthesis for association studies. Federated learning complements this by enabling complex downstream tasks, such as predictive modeling and counterfactual inference, through privacy-preserving distributed algorithms. Federated learning facilitates communication-efficient computation and adapts seamlessly to heterogeneous datasets across diverse institutions. This review emphasizes the complementary strengths of federated learning's scalability, flexibility, and readiness for implementation alongside meta-analysis's robust frameworks for evidence synthesis and aggregation in clinical research. Integrations of synthetic data, artificial intelligence (AI)-enhanced harmonization, and hybrid human-AI frameworks are proposed as future directions, promising to further advance both methodologies and enhance their combined impact on privacy-conscious, data-driven healthcare research.
Novel KDM3B Variants in Two Chinese Patients With Global Developmental Delay and Autism
International Journal of Developmental Neuroscience · 2025-12-01
article
BACKGROUND: Haploinsufficiency of KDM3B has been linked to developmental delay, intellectual disability, autism spectrum disorder (ASD) and immunodeficiency, a condition known as developmental delay, intellectual disability, joint contractures and facial dysmorphism; immunodeficiency; and short stature (DIJOS) syndrome. However, the phenotypic spectrum is not fully defined, and genotype-phenotype associations need to be further studied. METHODS: Here we report on two unrelated patients with global developmental delay and autistic features and provide detailed clinical information of both patients, including cranial magnetic resonance imaging (MRI), electroencephalography (EEG), metabolic screening, hearing assessment and neurodevelopmental testing. Whole exome sequencing (WES) was performed for potential genetic causes, and candidate variants were verified via Sanger sequencing. Interpretation of variants was performed in accordance with ACMG guidelines. RESULTS: For Patient 1, we detected a de novo pathogenic heterozygous nonsense variant in KDM3B (c.1970C>G, p.Ser657*). The canonical splice-site variant (c.3973-1G>C) in KDM3B that we found in Patient 2 was classified as likely pathogenic. Clinically, Patient 1 had severe developmental delay, deafness and autistic tendencies, whereas Patient 2 had milder delay and autistic behaviours with normal hearing. The splice-site variant in Patient 2 may disrupt an upstream intron and is predicted to influence splicing, which may elicit nonsense-mediated mRNA decay and contribute a more severe interference, comparatively. CONCLUSION: Our results broaden the mutational and phenotypic spectrum of KDM3B-related disorder and highlight the phenotypic heterogeneity even in patients with the same type of variant. Functional analysis underscores the importance of KDM3B in neurodevelopment, optic nerve formation and cognition.
Additional studies will be required to define the differences in clinical phenotype at the molecular level.
Assessing Covariate Balance With Small Sample Sizes
Statistics in Medicine · 2025-08-01 · 5 citations
article · Open access
Propensity score adjustment addresses confounding by balancing covariates in subject treatment groups through matching, stratification, or weighting. Diagnostics test the success of adjustment. For example, if the standardized mean difference (SMD) for a relevant covariate exceeds a threshold like 0.1, the covariate is considered imbalanced and the study may be biased. Unfortunately, for studies with small or moderate numbers of subjects, the probability of identifying a study as biased because of chance imbalance can be grossly larger than a given nominal level like 0.05, yet that chance imbalance may not cause significant bias. In this paper, we illustrate that chance imbalance is operative in real-world settings even for moderate sample sizes of 2000. We identify a previously unrecognized challenge: as meta-analyses increase the precision of an effect estimate, the diagnostics must also undergo meta-analysis for a corresponding increase in precision. We propose an alternative diagnostic that checks whether the SMD statistically significantly exceeds the threshold. Through simulation and real-world data, we find that this diagnostic achieves a better trade-off of type 1 error rate and power than either standard nominal threshold tests or no testing, for sample sizes from 250 to 4000 and for 20 to 100 000 covariates. We confirm that in network studies, meta-analysis of effect estimates must be accompanied by meta-analysis of the diagnostics or else systematic confounding may overwhelm the estimated effect. Our procedure supports the review of large numbers of covariates, enabling more rigorous diagnostics.
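The diagnostic described in the abstract above can be sketched as follows. This is an illustrative sketch only: the abstract does not specify the paper's test statistic, so the one-sided z-test, the large-sample standard error approximation, and the names `standardized_mean_difference` and `smd_significantly_exceeds` are assumptions.

```python
import math

def standardized_mean_difference(treated, control):
    """SMD of one covariate, using the pooled standard deviation."""
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    var_t = sum((x - mean_t) ** 2 for x in treated) / (len(treated) - 1)
    var_c = sum((x - mean_c) ** 2 for x in control) / (len(control) - 1)
    pooled_sd = math.sqrt((var_t + var_c) / 2)
    return (mean_t - mean_c) / pooled_sd

def smd_significantly_exceeds(treated, control, threshold=0.1, z_crit=1.645):
    """Assumed one-sided test: is |SMD| significantly above the threshold?

    Uses a standard large-sample approximation to the standard error of a
    standardized mean difference; the paper's actual procedure may differ.
    """
    smd = standardized_mean_difference(treated, control)
    n_t, n_c = len(treated), len(control)
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + smd ** 2 / (2 * (n_t + n_c)))
    return (abs(smd) - threshold) / se > z_crit

# Hypothetical small sample: the SMD is well above 0.1, but with only five
# subjects per group the excess over the threshold is not significant.
small_flag = smd_significantly_exceeds([2, 3, 4, 5, 6], [1, 2, 3, 4, 5])

# Hypothetical larger sample with a clear shift between groups.
control = list(range(1000))
treated = [x + 100 for x in control]
large_flag = smd_significantly_exceeds(treated, control)
```

The contrast between the two calls mirrors the abstract's point: a plain threshold rule would flag both samples, whereas a significance-based check flags only the one where the imbalance is unlikely to be due to chance.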
Prescribing of GLP-1 Receptor Agonists for Adolescents with Obesity and Associated Disparities
medRxiv · 2025-10-07
preprint · Open access
This retrospective cohort study examined national prescribing patterns of glucagon-like peptide-1 receptor agonists (GLP1RAs) for U.S. adolescents with obesity using electronic health record data from over 2 million patients aged 12 to 17 years between 2021 and 2025. Among eligible adolescents, only 0.9% received a GLP1RA prescription, though use increased after semaglutide approval in December 2022, with semaglutide rapidly surpassing liraglutide and off-label tirzepatide use rising by 2025. Prescribing was strongly associated with clinical and sociodemographic factors: adolescents with severe obesity were more likely to receive a prescription, while males, Hispanic/Latino and non-Hispanic Black youth, non-English/Spanish speakers, those living in rural or socioeconomically disadvantaged areas, and those with Medicaid or self-pay coverage were significantly less likely to receive prescriptions. These findings highlight growing uptake of GLP1RAs but reveal substantial disparities in access, suggesting insurance barriers and structural inequities may limit availability for groups disproportionately affected by obesity.
Journal of the American Medical Informatics Association · 2025-08-02 · 1 citation
article · Senior author
OBJECTIVE: Patients of different races have different outcomes following renal transplantation. Patients of different races also undergo renal transplantation at different hospitals. We used a novel decentralized multisite approach to quantitatively assess the effect of site of care on racial disparities between non-Hispanic Black (NHB) and non-Hispanic White (NHW) patients in post-transplantation survival times. MATERIALS AND METHODS: In this study, we develop a communication-efficient federated learning algorithm to assess site-of-care associated racial disparities based on decentralized time-to-event data, called Communication-Efficient Distributed Analysis for Racial Disparity in Time-to-event Data (CEDAR-t2e). The algorithm includes 2 modules. Module I estimates the site-specific proportional hazards model for time-to-event outcomes in a distributed manner, using Poissonization to simplify the estimation procedure. Based on the estimated results from Module I, Module II calculates how long the kidney failure time of NHB patients would be extended had they been admitted to transplant centers in the same distribution as NHW patients were admitted. RESULTS: With application to United States Renal Data System data covering 39 043 patients across 73 transplant centers, we found no evidence suggesting the presence of site-of-care associated racial disparities in post-transplantation survival times. In particular, restricting to one year after transplantation, the counterfactual graft failure time would have been extended by only 0.61 days on average if NHB patients had the same admission distribution to transplant centers as NHW patients. DISCUSSION: The proposed approach offers a quantitative measure to evaluate site-of-care associated racial disparities.
CONCLUSION: Our approach has the potential to be extended to investigate site-of-care related disparities in other time-to-event outcomes, thus promoting health equity and improving patient health in various fields.
Recent grants
CICADA: clinical informatics and computational approaches for drug-repositioning of AD/ADRD
NIH · $1.5M · 2021–2025
Dynamic learning for post-vaccine event prediction using temporal information in VAERS
NIH · $3.4M · 2017–2024
Multivariate Meta-analysis of Diagnostic Tests
NIH · $100k · 2014–2016
A General Framework to Account for Outcome Reporting Bias in Systematic Reviews
NIH · $1.4M · 2017–2021
Surrogate Augmented Deep Predictive Learning for Retinopathy of Prematurity
NIH · $482k · 2023–2026
Frequent coauthors
- 106 shared
Jiayi Tong
University of Pennsylvania
- 86 shared
Haitao Chu
- 83 shared
Fei Wang
Boehringer Ingelheim (United States)
- 76 shared
Chongliang Luo
Washington University in St. Louis
- 70 shared
Christopher R. Forrest
- 69 shared
Payal Patel
National Institutes of Health
- 64 shared
Thomas W. Carton
Louisiana Public Health Institute
- 64 shared
Christopher B. Forrest
Children's Hospital of Philadelphia
Education
- 2015
Ph.D., Electrical Engineering
Stanford University
- 2017
Other, Statistics
Stanford University
Awards & honors
- SIAM Activity Group on Imaging Science Best Paper Prize
- IEEE Transactions on Power Electronics Prize Paper Award (fi…
- ICCM Best Paper Award (Gold Medal)
- Finalist for Best Paper Prize for Young Researchers in Conti…