About
Xiaofei Wang is a Professor of Biostatistics and Bioinformatics at Duke University and a member of the Duke Cancer Institute. His role involves leading research and education in biostatistics and bioinformatics, contributing to the advancement of statistical methods and computational approaches in biomedical research. His work supports the integration of statistical genetics, genomics, and translational biomedical informatics, aiming to improve understanding and treatment of cancer and other diseases. Based at Duke University, he is actively engaged in academic leadership, research initiatives, and collaborative efforts within the Duke community.
Research topics
- Internal medicine
- Medicine
- Computer Science
- Machine Learning
- Artificial Intelligence
- Oncology
- Psychology
- Psychiatry
- Surgery
Selected publications
2026-01-12
report · Open access
This poster presents a hierarchical control framework for a virtual power plant that leverages behind-the-meter resources for grid services while maintaining customer privacy during setpoint disaggregation. Unlike many existing approaches, the virtual power plant model uses a hierarchical control strategy and an iterative approach to determine the optimal setpoint disaggregation without direct load control while maintaining system-level power flow and voltage constraints. The proposed approach is numerically validated on a synthetic distribution feeder in San Francisco, demonstrating the ability of the framework to provide privacy-preserving virtual power plant services.
Scheduler Modeling of Distributed Energy Resources for Providing Ancillary Services
2026-02-13
report · Open access · 1st author · Corresponding
Distributed energy resources (DERs) have been integral components of modern power systems, and their capability to provide grid services has been widely studied. To promote the deployment of these resources in providing grid services in real-world utility operations, this paper proposes a day-ahead scheduler model for a distribution-system-connected DER plant. A certain amount of the generation capacity of this DER plant is reserved for frequency services, and some ancillary services for the distribution system (including peak load reduction, voltage regulation, and power factor control) are integrated into the model. The model is tested on a real-world distribution system. From the simulation results, the energy and reserve schedule of the solar photovoltaic (PV) unit and battery energy storage system (BESS) can be determined, and voltage and power factor are well maintained. Additionally, to demonstrate the specific characteristics of the co-located and hybrid operation modes for the PV and BESS, these two modes are analyzed both theoretically and through real-time simulation. Simulation results show that most of the PV's variability is transferred to the net power in the co-located mode, whereas it is transferred to the BESS in the hybrid mode. The proposed scheduler model and the comparison of co-located and hybrid modes can provide practical guidance for applying DER plants in real-world utility operations.
Statistics in Medicine · 2025-02-18 · 5 citations
article · Open access
The accessibility of individual participant-level data (IPD) enhances the evaluation of moderation effects of patient covariates. It facilitates accurate estimation of intervention effects and confidence intervals by incorporating covariate correlations across multiple clinical trials. With a time-to-event outcome, Cox regression can be applied for network meta-analysis (NMA) using IPD. However, comprehensive reviews and comparisons of the specifications and assumptions of these Cox models, and of their impact on the interpretation of hazard ratios, effect moderation, and trial heterogeneity in IPD-NMA, are lacking. In this paper, we examine various Cox models for IPD-NMA and compare different approaches to modeling trial, treatment, and covariate effects. We employ multiple graphical tools and statistical tests to assess proportional hazard assumptions and discuss their implications. Additionally, we explore the application of extended Cox models when the proportional hazard assumption is violated. Practical guidance on interpreting and reporting NMA results is provided. A simulation study is conducted to compare the performance of different models. We illustrate the methods for conducting IPD-NMA through a real data example.
Scalable Hybrid Large-Scale dc-ac Grid Analysis Methods (Phase I)
2025-08-01
report · Open access
The goals of the project included the development of characterization methods and tools to evaluate reliability, transient stability, and economics of large-scale dc architectures in ac grids.
Real-Time DER Plant Dispatcher for Network-Wide Grid Support
2025-07-27
article
Distribution-system-connected distributed energy resource (DER) power plants (e.g., solar farms and battery energy storage systems) have gained popularity over the past few years because of short interconnection timings. In addition to being a renewable power source, DER power plants, if controlled effectively, have the opportunity to provide network-wide services to the grid, such as voltage support for a distribution network, maintaining substation power factor, enabling a distribution network to be a virtual power plant, and providing operating reserves to a transmission network. In this paper, we design a real-time DER plant dispatcher that decides the power set points and has the capability to provide the aforementioned grid services. The dispatcher is designed under a primal-dual feedback control framework that integrates its operation with the local power plant controller. It is evaluated on a model of a real-world distribution network, and the results show that it can provide services to the grid that are beyond the capabilities of the currently existing control logic. This dispatcher design and evaluation demonstrate the potential network-wide benefits that could be unlocked from a DER plant when it is controlled intentionally, as opposed to being a burden on the reliability of the grid.
Journal of Biopharmaceutical Statistics · 2025-04-20 · 3 citations
article · Open access · Senior author · Corresponding
Biomarker-guided designs are increasingly used to evaluate personalized treatments based on patients' biomarker status in Phase II and III clinical trials. With adaptive enrichment, these designs can improve the efficiency of evaluating the treatment effect in biomarker-positive patients by increasing their proportion in the randomized trial. While time-to-event outcomes are often used as the primary endpoint to measure treatment effects for a new therapy in severe diseases like cancer and cardiovascular diseases, there is limited research on biomarker-guided adaptive enrichment trials in this context. Such trials almost always adopt hazard ratio methods for statistical measurement of treatment effects. In contrast, restricted mean survival time (RMST) has gained popularity for analyzing time-to-event outcomes because it offers more straightforward interpretations of treatment effects and does not require the proportional hazard assumption. This paper proposes a two-stage biomarker-guided adaptive RMST design with threshold detection and patient enrichment. We develop sophisticated methods for identifying the optimal biomarker threshold and biomarker-positive subgroup, treatment effect estimators, and approaches for type I error rate, power analysis, and sample size calculation. We present a numerical example of re-designing an oncology trial. An extensive simulation study is conducted to evaluate the performance of the proposed design.
EBUS Diagnostic Yield for Sarcoidosis in Hilar vs. Mediastinal Lymph Nodes
medRxiv · 2025-04-21 · 1 citation
preprint · Open access
Background: Pulmonary sarcoidosis is diagnosed by endobronchial ultrasound-guided transbronchial needle aspirate (EBUS-TBNA) of hilar and mediastinal lymph nodes and the finding of non-caseating granulomatous inflammation. There are currently no guidelines about which lymph node stations to sample to optimize the diagnostic yield and it is unclear if there is a difference in the yield between hilar and mediastinal lymph node stations. Methods: A retrospective study was performed to assess the difference in the diagnostic yield of EBUS-TBNA for non-caseating granulomas between hilar and mediastinal lymph nodes. Results: Two hundred twenty-five patients with suspicion of sarcoidosis underwent EBUS-TBNA for evaluation of hilar and mediastinal lymphadenopathy. The yield of EBUS-TBNA for non-caseating granulomas was 61.8% vs. 65.5%, P = 0.46, for hilar and mediastinal lymph nodes, respectively. The sensitivity for sarcoidosis of EBUS-TBNA of hilar vs. mediastinal nodes was 66.9% (95% confidence interval or CI, 58.9%-74.9%) vs. 71.1% (95% CI, 65.3%-76.9%). The specificity for sarcoidosis of EBUS-TBNA of both hilar and mediastinal nodes was 100%. The diagnostic yield for non-caseating granulomas in patients who underwent hilar nodes biopsy only, mediastinal nodes biopsy only, and both hilar and mediastinal nodes biopsy was 71.4%, 67%, and 73.1%, respectively (P = 0.63). In multivariable logistic regression analysis, the diagnostic yield of EBUS-TBNA was only associated with age (OR 0.96; 95% CI 0.94-0.98; P < 0.01). Conclusions: The yield of EBUS-TBNA for non-caseating granulomas in patients with suspected sarcoidosis was similar between the hilar and mediastinal lymph nodes.
ArXiv.org · 2025-03-19
preprint · Open access · Senior author
The heterogeneous treatment effect plays a crucial role in precision medicine. There is evidence that real-world data, even subject to biases, can be employed as supplementary evidence for randomized clinical trials to improve the statistical efficiency of the heterogeneous treatment effect estimation. In this paper, for survival data with right censoring, we consider estimating the heterogeneous treatment effect, defined as the difference of the treatment-specific conditional restricted mean survival times given covariates, by synthesizing evidence from randomized clinical trials and the real-world data with possible biases. We define an omnibus bias function to characterize the effect of biases caused by unmeasured confounders, censoring, and outcome heterogeneity, and further, identify it by combining the trial and real-world data. We propose a penalized sieve method to estimate the heterogeneous treatment effect and the bias function. We further study the theoretical properties of the proposed integrative estimators based on the theory of reproducing kernel Hilbert space and empirical process. The proposed methodology outperforms the approach solely based on the trial data through simulation studies and an integrative analysis of the data from a randomized trial and a real-world registry on early-stage non-small-cell lung cancer.
The Oncologist · 2025-07-10
article · Open access
Medullary thyroid carcinoma is a rare thyroid malignancy derived from parafollicular C cells that is frequently driven by activating mutations in the REarranged during Transfection (RET) proto-oncogene. While most actionable RET mutations are located in the extracellular cysteine-rich or intracellular tyrosine kinase domains, mutations in the transmembrane domain are exceedingly rare and their oncogenic significance remains unclear. We report a case of a 59-year-old male with sporadic medullary thyroid carcinoma harboring a rare RET A641R mutation in the transmembrane domain. The patient experienced multiple locoregional recurrences after four surgical resections. While the companion diagnostic test did not identify RET mutations, comprehensive genomic profiling using a next-generation sequencing panel revealed the RET A641R mutation. Following administration of selpercatinib, a selective RET inhibitor, a rapid biochemical response with decreased serum carcinoembryonic antigen and calcitonin levels was observed, and radiological assessment showed partial response. This is the first report demonstrating the clinical efficacy of selpercatinib in a patient with medullary thyroid carcinoma harboring a RET A641R mutation, supporting the oncogenic potential of this rare variant. This case also emphasizes the importance of comprehensive genomic profiling in identifying rare but actionable RET alterations that are undetectable by targeted sequencing companion diagnostic tests. Selpercatinib may represent an effective therapeutic option for patients with medullary thyroid carcinoma driven by uncommon RET mutations, including mutations in the transmembrane domain.
Integrative analysis of high-dimensional RCT and RWD subject to censoring and hidden confounding
Lifetime Data Analysis · 2025-04-29
article · Open access
In this study, we focus on estimating the heterogeneous treatment effect (HTE) for a survival outcome. The outcome is subject to censoring and the covariates are high-dimensional. We utilize data from both a randomized controlled trial (RCT), considered the gold standard, and real-world data (RWD), possibly affected by hidden confounding factors. To achieve a more efficient HTE estimate, such integrative analysis requires great insight into the data generation mechanism, particularly the accurate characterization of unmeasured confounding effects/bias. With this aim, we propose a penalized-regression-based integrative approach that allows for the simultaneous estimation of parameters, selection of variables, and identification of the existence of unmeasured confounding effects. The consistency, asymptotic normality, and efficiency gains are rigorously established for the proposed estimate. Finally, we apply the proposed method to estimate the HTE of lobar/sublobar resection on the survival of lung cancer patients. The RCT is a multicenter non-inferiority randomized phase 3 trial, and the RWD comes from a clinical oncology cancer registry in the United States. The analysis reveals that unmeasured confounding exists and that the integrative approach does enhance the efficiency of the HTE estimation.
Recent grants
NIH · $1.6M · 2020–2025
NIH · $373k · 2016
NIH · $135.9M · 1997–2030
NIH · $156k · 2010
Frequent coauthors
- Everett E. Vokes (University of Chicago) · 116 shared
- Herbert Pang · 97 shared
- Lin Gu · 95 shared
- Thomas E. Stinchcombe (Duke University) · 90 shared
- Robert A. Kratzke · 83 shared
- David H. Harpole (Durham VA Health Care System) · 72 shared
- Martin J. Edelman (Realistic Education in Action Coalition to Foster Health) · 70 shared
- Lydia Hodgson (Miami Heart Research Institute) · 58 shared
Education
PhD, Biostatistics
University of North Carolina