Siyu Heng

Assistant Professor of Biostatistics · Verified

New York University · Department of Biostatistics

Active 2016–2025

h-index: 6

Citations: 147

Papers: 56 (53 in last 5 years)

Funding—



About

Siyu Heng, PhD, is an Assistant Professor in the Department of Biostatistics at NYU School of Global Public Health. His research interests encompass methodology and applied research in causal inference, health data science, observational studies, randomized trials, sensitivity analysis, instrumental variables, measurement error, and survey data applications in public health. Dr. Heng has published in prominent journals such as the Journal of the Royal Statistical Society and Physical Review, and has received multiple awards including the IPUMS Global Health Research Award, the Lawrence D. Brown Best Paper Award, and the Wellcome Trust Data Reuse Prize. He earned his PhD in applied mathematics and computational science from the University of Pennsylvania and holds a BA in statistics from Nanjing University.

Research topics

Medicine
Computer Science
Biology
Econometrics
Internal medicine
Cardiology
Mathematical economics
Mathematics
Statistics
Surgery

Selected publications

RIIM: Randomization-Based Inference Under Inexact Matching
2025-03-12
dataset · Open access · Senior author
Randomization-based inference for average treatment effects in potentially inexactly matched observational studies. It implements the inverse post-matching probability weighting framework proposed by the authors. The post-matching probability calculation follows the approach of Pimentel and Huang (2024) <doi:10.1093/jrsssb/qkae033>. The optimal full matching method is based on Hansen (2004) <doi:10.1198/106186006X137047>. The variance estimator extends the method proposed in Fogarty (2018) <doi:10.1111/rssb.12290> from the perfect randomization settings to the potentially inexact matching case. Comparisons are made with conventional methods, as described in Rosenbaum (2002) <doi:10.1007/978-1-4757-3692-2>, Fogarty (2018) <doi:10.1111/rssb.12290>, and Kang et al. (2016) <doi:10.1214/15-aoas894>.
A Non-Bipartite Matching Framework for Difference-in-Differences with General Treatment Types
ArXiv.org · 2025-11-26
preprint · Open access · 1st author · Corresponding
Difference-in-differences (DID) is one of the most widely used causal inference frameworks in observational studies. However, most existing DID methods are designed for binary treatments and cannot be readily applied to non-binary treatment settings. Although recent work has begun to extend DID to non-binary (e.g., continuous) treatments, these approaches typically require strong additional assumptions, including parametric outcome models or the presence of idealized comparison units with (nearly) static treatment levels over time (commonly called "stayers" or "quasi-stayers"). In this technical note, we introduce a new non-bipartite matching framework for DID that naturally accommodates general treatment types (e.g., binary, ordinal, or continuous). Our framework makes three main contributions. First, we develop an optimal non-bipartite matching design for DID that jointly balances baseline covariates across comparable units (reducing bias) and maximizes contrasts in treatment trajectories over time (improving efficiency). Second, we establish a post-matching randomization condition, the design-based counterpart to the traditional parallel-trends assumption, which enables valid design-based inference. Third, we introduce the sample average DID ratio, a finite-population-valid and fully nonparametric causal estimand applicable to arbitrary treatment types. Our design-based approach preserves the full treatment-dose information, avoids parametric assumptions, does not rely on the existence of stayers or quasi-stayers, and operates entirely within a finite-population framework, without appealing to hypothetical super-populations or outcome distributions.
Peri-Rolandic and Occipital Sparing Cortical Edema: A Prevalent MRI Finding in Pediatric Patients with Cerebral Malaria
American Journal of Neuroradiology · 2025-07-30
article · Open access
BACKGROUND AND PURPOSE: Cerebral malaria is a leading cause of childhood mortality and neurological morbidity in sub-Saharan Africa and South Asia, and a strong association between diffuse brain swelling and mortality has been well established. Our goal was to characterize patterns of cortical edema on brain MRI in children with cerebral malaria and determine their association with patient outcomes. MATERIALS AND METHODS: We retrospectively reviewed admission brain MR images obtained from Malawian children with clinical cerebral malaria admitted at a single center from 2013 to 2019. Two neuroradiologists assessed the pattern of cortical edema on T1-, T2-, and diffusion-weighted images using a consensus approach. The overall degree of brain volume (brain volume score) and other brain imaging findings were also assessed, including focal signal changes in the basal ganglia, white matter, and posterior fossa. We evaluated the frequency and associations of these imaging findings with clinical outcomes at hospital discharge (deceased, alive with neurological sequelae, or alive without neurological sequelae). RESULTS: We included admission brain MRI scans from 190 children with clinical cerebral malaria. Cortical edema was identified in 163 MRIs. The predominant pattern of cortical edema was diffuse cortical involvement with relative sparing of the occipital and peri-Rolandic areas: 103 (63.2%) had this pattern, whereas 37 (22.7%) had sparing of the occipital cortex only and 23 (14.1%) had generalized cortical edema without focal sparing. The presence of occipital and peri-Rolandic sparing inversely correlated with brain volume score (β=-0.26, p<0.001) and outcomes (OR [95% CI]: 0.3 [0.1-0.6], p=0.002). CONCLUSIONS: Pediatric cerebral malaria is associated with a typical pattern of cortical edema that relatively spares the occipital and peri-Rolandic areas, which become progressively involved with more severe disease.
ABBREVIATIONS: CM = Cerebral Malaria; BVS = Brain Volume Score; DWI = Diffusion-Weighted Imaging; PRES = Posterior Reversible Encephalopathy Syndrome.
Effects of behavioral intervention components to increase COVID-19 testing for African American/Black and Latine frontline essential workers not up-to-date on COVID-19 vaccination: Results of an optimization randomized controlled trial
Journal of Behavioral Medicine · 2025-04-16 · 1 citation
article · Open access
Neutralizing and binding antibodies are a correlate of risk of COVID-19 in the CoVPN 3008 study in people with HIV
UNC Libraries · 2025-10-16
article · Open access
People with HIV (PWH) are understudied in COVID-19 vaccine trials, leaving knowledge gaps on whether the identified immune correlates of protection also hold in PWH. CoVPN 3008 (NCT05168813) enrolled predominantly PWH and reported lower COVID-19 incidence for a Hybrid vs. Vaccine Group (baseline SARS-CoV-2-positive and one mRNA-1273 dose vs. negative and two doses). Using case-cohort sampling, antibody markers at enrolment (M0) and four weeks post-final vaccination (Peak) are assessed as immune correlates of COVID-19. For the Hybrid Group [n = 287 (195 PWH)], all M0 markers inversely correlate with COVID-19 through 230 days post-Peak, with 50% inhibitory dilution BA.4/5 neutralizing antibody titer (nAb-ID50 BA.4/5) the strongest and only independent correlate (HR per 10-fold increase=0.46, 95% CI 0.28, 0.75; P = 0.002). For the Vaccine Group [n = 115 (86 PWH)], Peak nAb-ID50 BA.4/5 correlates with reduced COVID-19 risk (1.9%, 1.1%, and 0.3% at titers 10, 100, and 1000 AU/ml) through 92, but not 165, days post-Peak. Using multivariable Cox analysis of binding and nAb, nAb titers predict COVID-19 in PWH. Two doses of a 100-µg Ancestral strain mRNA vaccine in baseline-SARS-CoV-2-negative individuals elicit sufficient cross-reacting Omicron antibodies to reduce COVID-19 incidence for 90 days post-Peak, but viral evolution and waning antibodies abrogate this protection thereafter.
Parent-daughter agreement about HPV vaccination status in Kenya and Malawi
Vaccine · 2025-03-28
article · Open access · Senior author
BACKGROUND: As more countries introduce the HPV vaccine, it is important to understand the validity of vaccination measures. This is especially true in low- and middle-income countries (LMICs) where public health monitoring of vaccination data may have delays or gaps, so alternative measurement approaches are often necessary. Parental report is a common approach for measuring routine childhood vaccination, but it has not been evaluated for HPV vaccination in LMICs. METHODS: We conducted household surveys in Kenya (n = 146) and Malawi (n = 98) with parents/guardians and their daughters who were age-eligible for HPV vaccination. We compared parents'/guardians' reports of HPV vaccination status to daughters' reports; the latter was assumed to be the "gold standard" measure. RESULTS: 88% of Kenyan parents/guardians and 82% of Malawian parents/guardians agreed with their daughters' reported HPV vaccination status. It was more common for parents/guardians to under-report (i.e., to say their daughter was unvaccinated but the girl said she had received dose(s)) than the inverse. Agreement with one's daughter was higher among parents/guardians who reported data from vaccination cards versus using recall, and among parents/guardians who expressed more versus less confidence in their knowledge. We did not find many differences in accuracy of report by parent/guardian characteristics, although in Kenya there were small and statistically significant negative associations with parental age, household income, and more girls in the household (the latter was also significantly negatively associated with report accuracy in Malawi). CONCLUSIONS: In countries where surveys will commonly be used to measure HPV vaccination status, we found very high agreement of parents/guardians with their daughters' reported receipt of the vaccine. These results are similar to findings from the literature about routine childhood vaccination measurement. This suggests that researchers, clinicians, and practitioners can use parent/guardian-reported HPV vaccination of their daughter as a relatively good proxy of her own reported immunization status especially in settings without universal use of vaccination cards or registries.
Determining vaccine responders in the presence of baseline immunity using single-cell assays and paired control samples
ArXiv.org · 2025-07-08
preprint · Open access
A key objective in vaccine studies is to evaluate vaccine-induced immunogenicity and determine whether participants have mounted a response to the vaccine. Cellular immune responses are essential for assessing vaccine-induced immunogenicity, and single-cell assays, such as intracellular cytokine staining (ICS), are commonly employed to profile individual immune cell phenotypes and the cytokines they produce after stimulation. In this article, we introduce a novel statistical framework for identifying vaccine responders using ICS data collected before and after vaccination. This framework incorporates paired control data to account for potential unintended variations between assay runs, such as batch effects, that could lead to misclassification of participants as vaccine responders. To formally use paired control data to account for assay variation across different time points (i.e., before and after vaccination), our proposed framework calculates and reports two p-values, both adjusting for paired control data but in distinct ways: (i) the maximally adjusted p-value, which applies the most conservative adjustment to the unadjusted p-value, ensuring validity over all plausible batch effects consistent with the paired control samples' data, and (ii) the minimally adjusted p-value, which imposes only the minimal adjustment to the unadjusted p-value, such that the adjusted p-value cannot be falsified by the paired control samples' data. We apply this framework to analyze ICS data collected at baseline and 4 weeks post-vaccination from the COVID-19 Prevention Network (CoVPN) 3008 study. Our analysis helps address two clinical questions: 1) which participants exhibited evidence of an incident Omicron infection, and 2) which participants showed vaccine-induced T cell responses against the Omicron BA.4/5 Spike protein.
Bridging the gap between design and analysis: randomization inference and sensitivity analysis for matched observational studies with treatment doses
Biometrics · 2025-10-08 · 1 citation
article · Senior author
Matching is a commonly used causal inference study design in observational studies. Through matching on measured confounders between different treatment groups, valid randomization inferences can be conducted under the no unmeasured confounding assumption, and sensitivity analysis can be further performed to assess robustness of results to potential unmeasured confounding. However, for many common matched designs, there is still a lack of valid downstream randomization inference and sensitivity analysis methods. Specifically, in matched observational studies with treatment doses (e.g., continuous or ordinal treatments), with the exception of some special cases such as pair matching, there is no existing randomization inference or sensitivity analysis method for studying analogs of the sample average treatment effect (i.e., Neyman-type weak nulls), and no existing valid sensitivity analysis approach for testing the sharp null of no treatment effect for any subject (i.e., Fisher's sharp null) when the outcome is nonbinary. To fill these important gaps, we propose new methods for randomization inference and sensitivity analysis that can work for general matched designs with treatment doses, applicable to general types of outcome variables (e.g., binary, ordinal, or continuous), and cover both Fisher's sharp null and Neyman-type weak nulls. We illustrate our methods via comprehensive simulation studies and a real data application. All the proposed methods have been incorporated into the R package doseSens.
Sensitivity Analysis for Binary Outcome Misclassification in Randomization Tests via Integer Programming
Journal of Computational and Graphical Statistics · 2025-02-04 · 1 citation
article · Open access · 1st author · Corresponding
Conducting a randomization test is a common method for testing causal null hypotheses in randomized experiments. The popularity of randomization tests is largely because their statistical validity only depends on the randomization design, and no distributional or modeling assumption on the outcome variable is needed. However, randomization tests may still suffer from other sources of bias, among which outcome misclassification is a significant one. We propose a model-free and finite-population sensitivity analysis approach for binary outcome misclassification in randomization tests. A central quantity in our framework is "warning accuracy," defined as the threshold such that a randomization test result based on the measured outcomes may differ from that based on the true outcomes if the outcome measurement accuracy did not surpass that threshold. We show how learning the warning accuracy and related concepts can amplify analyses of randomization tests subject to outcome misclassification without adding additional assumptions. We show that the warning accuracy can be computed efficiently for large data sets by adaptively reformulating a large-scale integer program with respect to the randomization design. We apply the proposed approach to the Prostate Cancer Prevention Trial (PCPT). We also developed an open-source R package for implementation of our approach.
nbpInference: Inference on Average Treatment Effects for Continuous Treatments
2025-10-17
dataset · Open access
Conduct inference on the sample average treatment effect for a matched (observational) dataset with a continuous treatment. Equipped with calipered non-bipartite matching, bias-corrected sample average treatment effect estimation, and covariate-adjusted variance estimation. Matching, estimation, and inference methods are described in Frazier, Heng and Zhou (2024) <doi:10.48550/arXiv.2409.11701>.

Frequent coauthors

Dylan S. Small
72 shared
Sameer K. Deshpande
51 shared
Shuchi Anand
51 shared
Yuzhou Lin
51 shared
Bo Zhang
Central South University
34 shared
Bo Zhang
Shanghai Eye Disease Prevention & Treatment Center
15 shared
Ting Ye
Shanghai Electric (China)
11 shared
Wendy Prudhomme O’Meara
Duke University
10 shared

Awards & honors

IPUMS Global Health Research Award for the Best Student Pape…
IMS Hannan Graduate Student Travel Award (2021)
ASA Mental Health Statistics Section Student Paper Award (20…
ENAR Distinguished Student Paper Award (2021)
Wellcome Trust Data Reuse Prize: Malaria (2019)
