Marie Davidian
Professor · Verified · North Carolina State University · Plant and Microbial Biology
Active 1982–2026
About
Marie Davidian is the J. Stuart Hunter Distinguished Professor and the Director of Faculty Grants in the Department of Statistics at NC State University. She earned her Ph.D. in Statistics from the University of North Carolina at Chapel Hill in 1987. Her areas of expertise include biostatistics, longitudinal analysis, and groups administration. She is recognized for her contributions to statistical research and education, particularly in biostatistics and longitudinal data analysis. In her professional role she leads faculty initiatives and contributes to the advancement of statistical science within her department and the broader academic community.
Research topics
- Internal medicine
- Medicine
- Family medicine
- Nursing
- Oncology
- Psychiatry
- Physical therapy
- Biology
Selected publications
Bayesian adaptive randomization in the I-SPY2 sequential multiple assignment randomized trial
Biometrics · 2026-03-28
preprint · Open access · Senior author
I-SPY2 is a long-running phase 2 platform trial that evaluates neoadjuvant treatments for locally advanced breast cancer to identify those with high efficacy that are likely to be successful in phase 3 trials, assigning patients to novel agents using response-adaptive randomization (RAR). Recently, I-SPY2 was reconfigured as a sequential multiple assignment randomized trial (SMART), with up to three stages of therapy. At the first stage, a patient is assigned to a tumor-subtype-specific therapy. If the patient fails to show a satisfactory response, the patient is assigned to a second subtype-specific therapy, and receives a third, rescue therapy if response is still not achieved. The I-SPY2 SMART thus supports identification of highly efficacious entire treatment regimes. The transition of I-SPY2 to a SMART required development of a RAR scheme that updates randomization probabilities at each stage, aligned with the goal of maximizing the number of patients who achieve a pathological complete response (pCR). We present our Bayesian RAR approach, which updates randomization probabilities based on the posterior probability that treatments are part of the optimal regime. Empirical studies demonstrate that it results in more patients having treatment experience consistent with highly efficacious regimes, improves overall within-trial pCR rates, and identifies optimal regimes post trial at rates similar to or exceeding those under simple, uniform, nonadaptive randomization.
Epilepsia · 2026-01-24
article · Open access · Senior author
OBJECTIVE: Nonadherence to antiseizure medications in pediatric epilepsy affects ~60% of youth. The aim of this multisite two-stage sequential, multiple assignment, randomized trial (SMART) was to evaluate the effectiveness of mHealth intervention strategies for improving adherence in caregivers of young children with epilepsy, particularly those who are underserved. It was hypothesized that participants initially randomized to treatment would exhibit significantly greater improvements in adherence compared to an active control group at the end of stage 1. Secondary outcomes included longitudinal adherence, seizure freedom/severity, and health-related quality of life. METHODS: Participants across four epilepsy centers were recruited (n = 461, mean age = 7.6 ± 3.0 years, 51% males, 62% White non-Hispanic, 71% underserved). Baseline questionnaires were completed, and electronic adherence monitors were provided. Three intervention strategies were embedded in this SMART: (1) active control (mHealth education + automated digital reminders), (2) treatment (mHealth education + automated digital reminders + individualized adherence feedback) in stage 1 continuing in stage 2 regardless of responsiveness in stage 1, and (3) treatment in stage 1 continuing in stage 2 if patient is responsive in stage 1 or augmented by problem-solving if patient is nonresponsive in stage 1. Active intervention was 5 months, with two posttreatment follow-ups. Standard analysis of covariance was conducted for the primary aim. Secondary analyses compared mean change from baseline associated with each intervention strategy. RESULTS: The treatment group had significantly better mean adherence percentage change from baseline to the end of stage 1 compared to the active control group (13.2% vs. 3.1% change, t = 2.82, p = .005, d = .37). Adherence rates declined over time across all SMART strategies. An increased probability of seizure freedom at 6 and 12 months across all SMART strategies was found. SIGNIFICANCE: This pediatric epilepsy SMART trial demonstrated adherence improvements for the treatment group that did not persist over time. Improvements in seizure freedom that were independent of SMART strategy were also identified. mHealth education, automated digital reminders, and individualized adherence feedback appear to have been efficacious.
Machine Learning Prediction of Disease Trajectories for Children with Juvenile Idiopathic Arthritis
medRxiv · 2026-04-20
article
Background: Despite advances in therapy, optimal management of juvenile idiopathic arthritis (JIA) remains challenging. The ability to predict disease progression in JIA can improve personalized treatment decisions, but few reliable clinical predictors have been identified. We developed machine learning approaches to predict disease trajectories in children with JIA. Methods: Using data from the Childhood Arthritis and Rheumatology Research Alliance (CARRA) Registry (years 2015-2024), we developed machine learning models to predict attainment of inactive disease in children with non-systemic JIA. We applied Dynamic Bayesian Networks (DBN) to model temporal dependencies and causal relationships, and Convolutional Neural Networks (CNN) to capture complex non-linear patterns. Model input included demographic factors, longitudinal clinical factors, and medication use in the preceding 12 months. Findings: A total of 8,093 participants were included. When tested on an independent test cohort, both DBN (AUC:0.76; precision:0.73; recall:0.83; F1-score:0.78; accuracy:0.71) and CNN (AUC:0.76; precision:0.71; recall:0.63; F1-score:0.67; accuracy:0.70) models achieved comparable performance in predicting inactive disease. Disease activity levels in the preceding 12 months, presence of enthesitis and uveitis were the strongest predictors. Causal relationships captured in the DBN model revealed suboptimal care patterns, likely shaped by insurance constraints and a predominantly reactive approach to JIA management. Interpretation: Our study demonstrates that machine learning approaches can predict disease trajectories in JIA with good discriminative performance. Unlike prior studies that predict outcomes at single timepoints, our models are the first to predict inactive disease longitudinally. However, suboptimal care patterns in retrospective data limit models’ capacity to learn treatment-outcome relationships, underscoring critical opportunities to improve JIA care and the need for prospective comparative studies to better inform prediction models. Funding: Patient-Centered Outcomes Research Institute (PCORI) Award (ME-2022C2-25573-IC).
RESEARCH IN CONTEXT. Evidence before this study: Numerous studies have sought to identify clinical predictors of JIA progression and outcomes. However, few reliable predictors have emerged and existing prediction models demonstrate limited performance. As a result, our ability to personalize treatment decisions based on individual risk of severe disease course remains limited. Added value of this study: We developed novel machine learning models that predict individualized disease trajectories in children with polyarticular and oligoarticular JIA using data from their preceding 12-month clinical course. These models demonstrated strong discriminative performance and outperformed previously published machine learning approaches in JIA. Unlike prior studies limited to single time-point predictions, our models are the first to predict inactive disease longitudinally, enabling a patient-specific projection of disease progression over time. Importantly, our findings also bring to light patterns of suboptimal care, likely driven by insurance constraints and a reactive treatment paradigm, underscoring critical opportunities to improve JIA management. Implications of all the available evidence: Our models have the potential to support clinical decision-making by enabling early identification of children with JIA at risk for unfavorable disease trajectories. In addition, the suboptimal care patterns and systems-level barriers identified through our analyses highlight priority areas for quality improvement initiatives and policy interventions to reduce gaps in JIA care delivery.
Nonlinear Mixed Effects Models
International Encyclopedia of Statistical Science · 2025-01-01
book-chapter · 1st author · Corresponding
Independent Increments and Group Sequential Tests
Statistics in Medicine · 2025-11-01 · 1 citation
article · Open access · Senior author · Corresponding
Widely used methods and software for group sequential tests of a null hypothesis of no treatment difference that allow for early stopping of a clinical trial depend primarily on the fact that sequentially-computed test statistics have the independent increments property. However, there are many practical situations where the sequentially-computed test statistics do not possess this property. Key examples are in trials where the primary outcome is a time to an event but where the assumption of proportional hazards is likely violated, motivating consideration of treatment effects such as the difference in restricted mean survival time or the use of approaches that are alternatives to the familiar logrank test, in which case the associated test statistics may not possess independent increments. We show that, regardless of the covariance structure of sequentially-computed test statistics, one can always derive linear combinations of these test statistics sequentially that do have the independent increments property. We also describe how to best choose these linear combinations to target specific alternative hypotheses, such as proportional or non-proportional hazards or log odds alternatives. We thus derive new, sequentially-computed test statistics that not only have the independent increments property, supporting straightforward use of existing methods and software, but that also have greater power against target alternative hypotheses than do procedures based on the original test statistics, regardless of whether or not the original statistics have the independent increments property. We illustrate with two examples.
Independent increments and group sequential tests
ArXiv.org · 2025-06-18
preprint · Open access · Senior author
Widely used methods and software for group sequential tests of a null hypothesis of no treatment difference that allow for early stopping of a clinical trial depend primarily on the fact that sequentially-computed test statistics have the independent increments property. However, there are many practical situations where the sequentially-computed test statistics do not possess this property. Key examples are in trials where the primary outcome is a time to an event but where the assumption of proportional hazards is likely violated, motivating consideration of treatment effects such as the difference in restricted mean survival time or the use of approaches that are alternatives to the familiar logrank test, in which case the associated test statistics may not possess independent increments. We show that, regardless of the covariance structure of sequentially-computed test statistics, one can always derive linear combinations of these test statistics sequentially that do have the independent increments property. We also describe how to best choose these linear combinations to target specific alternative hypotheses, such as proportional or non-proportional hazards or log odds alternatives. We thus derive new, sequentially-computed test statistics that not only have the independent increments property, supporting straightforward use of existing methods and software, but that also have greater power against target alternative hypotheses than do procedures based on the original test statistics, regardless of whether or not the original statistics have the independent increments property. We illustrate with two examples.
Optimal treatment strategies for prioritized outcomes
arXiv (Cornell University) · 2024-07-08
preprint · Open access
Dynamic treatment regimes formalize precision medicine as a sequence of decision rules, one for each stage of clinical intervention, that map current patient information to a recommended intervention. Optimal regimes are typically defined as maximizing some functional of a scalar outcome's distribution, e.g., the distribution's mean or median. However, in many clinical applications, there are multiple outcomes of interest. We consider the problem of estimating an optimal regime when there are multiple outcomes that are ordered by priority but which cannot be readily combined by domain experts into a meaningful single scalar outcome. We propose a definition of optimality in this setting and show that an optimal regime with respect to this definition leads to maximal mean utility under a large class of utility functions. Furthermore, we use inverse reinforcement learning to identify a composite outcome that most closely aligns with our definition within a pre-specified class. Simulation experiments and an application to data from a sequential multiple assignment randomized trial (SMART) on HIV/STI prevention illustrate the usefulness of the proposed approach.
Biostatistics · 2024-02-09 · 2 citations
article · Open access
Clinicians and patients must make treatment decisions at a series of key decision points throughout disease progression. A dynamic treatment regime is a set of sequential decision rules that return treatment decisions based on accumulating patient information, like that commonly found in electronic medical record (EMR) data. When applied to a patient population, an optimal treatment regime leads to the most favorable outcome on average. Identifying optimal treatment regimes that maximize residual life is especially desirable for patients with life-threatening diseases such as sepsis, a complex medical condition that involves severe infections with organ dysfunction. We introduce the residual life value estimator (ReLiVE), an estimator for the expected value of cumulative restricted residual life under a fixed treatment regime. Building on ReLiVE, we present a method for estimating an optimal treatment regime that maximizes expected cumulative restricted residual life. Our proposed method, ReLiVE-Q, conducts estimation via the backward induction algorithm Q-learning. We illustrate the utility of ReLiVE-Q in simulation studies, and we apply ReLiVE-Q to estimate an optimal treatment regime for septic patients in the intensive care unit using EMR data from the Multiparameter Intelligent Monitoring Intensive Care database. Ultimately, we demonstrate that ReLiVE-Q leverages accumulating patient information to estimate personalized treatment regimes that optimize a clinically meaningful function of residual life.
Nature Medicine · 2024-09-14 · 9 citations
article · Open access
arXiv (Cornell University) · 2024-03-25
preprint · Open access · Senior author
The sequential multiple assignment randomized trial (SMART) is the ideal study design for the evaluation of multistage treatment regimes, which comprise sequential decision rules that recommend treatments for a patient at each of a series of decision points based on their evolving characteristics. A common goal is to compare the set of so-called embedded regimes represented in the design on the basis of a primary outcome of interest. In the study of chronic diseases and disorders, this outcome is often a time to an event, and a goal is to compare the distributions of the time-to-event outcome associated with each regime in the set. We present a general statistical framework in which we develop a logrank-type test for comparison of the survival distributions associated with regimes within a specified set based on the data from a SMART with an arbitrary number of stages that allows incorporation of covariate information to enhance efficiency and can also be used with data from an observational study. The framework provides clarification of the assumptions required to yield a principled test procedure, and the proposed test subsumes or offers an improved alternative to existing methods. We demonstrate performance of the methods in a suite of simulation studies. The methods are applied to a SMART in patients with acute promyelocytic leukemia.
Recent grants
Integrated Biostatistical Training for CVD Research
NIH · $3.9M · 2006–2027
NIH · $375k · 2000
Statistical Methods for Cancer Clinical Trials
NIH · $37.5M · 2010–2021
NIH · $3.1M · 2016
Frequent coauthors
- 141 shared
Anastasios A. Tsiatis
North Carolina State University
- 85 shared
Robert A. Harrington
Cornell University
- 49 shared
Paul W. Armstrong
Canadian VIGOUR Centre
- 49 shared
Neal S. Kleiman
Methodist Hospital
- 49 shared
Wei-Ching Chang
University of Alberta
- 49 shared
Jeffrey Griffin
Christ Hospital
- 49 shared
Vic Hasselblad
- 49 shared
Karen S. Pieper
Clinical Research Institute
Labs
Department of Statistics · PI
Education
- 1987
PhD, Statistics
University of North Carolina at Chapel Hill
Awards & honors
- 2011-2012, Dr. D. D. Mason Award