Jacqueline M. Hughes-Oliver
Professor · North Carolina State University · Plant and Microbial Biology
Active 1990–2026
Selected publications
Frontiers in Veterinary Science · 2026-05-07
article · Open access
Introduction: Flunixin meglumine is a non-steroidal anti-inflammatory drug (NSAID) commonly used extra-label in goats, necessitating the determination of an extended withdrawal interval (WDI) to minimize the risk of violative residues at slaughter. Current U.S. Food and Drug Administration (FDA) guidance estimates WDIs using univariate ordinary least squares (OLS) regression applied to concentrations at or above the limit of detection (LOD), defining the WDI as the time at which the upper bound of the 95% confidence interval for the 99% quantile falls below a specified tolerance. However, residue concentrations measured across multiple tissues from the same animal may be correlated, and excluding observations below the LOD may distort estimates by removing information from the terminal depletion phase.
Materials and methods: We propose a multivariate linear regression (MvLR) framework that jointly models inter-tissue dependence while accommodating left-censored observations. Regression parameters are estimated using OLS and generalized least squares (GLS) under the uncensored MvLR model, and via an expectation conditional-maximization (ECM) algorithm under a censored MvLR formulation. Withdrawal intervals are computed using the multivariate t-distribution to obtain the upper limit of the 95% confidence interval for the 99% quantile across tissues. The methods are illustrated using tissue-residue data from 20 Boer goats administered flunixin meglumine at 2.2 mg/kg, with five animals euthanized at each of four post-treatment time points (24, 48, 72, and 96 h), and are further evaluated in a simulation study.
Results: The simulation results indicate that the ECM-based censored MvLR approach yields stable parameter estimation and reliable WDI inference in the presence of censoring. Applying this framework to the goat residue data suggests that a withdrawal interval of at least 10 days is recommended to ensure that residues across all tissues fall below conservative safety thresholds.
Discussion: These findings suggest that a multivariate censored modeling framework can improve WDI estimation by accounting for inter-tissue correlation and incorporating observations below the LOD, addressing key limitations of current univariate FDA-style approaches.
Journal of Chemical Information and Modeling · 2025-09-11 · 38 citations
review · Open access
Machine Learning (ML) methods that relate molecular structure to properties are frequently proposed as in silico surrogates for expensive or time-consuming experiments. In small molecule drug discovery, such methods inform high-stakes decisions like compound synthesis and in vivo studies. This application lies at the intersection of multiple scientific disciplines. When comparing new ML methods to baseline or state-of-the-art approaches, statistically rigorous method comparison protocols and domain-appropriate performance metrics are essential to ensure replicability and ultimately the adoption of ML in small molecule drug discovery. This paper proposes a set of guidelines to incentivize rigorous and domain-appropriate techniques for method comparison tailored to small molecule property modeling. These guidelines, accompanied by annotated examples using open-source software tools, lay a foundation for robust ML benchmarking and thus the development of more impactful methods.
ChemRxiv · 2024-11-04 · 4 citations
preprint · Open access
Machine Learning (ML) methods that relate molecular structure to properties are frequently proposed as in-silico surrogates for expensive or time-consuming experiments. In small molecule drug discovery, such methods inform high-stakes decisions like compound synthesis and in-vivo studies. This application lies at the intersection of multiple scientific disciplines. When comparing new ML methods to baseline or state-of-the-art approaches, statistically rigorous method comparison protocols and domain-appropriate performance metrics are essential to ensure replicability and ultimately the adoption of ML in small molecule drug discovery. This paper proposes a set of guidelines to incentivize rigorous and domain-appropriate techniques for method comparison tailored to small molecule property modeling. These guidelines, accompanied by annotated examples and open-source software tools, lay a foundation for robust ML benchmarking and thus the development of more impactful methods.
ChemRxiv · 2024-11-06 · 11 citations
preprint · Open access
Machine Learning (ML) methods that relate molecular structure to properties are frequently proposed as in-silico surrogates for expensive or time-consuming experiments. In small molecule drug discovery, such methods inform high-stakes decisions like compound synthesis and in-vivo studies. This application lies at the intersection of multiple scientific disciplines. When comparing new ML methods to baseline or state-of-the-art approaches, statistically rigorous method comparison protocols and domain-appropriate performance metrics are essential to ensure replicability and ultimately the adoption of ML in small molecule drug discovery. This paper proposes a set of guidelines to incentivize rigorous and domain-appropriate techniques for method comparison tailored to small molecule property modeling. These guidelines, accompanied by annotated examples and open-source software tools, lay a foundation for robust ML benchmarking and thus the development of more impactful methods.
Confidence Bands and Hypothesis Tests for Hit Enrichment Curves
Research Square · 2022-02-21
preprint · Open access · Senior author
In virtual screening for drug discovery, hit enrichment curves are widely used to assess the performance of ranking algorithms with regard to their ability to identify early enrichment. Unfortunately, researchers almost never consider the uncertainty associated with estimating such curves before declaring differences between performance of competing algorithms. Appropriate inference is complicated by two sources of correlation that are often overlooked: correlation across different testing fractions within a single algorithm, and correlation between competing algorithms. Additionally, researchers are often interested in making comparisons along the entire curve, not only at a few testing fractions. We develop inferential procedures to address both the needs of those interested in a few testing fractions, as well as those interested in the entire curve. For the former, four hypothesis testing and (pointwise) confidence interval procedures are investigated, and a newly developed EmProc approach is found to be most effective. For inference along entire curves, EmProc-based confidence bands are recommended for simultaneous coverage and minimal width. While we focus on the hit enrichment curve, this work is also appropriate for lift curves that are used throughout the machine learning community. Our inferential procedures trivially extend to enrichment factors, as well.
Confidence bands and hypothesis tests for hit enrichment curves
Journal of Cheminformatics · 2022-07-28 · 3 citations
article · Open access · Senior author
In virtual screening for drug discovery, hit enrichment curves are widely used to assess the performance of ranking algorithms with regard to their ability to identify early enrichment. Unfortunately, researchers almost never consider the uncertainty associated with estimating such curves before declaring differences between performance of competing algorithms. Uncertainty is often large because the testing fractions of interest to researchers are small. Appropriate inference is complicated by two sources of correlation that are often overlooked: correlation across different testing fractions within a single algorithm, and correlation between competing algorithms. Additionally, researchers are often interested in making comparisons along the entire curve, not only at a few testing fractions. We develop inferential procedures to address both the needs of those interested in a few testing fractions, as well as those interested in the entire curve. For the former, four hypothesis testing and (pointwise) confidence interval procedures are investigated, and a newly developed EmProc approach is found to be most effective. For inference along entire curves, EmProc-based confidence bands are recommended for simultaneous coverage and minimal width. While we focus on the hit enrichment curve, this work is also appropriate for lift curves that are used throughout the machine learning community. Our inferential procedures trivially extend to enrichment factors, as well.
Leadership and Diversity in Statistics: Great Initiatives by Faculty Advocates
2021-01-01
book-chapter · Senior author
2020-09-07
article · 1st author · Corresponding
arXiv (Cornell University) · 2019-12-19
preprint · Open access · Senior author
In virtual screening for drug discovery, recall curves are used to assess the performance of ranking algorithms, in which recall is a function of the fraction of data prioritized for experimental testing. Unfortunately, researchers almost never consider the uncertainty in the estimation of the recall curve when benchmarking algorithms. We confirm that a recently developed procedure for estimating pointwise confidence intervals for recall curves -- and closely related variants, such as precision curves -- can be applied to a variety of simulated data sets representative of those typically encountered in virtual screening. Since it is more desirable in benchmarks to present the uncertainty of performance over a range of testing fractions, we extend the pointwise confidence interval procedure to allow for the estimation of confidence bands for these curves. We also present hypothesis test methods to determine significant differences between the curves for competing algorithms. We show these methods have high power to detect significant differences at a range of small fractions typically tested, while maintaining control of type I error rate. These methods enable statistically rigorous comparisons of virtual screening algorithms using a metric that quantifies the aspect of performance that is of primary interest.
Assessment of Prediction Algorithms for Ranking Objects
Notices of the American Mathematical Society · 2019-01-16 · 2 citations
article · Open access · 1st author · Corresponding
Prediction algorithms are everywhere. Based on known or observable features of a set of objects, these algorithms provide informed guesses about unknown or unobserved outcomes of the same set of objects. For this article, interest is limited to a single unknown outcome that can take one of two states: positive or negative. In the context of credit scoring, positive may indicate that an applicant for a loan is deemed likely to default on the loan. In the context of screening biomarkers, positive may indicate that a biomarker is strongly associated with a particular disease. In the context of information retrieval, positive may indicate that the returned item is relevant for the request. And in the context of drug discovery in the pharmaceutical industry, positive may indicate that a compound has scored well in a biological assay and has been cleared for follow-up testing as an active or hit compound. For these and