Lucas Pinto

Assistant Professor of Neurobiology · Assistant Professor of Neuroscience Institute · Committee on Computational Neuroscience · Committee on Neurobiology · Verified

University of Chicago · Pharmacology

Active 2010–2025

h-index: 30

Citations: 4.6k

Papers: 247 (198 last 5y)

Funding—



About

Welcome to the Pinto Lab at The University of Chicago! We are part of the Department of Neurobiology. We want to understand how neural circuits across many brain areas interact to support decision making. In particular, how are these interactions flexibly reconfigured when animals make decisions that use different underlying cognitive processes? To do this we combine high-throughput behavior in virtual reality, optical, electrophysiological and genetic tools to measure and manipulate the dynamics of single neurons and neuronal populations, and computational approaches to understand both the behavior and its relationship to neural activity.

Research topics

Medicine
Internal medicine
Intensive care medicine
Anesthesia
Pathology
Emergency medicine

Selected publications

Consensus recommendations for early critical care management of paediatric HCT and IEC recipients: what comes next?
The Lancet Child & Adolescent Health · 2025-05-14
article · Senior author
Publisher DOI
The 2024 Phoenix Sepsis Score Criteria: Part 1, the Evolution in Definition of Sepsis and Septic Shock
Pediatric Critical Care Medicine · 2025-02-01 · 18 citations
article · Open access
Publisher OA PDF DOI
We Have New Sepsis Criteria for Children…Now What?
Hospital Pediatrics · 2025-06-23 · 1 citation
letter · Open access
Sepsis in children continues to be a major public health problem, with millions of children affected by death and disability every year worldwide.1–5 Defining pediatric sepsis, however, has been problematic. The previous criteria developed by the International Pediatric Sepsis Consensus Conference in 20056 based on systemic inflammatory response syndrome (SIRS) criteria had low specificity and low sensitivity, did not allow for risk stratification in both lower- and higher-resource settings, and were discordant with clinician-based diagnosis of sepsis.7,8 In 2019, the Society of Critical Care Medicine (SCCM) Pediatric Sepsis Definition Task Force was convened to update the pediatric sepsis definition and the operational criteria to diagnose sepsis. The Task Force included a diverse group of experts, including specialists in critical care, emergency medicine, infectious disease, nursing, pharmacists, and informatics from both high- and low-resource settings. For this endeavor, the Task Force performed an international survey of almost 3000 clinicians and conducted a large systematic review of associations between markers of organ dysfunction and outcome in children with infection.9 Informed by the findings from the survey and the systematic review, they applied interpretable machine learning models to pediatric electronic health record (EHR) data: 3.6 million encounters from 10 hospitals in 5 countries. The goal was to operationalize the concept of sepsis as a “suspected infection with life-threatening organ dysfunction”8 by deriving and validating models to discriminate mortality using individual organ dysfunction subscores among children with suspected infection.3 The 2 best-performing models were then evaluated by the Task Force. 
One model, which included variables for cardiovascular, respiratory, neurologic, and coagulation organ dysfunction, was selected via a modified Delphi consensus process given its greater simplicity and lower dependence on laboratory values. This model was then translated into an integer-based score, the Phoenix Sepsis Score (PSS), that was found to have better performance than previous organ-scoring systems at nearly all clinical sites. Again, through a Delphi consensus process, members of the Task Force chose a PSS of 2 or greater as the threshold for sepsis in children with suspected or confirmed infection and sepsis plus at least 1 cardiovascular point to define septic shock. The new Phoenix criteria and score were presented in the 2024 SCCM Annual Congress in Phoenix, Arizona, and published simultaneously in a pair of articles in the Journal of the American Medical Association.5,8
The diagnosis and management of sepsis can occur at any point along the acute care continuum: prehospital, the emergency department, the acute care wards, or the intensive care unit. Importantly, sepsis is not a singular phenomenon that occurs at a single time point. Generally, the course of sepsis and, by extension, its management, can be divided into 2 time periods: prediagnosis and postdiagnosis. In the prediagnosis stage, sepsis must be screened for among patients at risk, and those who achieve a high enough level of suspicion for sepsis (either based on clinician judgment alone or clinician judgment informed by an algorithm or a data-driven tool) should receive empirical therapy and organ support as needed until a diagnosis of sepsis is confirmed or refuted. The management of sepsis should be guided by clinical best practice guidelines, such as the Surviving Sepsis Guidelines for Children,10 including fluid balance, antimicrobial duration, nutrition initiation, blood product administration, vasoactive use, need for extracorporeal therapies, etc.
The Phoenix criteria for sepsis were designed to define the transition between prediagnosis and postdiagnosis and not for use in decision-making in the prediagnosis/screening phase. Developing accurate early screening criteria remains an active area of work and improvement in all acute care settings. The new criteria were designed instead to define the start of a postdiagnosis phase that is more concordant with clinician diagnosis of sepsis. This consistency can be leveraged to implement the best clinical practices for management. The new criteria will also facilitate more consistent patient cohorts for benchmarking, quality improvement, and research in all acute care settings, including the acute care wards, an improvement over SIRS-based criteria. Even with increased consistency in patient identification, sepsis is increasingly being recognized as having potential subtypes that may have different risks or responses to treatments.11 With new diagnostic criteria now in place, an important next focus of study should be the definition of these subtypes of sepsis. Although a central tenet of the SCCM Pediatric Sepsis Task Force was that organ dysfunction is relevant for any form of sepsis, it is important to assess the performance of the Phoenix criteria and PSS in subgroups of infected patients who may differ from the general population of patients with infection and sepsis.
In this edition of Hospital Pediatrics, Tripathi et al present a retrospective cohort study describing the prevalence of sepsis by Phoenix criteria and validating the performance of the PSS to predict mortality in children with suspected or confirmed COVID-19 or Multisystem Inflammatory Syndrome in Children (MIS-C).12 They did this by performing a secondary analysis of the SCCM Viral Infection and Respiratory Illness Universal Study (VIRUS).
COVID-19 and MIS-C are potentially distinct subgroups of infection-related conditions that were not included in the original Phoenix cohort and whose patients can receive different treatments than the general population of patients with sepsis, including immunomodulators and novel antivirals. Despite these potential differences, the Phoenix criteria and the PSS were able to discriminate most of the children with COVID-19 or MIS-C who died during their hospitalization. There was, however, a notable difference between the VIRUS cohort and the original Phoenix data set. The mortality was overall lower in the VIRUS cohort (sepsis 4.6%; septic shock 5.4%) than among the children meeting Phoenix sepsis criteria (for high-resource settings: sepsis 7.1%; septic shock 10.8%), which resulted in a relatively higher area under the receiver operating characteristic curve and lower area under the precision-recall curve (AUPRC) in the VIRUS cohort. Notably, an even lower mortality rate was seen in children with MIS-C (2.5%), despite MIS-C pathophysiology often being associated with significant cardiovascular dysfunction. There are several potential explanations for these findings. First, it is possible that once identified, specific treatments (ie, antivirals, immunomodulators) for COVID-19 and/or MIS-C reduced mortality risk more effectively than typical therapies do for other forms of sepsis and septic shock. Second, it is possible that the VIRUS cohort did not contain enough positive cases to validate the performance of the criteria with confidence. Fewer than 100 events (deaths) occurred in the VIRUS cohort, below the threshold recommended by experts for validation of models and scores.13 Third, the source of the VIRUS data is a registry, rather than directly from EHR systems as in the Phoenix data set. This is important because the PSS was derived and validated using the highest values in the first 24 hours of an encounter.
The current study assumes that the values needed to capture the highest possible score in a 24-hour timeframe have been entered into the registry to mimic data collection appropriately and that the 24 hours entered into the registry were also the first 24 hours of the encounter, as there are no timestamps in the VIRUS registry. Fourth, increased vigilance during the pandemic might have led to faster detection and treatment of COVID-19 and/or MIS-C in children, thereby resulting in lower mortality. Finally, during the pandemic, there was a reduced exposure to other infections, and so a reduced number of coinfections could have made these patients lower risk.
Although recent articles have found that Phoenix criteria had strong test characteristics in children identified with suspected sepsis in the emergency department,14 transported to ICU,15 and hospitalized with cancer,16 this study in a lower-mortality population of children with COVID-19 and MIS-C found it had a lower AUPRC. Nevertheless, this study supports the central tenet that organ dysfunction is relevant for discriminating risk of mortality in any form of sepsis and that the Phoenix criteria and the PSS may be useful tools to assess risk of mortality for children with any type of suspected infection or infection-related condition. This study adds to the important work being done to evaluate the test characteristics of the Phoenix criteria in relevant subpopulations. This includes children who develop sepsis in the inpatient wards.
In a recently published study, we found that in patients who were admitted to the inpatient ward for more than 72 hours prior to pediatric intensive care unit (PICU) admission, had a suspected hospital-acquired infection, and met Phoenix sepsis criteria within 24 hours of PICU admission, mortality was substantially higher than the overall PICU sepsis mortality rate (19.9% vs 9%).17 How we incorporate that information into our practice throughout the spectrum of sepsis care across the acute care continuum remains the next big question ripe for investigation: (1) How will the Phoenix sepsis criteria and the PSS drive interventions, and how does this protocolized sepsis care intersect with other important values like resource stewardship? (2) How will the Phoenix sepsis criteria and PSS integrate into established rapid response systems? (3) How does the score perform in acute care environments when included tests may not be ordered, documentation may be delayed, and new onset sepsis is rare? (4) How will we handle the repeated measures and alert challenges of transitioning from a score that was derived based on the worst value in a 24-hour period to implementations that result in continuously recalculated scores with new EHR data?18 We look forward to the important Phoenix sepsis criteria and PSS work to come.
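The thresholds described in the letter above (a PSS of 2 or greater in a child with suspected or confirmed infection for sepsis, and sepsis plus at least 1 cardiovascular point for septic shock) can be sketched as follows. This is an illustrative sketch only, not the Task Force's implementation: the class and function names are hypothetical, the total is assumed to be the sum of the four organ subscores, and how points are assigned within each organ system is not covered by the text and is left as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhoenixSubscores:
    """Hypothetical container for integer organ-dysfunction points.

    How points are assigned within each organ system is not described
    in the text above, so the points are taken as given inputs.
    """
    respiratory: int
    cardiovascular: int
    coagulation: int
    neurologic: int

    @property
    def total(self) -> int:
        # Assumption: the PSS is the sum of the four organ subscores.
        return (self.respiratory + self.cardiovascular
                + self.coagulation + self.neurologic)


def classify(sub: PhoenixSubscores, suspected_infection: bool) -> str:
    """Apply the thresholds quoted in the letter."""
    # Sepsis requires suspected or confirmed infection and a PSS >= 2.
    if not suspected_infection or sub.total < 2:
        return "criteria not met"
    # Septic shock is sepsis plus at least 1 cardiovascular point.
    if sub.cardiovascular >= 1:
        return "septic shock"
    return "sepsis"


if __name__ == "__main__":
    print(classify(PhoenixSubscores(1, 0, 1, 0), suspected_infection=True))
    print(classify(PhoenixSubscores(1, 1, 0, 0), suspected_infection=True))
```

Taking the subscores as inputs keeps the sketch limited to what the letter states; it should not be used for clinical decisions.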
Publisher OA PDF DOI
External Validation, Re-Calibration, and Extension of a Prediction Model of Early Acute Kidney Injury in Critically Ill Children using Multi-Center Data
medRxiv · 2025-02-06
preprint · Open access · Senior author
Background: Acute kidney injury (AKI) is common among children with critical illness and is associated with high morbidity and mortality. Risk prediction models designed for clinical decision support implementation offer an opportunity to identify and proactively mitigate AKI risks. Existing models have been primarily validated on single-center data, owing partly to the lack of appropriately detailed multicenter datasets.
Objective: To determine the accuracy of a single-center model to predict new AKI at 72 hours of ICU admission across two multicenter datasets and extend this model to improve prediction accuracy while maintaining acceptable alert burden.
Derivation and Validation Cohorts: We separately derived models in two datasets: PEDSNET-VPS, created through the linkage of PEDSnet electronic health record (EHR) extraction with Virtual Pediatric Systems (VPS); and the PICU Data Collaborative dataset, created through EHR extraction and harmonization from eight participating institutions. Derivation datasets comprised a temporal and location-specific split of these datasets (80%), while the holdout test split comprised the remainder (20%).
Prediction Model: We recalibrated an existing single-center model and measured discrimination and accuracy. We then added features guided by precision and recall measures. All features were available at 12 hours of ICU admission. We measured discrimination and accuracy at multiple cut-points and identified the features contributing most to the risk score.
Results: In two datasets comprising 186,540 ICU admissions, we report an incidence of early AKI of 2.2 - 2.7%. Initial recalibration of an existing single-center model demonstrated poor discrimination (AUROC 0.60 - 0.78). Following the addition of new features, we report higher AUROC values of 0.79 - 0.80 and AUPRC values of 0.13 - 0.21 in both datasets. We report accuracy at several cutpoints as well as cross-validate between datasets.
Conclusions: In this first use of two new multicenter datasets, we report improved discrimination and accuracy in a model designed specifically for implementation, balancing sensitivity and precision to predict patients at risk for AKI development.
Publisher OA PDF DOI
Group Peer Mentoring: A Strategy to Promote Career Development and Improve Well-Being Among Early-Career Faculty in Pediatric Critical Care Medicine
Pediatric Critical Care Medicine · 2025-05-15
article · 1st author · Corresponding
Publisher DOI
Management Practices of CAR T-cell-Related Inflammatory Toxicities: A Survey of Pediatric CAR T-cell Providers
Transplantation and Cellular Therapy · 2025-10-01 · 4 citations
article · Open access
Publisher OA PDF DOI
The authors reply:
Pediatric Critical Care Medicine · 2025-07-02 · 1 citation
article · Open access
Publisher OA PDF DOI
Predicting pediatric cardiac arrest outcomes using early quantitative EEG
Resuscitation · 2025-09-25 · 1 citation
article · Open access
Publisher OA PDF DOI
Use of the Area Under the Precision-Recall Curve to Evaluate Prediction Models of Rare Critical Illness Events
Pediatric Critical Care Medicine · 2025-04-29 · 15 citations
article · Open access · Senior author
In the critical care setting, where missed diagnoses and false alarms can have significant consequences, evaluating the performance of a classification (or binary prediction) model must be nuanced, critical, and thorough. Traditionally, the area under the receiver operating characteristic (AUROC) curve has been used as the primary evaluation metric for most models. The receiver operating characteristic (ROC) curve is a plot of a model’s sensitivity (also known as recall or true positive rate) and specificity (true negative rate) at each possible model output threshold (typically a score or probability). The AUROC, thus, provides a global evaluation of a model’s ability to discriminate between the two outcome classes by summarizing the sensitivity–specificity tradeoff over all model output thresholds. However, AUROC may be misinterpreted clinically when the outcome of interest is rare. Such scenarios are referred to as “imbalanced” because of the large ratio of negative to positive (or vice versa) cases. In these situations, a model may have a high AUROC as a result of the robust true negative rate (specificity) (1,2) yet be unable to reliably identify positive cases (Fig. 1). Accordingly, AUROC may not give clinicians the insights they seek when deciding whether a classification model will be clinically useful (3).
Figure 1. Example impact of outcome class imbalance on model predictions. This figure shows how application of model predictions to balanced (A) and imbalanced (B) outcome sets can affect the distribution of true positive (TP) and false positive (FP) and false negative (FN) predictions.
While the true negative (TN) rate (specificity) remains high in the imbalanced set, the ratio of FP predictions to TP predictions has increased leading to a low positive predictive value and a potentially less useful model.
Many events of interest in the critical care setting, including mortality, clinical deterioration, and acute kidney injury, are imbalanced in our datasets because only a minority of our patients suffer these events (< 10–20%). In such situations, the precision-recall (PR) curve and area under the PR curve (AUPRC) offer more clinically relevant and operationally useful measures of performance. The sensitivity (recall) of a model indicates the proportion of all positive cases the model can identify. The positive predictive value (PPV or precision) is the proportion of positive predictions that are correct. Both are intuitive concepts for clinicians. By measuring the tradeoff between sensitivity and PPV, AUPRC aligns better with what we care about in imbalanced problems: reliable identification of a rare event. Further, PR curves can illustrate the operational implications of model deployment in critical care settings. Here, we demonstrate the utility of AUPRC and PR curves when predicting rare critical illness events using simulated pediatric data.
A PREDICTION MODEL EXAMPLE USING SIMULATED CRITICAL CARE DATA
We synthesized clinical data for 200,000 virtual pediatric patients presenting with diabetic ketoacidosis and built prediction models for the outcome of cerebral edema (CE). Input variables, distributions, and CE prevalence were chosen to reflect observational studies (4–6). Virtual patients were randomly divided into training (80%) and test (20%) sets, and logistic regression (LR), random forest (RF), and extreme gradient boosting (XGBoost) models were trained to predict CE.
We calculated AUROC and AUPRC and plotted ROC and PR curves using the pROC and PRROC packages (which use piecewise trapezoidal integration to determine area under the curve). AUROC and AUPRC comparisons and 95% CIs were computed using bootstrapping methods. The R software (Version 4.4.1; Vienna, Austria) we used is described in the Supplement (https://links.lww.com/PCC/C622). The AUROCs for the LR, RF, and XGBoost models were 0.953 (95% CI, 0.939–0.964), 0.874 (0.851–0.897), and 0.947 (0.939–0.964), respectively, while AUPRCs were 0.116 (0.095–0.142), 0.083 (0.068–0.102), and 0.096 (0.082–0.112), respectively (Fig. 2). Importantly, interpretation of AUPRC and each model’s value is best done in relation to the frequency of the event of interest (specifically, by dividing AUPRC by the frequency of that event).
Figure 2. Receiver operator characteristic (ROC) and precision-recall (PR) curves for simulation prediction models. This figure shows ROC (A) and PR (B) curves and associated area under the ROC curve (AUROC) and area under the PR curve (AUPRC) values (with 95% CIs) for three models (logistic regression, random forest, and extreme gradient boosting [XGBoost]) predicting cerebral edema in the simulated population test set. We have reversed the X-axis in A, which is more commonly shown as 1–specificity. PPV = positive predictive value.
Dividing the LR model AUPRC (0.116) by the CE outcome frequency (0.007), we find the LR model is 16.6-times more useful than a random model. Additionally, as observed in the PR curve in Figure 2B, a model threshold chosen to achieve a sensitivity of 0.85–0.90 would potentially result in a PPV (precision) 5–10% higher for the LR and XGBoost models than the RF model. None of this is readily apparent in the ROC curve (Fig. 2A). Finally, although the LR and XGBoost models had similar AUROC, the LR model AUPRC was statistically significantly higher, primarily as a result of its improved PPV at lower sensitivities.
In such a scenario, sole use of AUROC and ROC curves for model selection could lead to selection of a model with a PPV-sensitivity tradeoff that does not provide the clinical utility required by the clinical team.
CAPTURING WHAT MATTERS: SPECIFICITY VS. POSITIVE PREDICTIVE VALUE
High sensitivity is important for prediction of critical illness events, including CE, because failing to identify events could have severe consequences for patients. As both AUROC and AUPRC use sensitivity in their measures, the main difference between AUROC and AUPRC is use of specificity vs. PPV, respectively. In imbalanced datasets, where negative cases are abundant, this difference becomes consequential (2). When the number of true negatives vastly exceeds true positives, a model can have a high specificity (and AUROC) by labeling all cases as negative, but this would have no clinical value. A model that labels most cases as negative may have limited ability to detect the rarer positive cases. In our simulation, all three models have excellent AUROC over 0.85. However, in an application where a sensitivity of 50% is deemed necessary, the resulting PPV will be less than 0.2, indicating less than one in five positive predictions will be correct. While this tradeoff may still be acceptable to the clinician, it is a much more sober assessment of clinical utility than AUROC suggests. Furthermore, a small change in model performance that converts a true positive into a false negative has only a small effect on specificity in imbalanced problems due to the overwhelming number of true negatives that partially mask this drop in performance. PPV will change more markedly in this situation, offering a more transparent measure of performance when some data points have high leverage. As illustrated in Figure 1, a “miss” by the model (a change from a true positive to a false negative) has an immediate, noticeable impact on PPV, without masking from true negatives.
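To make the specificity versus PPV contrast above concrete, the short sketch below computes PPV from sensitivity, specificity, and prevalence using the standard definition PPV = TP / (TP + FP). The function name and the sensitivity and specificity values are hypothetical illustrations and are not taken from the article; only the 0.007 prevalence echoes the CE outcome frequency quoted in the text.

```python
def ppv_from_rates(sensitivity: float, specificity: float,
                   prevalence: float) -> float:
    """Positive predictive value implied by a fixed operating point.

    Works with population fractions rather than counts:
    TP fraction = sensitivity * prevalence
    FP fraction = (1 - specificity) * (1 - prevalence)
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions at this operating point")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Same hypothetical model (sensitivity 0.90, specificity 0.95) applied
    # to a balanced outcome set and to a rare outcome.
    for prevalence in (0.5, 0.007):
        ppv = ppv_from_rates(0.90, 0.95, prevalence)
        print(f"prevalence={prevalence:<6} PPV={ppv:.3f}")
```

With identical sensitivity and specificity, the PPV in this sketch drops from roughly 0.95 in the balanced case to roughly 0.11 when the outcome is rare, which is the pattern the passage describes.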
OPERATIONAL RELEVANCE OF AUPRC IN CRITICAL CARE SETTINGS
When deploying a model to detect or predict critical illness events, the primary goals are often two-fold: minimizing the number of missed positive cases (high sensitivity) and ensuring clinicians are not overwhelmed by false positive alerts. A high false positive rate diminishes clinicians’ trust in an alert and increases the time to respond (7). The PR curve illustrates these priorities effectively, allowing one to gauge, at different sensitivities, how well the model maximizes PPV and minimizes the “number needed to alert” (NNA), defined as 1/PPV (8). The lower the PPV, the higher the number of false positive predictions and the larger the NNA. Both PPV and NNA are intuitive operational metrics for clinicians. NNA reflects the burden of false positives expected on the clinical teams responding to the alerts for each correct prediction, which is crucial in balancing the cost of alert fatigue with the benefit of accurate predictions. By illustrating what PPV is attainable at different sensitivities, clinicians can judge whether a model’s performance aligns with an acceptable threshold for operational use. Returning to our simulation, LR model PPV is greater than 0.25 only at low (< 0.05) sensitivities. Conversely, at sensitivities near 0.90, PPV ranges from 0.15 to 0.2. By examining this curve, the clinician can determine the impact of prioritizing PPV vs. sensitivity and consider the operational implications through discussion of the NNA. A clinician may decide, for example, that a threshold that provides a sensitivity of 0.90 and NNA of 6–7 is or is not acceptable depending on the actions involved (e.g., high-risk therapy vs. more intense monitoring). In summary, inspection of PR curves allows the clinician to evaluate if a prediction model is appropriate for clinical use and enables selection of a probability threshold suitable for the prediction task and clinical environment.
Although inspection of the ROC curve remains a key component of predictive model evaluation, it does not provide the same operational insights as the PR curve because the tradeoff between sensitivity and 1-specificity can be more challenging to translate into real-world clinical scenarios. While AUPRC is advantageous for evaluating models on imbalanced datasets, it is important to recognize the potential challenges. AUPRC can be sensitive to small changes in precision and recall, especially when positive cases are extremely rare. Therefore, it is important to complement AUPRC with visual inspection of PR curves to ensure model and threshold selection match the priorities of the prediction task. Further, it is important to note that a model that consistently outperforms others across all thresholds in ROC space will also outperform these models in PR space.
CONCLUSIONS
AUPRC and the PR curve are informative, practical metrics that should be considered when assessing prediction models for rare clinical events. By prioritizing PPV over specificity, AUPRC aligns with the operational priorities of critical care clinicians, offering a clearer picture of model performance and insight on interpreting positive predictions. While AUROC is an important metric and should be used as one of several measures of discrimination, we recommend investigators working with imbalanced prediction problems in critical care use AUPRC and the PR curve to determine how a model will perform when applied clinically at various thresholds, considering the tradeoff between sensitivity and PPV.
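Two of the calculations described in this article reduce to simple arithmetic: the AUPRC divided by the event frequency (the article's comparison against a random model) and the number needed to alert, defined in the text as 1/PPV. The sketch below reproduces them with the figures quoted above. The function names are hypothetical, and the code is an illustration in Python rather than the authors' R code.

```python
def auprc_lift(auprc: float, prevalence: float) -> float:
    """AUPRC relative to a random model, whose expected AUPRC is
    approximately the event prevalence."""
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    return auprc / prevalence


def number_needed_to_alert(ppv: float) -> float:
    """Alerts raised per correct prediction, defined in the text as 1/PPV."""
    if not 0.0 < ppv <= 1.0:
        raise ValueError("PPV must be in (0, 1]")
    return 1.0 / ppv


if __name__ == "__main__":
    # LR model AUPRC (0.116) and CE outcome frequency (0.007) from the text.
    print(round(auprc_lift(0.116, 0.007), 1))  # 16.6
    # PPV range the text reports at sensitivities near 0.90.
    for ppv in (0.15, 0.20):
        print(f"PPV={ppv:.2f} -> NNA={number_needed_to_alert(ppv):.1f}")
```

A PPV of 0.15 corresponds to an NNA between 6 and 7, in line with the example threshold discussed in the text, while a PPV of 0.20 corresponds to an NNA of 5.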
Publisher DOI
Phoenix Sepsis Score Calculator Mobile Application (Preprint)
2025-06-30
preprint
We developed two novel mobile applications (Android and iOS) and a web-calculator based on the new Phoenix sepsis criteria and the Phoenix Sepsis Score (PSS). We used human-centered design methods to design, implement, and iteratively improve the applications.
Publisher DOI

Frequent coauthors

R. Scott Watson
University of Washington
1380 shared
Niranjan Kissoon
Université de Montréal
1350 shared
Tellen D. Bennett
Children's Hospital Colorado
1346 shared
James L. Wynn
1343 shared
Jerry J. Zimmerman
Seattle Children's Hospital
1338 shared
Luregn J. Schlapbach
University of Queensland
1338 shared
Lauren R. Sorce
Northwestern University
1312 shared
Andrew C. Argent
University of Cape Town
1308 shared

Labs

Pinto Lab (PI)
