Whitney Ringwald

Verified

University of Minnesota · Psychology

Active 2018–2026

h-index 13

Citations 538

Papers 81 · 70 last 5y

Funding—

Faculty page


About

Whitney Ringwald is an Assistant Professor of Psychology at the University of Minnesota's College of Liberal Arts. Her research bridges clinical and personality science to understand what people do from day to day that allows them to thrive, and what causes and maintains their problems. She focuses on individual differences in how people manage daily life. These differences are reflected in personality and psychopathology, which represent normal and problematic ways of meeting needs. She conceptualizes them as characteristic transactions between a person and their environment, including the situations a person typically encounters and how they respond. To study these processes, Ringwald uses ambulatory assessment methods, such as self-report surveys and smartphone sensor data, that sample real-life behavior and contexts intensively and repeatedly. She applies a range of quantitative methods to capture key dynamics, aiming to define individual differences in terms of actual daily behaviors. Her goal is to provide actionable targets for personalized clinical interventions.

Research topics

Psychology
Clinical psychology
Developmental psychology
Social psychology
Cognitive psychology

Selected publications

Structure of Current Psychopathology and its Associations with Daily Life Experiences using the HiTOP-SR in a Mixed Clinical/Community Sample
2026-02-26
article
The Hierarchical Taxonomy of Psychopathology (HiTOP) is a dimensional nosological system that addresses key limitations with categorical frameworks, including heterogeneity, boundary, and comorbidity issues. The HiTOP consortium recently developed a new self-report instrument, the HiTOP-Self-Report Measure (HiTOP-SR), designed to operationalize the HiTOP model for use in research and clinical practice. In a set of preregistered analyses with a sample of clinical/community participants (75% female, 81% white), we explored the hierarchical structure of the HiTOP-SR scales using exploratory factor analysis (n = 637) and examined their associations with behaviors and experiences assessed in daily life (n = 531), such as affect, stress, impulsivity, energy, sleep quality, and social interactions. Findings indicate a nine-factor model, closely aligned with the HiTOP’s current structure, best represented the measure. The hierarchical structure of the HiTOP-SR generally converges with the HiTOP model, with several key departures, particularly for historically understudied constructs. Furthermore, the HiTOP-SR facet scales and domains associated with individual differences in daily behavior and experiences as anticipated, highlighting the construct validity and the potential clinical utility of this new measure. Our results have implications not only for the structure, validity, and clinical utility of the HiTOP-SR but also raise broader questions about the underlying nature of psychopathology as represented by the HiTOP.
Publisher DOI
Relaxed Efficient Acquisition of Context and Temporal Features
ArXiv.org · 2026-03-11
article · Open access
In many biomedical applications, measurements are not freely available at inference time: each laboratory test, imaging modality, or assessment incurs financial cost, time burden, or patient risk. Longitudinal active feature acquisition (LAFA) seeks to optimize predictive performance under such constraints by adaptively selecting measurements over time, yet the problem remains inherently challenging due to temporally coupled decisions (missed early measurements cannot be revisited, and acquisition choices influence all downstream predictions). Moreover, real-world clinical workflows typically begin with an initial onboarding phase, during which relatively stable contextual descriptors (e.g., demographics or baseline characteristics) are collected once and subsequently condition longitudinal decision-making. Despite its practical importance, the efficient selection of onboarding context has not been studied jointly with temporally adaptive acquisition. We therefore propose REACT (Relaxed Efficient Acquisition of Context and Temporal features), an end-to-end differentiable framework that simultaneously optimizes (i) selection of onboarding contextual descriptors and (ii) adaptive feature–time acquisition plans for longitudinal measurements under cost constraints. REACT employs a Gumbel–Sigmoid relaxation with straight-through estimation to enable gradient-based optimization over discrete acquisition masks, allowing direct backpropagation from prediction loss and acquisition cost. Across real-world longitudinal health and behavioral datasets, REACT achieves improved predictive performance at lower acquisition costs compared to existing longitudinal acquisition baselines, demonstrating the benefit of modeling onboarding and temporally coupled acquisition within a unified optimization framework.
Publisher OA PDF
Large language models for depression assessment from brief daily diaries
2026-01-06
article · 1st author · Corresponding
Importance: Self-report measures have limitations for intensive longitudinal assessment of depression. Scoring depression from brief natural language samples with large language models (LLMs) may be a solution to these limitations, but this method has not yet been rigorously evaluated. Objective: Test convergent, construct, and incremental validity of depression ratings made by LLMs from daily video diaries. Design: Cross-sectional study conducted from 2016-2018, including a baseline clinical interview and 14-21 days of daily self-report surveys and video diaries. Setting: Participants were recruited from a clinical research registry and the community via flyers. Participants: Volunteer sample selected for a range of psychopathology and mental health treatment status (45% currently in outpatient treatment). Main outcomes and measures: Daily depression scores estimated by LLMs from video diary transcripts; daily depression, affect, stress exposure rated by self-reports; depression severity assessed by clinical interview. Results: Among the 108 participants included in the analysis, 53% were female and the mean age was 28 (SD = 6.1). LLM ratings of depression from open-ended narratives converged with self-reports of depression within-person, across days (r = .45; 95% CI, .40-.49) and when averaged across days for an overall depression score (r = .61; 95% CI, .43-.75). Averaged LLM depression ratings also correlated with major depressive disorder symptom severity ascertained by clinical interview (r = .53; 95% CI, .35-.68). Daily depression rated by LLM and self-report had similar profiles of associations with daily functioning variables (profile r = .98 within-person; r = .94 between-person). Multivariable regression results showed LLM depression ratings associated with daily functioning variables (partial βs = |.12| - |.28|) and interview-based depression ratings (partial β = .33; 95% CI, .07-.58) over and above self-reports. Conclusions and relevance: Results support validity of LLMs to measure day-to-day changes in depression and overall levels of depression from brief, open-ended daily audio diaries. These findings establish a strong foundation for a low-burden depression assessment approach that satisfies the unique demands of intensive longitudinal monitoring and addresses critical shortcomings of self-report surveys.
Publisher DOI
From Word Sequences to Behavioral Sequences: Adapting Modeling and Evaluation Paradigms for Longitudinal NLP
ArXiv.org · 2026-01-12
article · Open access
While NLP typically treats documents as independent and unordered samples, in longitudinal studies, this assumption rarely holds: documents are nested within authors and ordered in time, forming person-indexed, time-ordered behavioral sequences. Here, we demonstrate the need for and propose a longitudinal modeling and evaluation paradigm that consequently updates four parts of the NLP pipeline: (1) evaluation splits aligned to generalization over people (cross-sectional) and/or time (prospective); (2) accuracy metrics separating between-person differences from within-person dynamics; (3) sequence inputs to incorporate history by default; and (4) model internals that support different coarseness of latent state over histories (pooled summaries, explicit dynamics, or interaction-based models). We demonstrate the issues that arise from the traditional pipeline and our proposed improvements on a dataset of 17k daily diary transcripts paired with PTSD symptom severity from 238 participants, finding that traditional document-level evaluation can yield substantially different and sometimes reversed conclusions compared to our ecologically valid modeling and evaluation. We tie our results to a broader discussion motivating a shift from word-sequence evaluation toward behavior-sequence paradigms for NLP.
Publisher OA PDF
Large language models for depression assessment from brief daily diaries
PsyArXiv (OSF Preprints) · 2026-01-06
preprint
Publisher
Assessing Personality Using Zero-Shot Generative AI Scoring of Brief Open-Ended Text
PsyArXiv (OSF Preprints) · 2026-02-02
preprint · Open access
Contemporary personality assessment relies heavily on psychometric scales, which offer efficiency but risk oversimplifying the rich and contextual nature of personality. Recognizing these limitations, this study explores the use of commercially available generative large language models (LLMs), such as ChatGPT and Claude, to assess personality traits from open-ended qualitative narratives. Across two distinct samples and methodologies (spontaneous streams of thought and daily video diaries), we used generative LLMs to score Big-Five personality traits, achieving convergence with self-report measures comparable to or exceeding established benchmarks (e.g., self-other agreement, ecological momentary assessment, bespoke machine-learning models). LLM-generated trait scores also demonstrated predictive validity regarding daily behaviors and mental health outcomes. This LLM-based approach achieved quantitative rigor based on qualitative data and is easily accessible without specialized training. Importantly, our findings also reaffirm the ubiquity of personality expression, in that it is carried in the stream of our thoughts and is woven into the fabric of our daily lives. These results encourage broader adoption of generative LLMs for psychological assessment and, given the new generation of tools, stress the value of idiographic narratives as reliable sources of psychological insight.
Publisher
Affective variability prospectively predicts higher affective well-being, but only when people feel low.
Emotion · 2026-01-08
article · Open access
[…] at the day and year level. In addition, this association was significantly moderated by initial levels of affective well-being and by neuroticism, although the evidence for the latter was limited. These findings highlight the importance of distinguishing between within-person processes and between-person differences: Experiencing greater affective variability relative to others may indicate a lower level of overall affective well-being. At the same time, experiencing greater affective variability when feeling lower than usual may signal the potential for improvement in one's affective experience.
Publisher DOI
Structure of current psychopathology and its associations with daily life experiences using the Hierarchical Taxonomy of Psychopathology Self-Report (HiTOP-SR) in a mixed clinical/community sample.
Psychological Assessment · 2026-02-26
article · Open access
Publisher DOI
Relaxed Efficient Acquisition of Context and Temporal Features
arXiv (Cornell University) · 2026-03-11
preprint · Open access
Publisher DOI
Empathy-RSA Project
OSF Preprints · 2026-04-26
other · Senior author
Publisher

Frequent coauthors

Aidan G.C. Wright
84 shared
Aleksandra Kaurin
University of Wuppertal
21 shared
Mario Wenzel
Johannes Gutenberg University Mainz
17 shared
Paul A. Pilkonis
University of Pittsburgh
12 shared
Elizabeth A. Edershile
11 shared
Stephen B. Manuck
University of Pittsburgh
9 shared
Colin Vize
9 shared
William C. Woods
University of Minnesota
8 shared
