Rebecca Passonneau
Verified · Pennsylvania State University · Social Data Analytics
Active 1971–2026
About
Rebecca Passonneau is a Professor of Computer Science & Engineering and a Graduate Faculty member in Social Data Analytics at Pennsylvania State University, where she is also a C-SoDA Faculty Affiliate. Her research applies computational methods to the analysis of social data, as part of the interdisciplinary work of the Department of Computer Science & Engineering, the Social Data Analytics program, and the Center for Social Data Analytics at Penn State.
Research topics
- Computer science
- Artificial intelligence
- Natural language processing
- Information retrieval
- Linguistics
Selected publications
Robust Persona-Aware Toxicity Detection with Prompt Optimization and Learned Ensembling
ArXiv.org · 2026-01-05
article · Open access
Toxicity detection is inherently subjective, shaped by the diverse perspectives and social priors of different demographic groups. While "pluralistic" modeling as used in economics and the social sciences aims to capture perspective differences across contexts, current Large Language Model (LLM) prompting techniques have different results across different personas and base models. In this work, we conduct a systematic evaluation of persona-aware toxicity detection, showing that no single prompting method, including our proposed automated prompt optimization strategy, uniformly dominates across all model-persona pairs. To exploit complementary errors, we explore ensembling four prompting variants and propose a lightweight meta-ensemble: an SVM over the 4-bit vector of prompt predictions. Our results demonstrate that the proposed SVM ensemble consistently outperforms individual prompting methods and traditional majority-voting techniques, achieving the strongest overall performance across diverse personas. This work provides one of the first systematic comparisons of persona-conditioned prompting for toxicity detection and offers a robust method for pluralistic evaluation in subjective NLP tasks.
Do Methods to Jailbreak and Defend LLMs Generalize Across Languages?
ArXiv.org · 2025-11-01
preprint · Open access
Large language models (LLMs) undergo safety alignment after training and tuning, yet recent work shows that safety can be bypassed through jailbreak attacks. While many jailbreaks and defenses exist, their cross-lingual generalization remains underexplored. This paper presents the first systematic multilingual evaluation of jailbreaks and defenses across ten languages (spanning high-, medium-, and low-resource languages) using six LLMs on HarmBench and AdvBench. We assess two jailbreak types: logical-expression-based and adversarial-prompt-based. For both types, attack success and defense robustness vary across languages: high-resource languages are safer under standard queries but more vulnerable to adversarial ones. Simple defenses can be effective, but are language- and model-dependent. These findings call for language-aware and cross-lingual safety benchmarks for LLMs.
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
2025-01-01
article · Open access
Vipul Gupta, Candace Ross, David Pantoja, Rebecca J. Passonneau, Megan Ung, Adina Williams. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2025.
The Role of Teacher Framing in Shaping Student Agency in Human-AI Partnered Science Classrooms
Proceedings. · 2025-06-10
article · Open access
This study is part of a larger research project aimed at developing and implementing an NLP-enabled AI feedback tool called PyrEval to support middle school students' science explanation writing. We explored how human-AI integrated classrooms can invite students to harness AI tools while still being agentic learners. Building on theory of new materialism with posthumanist perspectives, we examined teacher framing to see how the nature of PyrEval was communicated, thereby orienting students to partner with or rely on PyrEval. We analyzed one teacher's talk in multiple classrooms as well as that of students in small groups. We found student agency was fostered through teacher framing of (a) PyrEval as a non-neutral actor and a co-investigator and (b) students' participation as an author and their understanding of the nature of PyrEval as core task and purpose. Findings and implications are discussed.
Research questions
Our overall inquiry was to explore if and how human-AI integrated classrooms invited students to harness AI tools while still being agentic learners. Ultimately, we are interested in how human-AI partnered classrooms can be designed to help students see themselves as agents with voice and choices. Our research questions were:
1. What meta-communicative signals does the teacher use to frame the nature of AI and students' engagement with AI, and why?
2. How does the framed nature of AI relate to students' sensemaking of AI and agency in their activities?
Concept-based Rubrics Improve LLM Formative Assessment and Data Synthesis
ArXiv.org · 2025-04-04
preprint · Open access · Senior author
Formative assessment in STEM topics aims to promote student learning by identifying students' current understanding, thus targeting how to promote further learning. Previous studies suggest that the assessment performance of current generative large language models (LLMs) on constructed responses to open-ended questions is significantly lower than that of supervised classifiers trained on high-quality labeled data. However, we demonstrate that concept-based rubrics can significantly enhance LLM performance, which narrows the gap between LLMs as off-the-shelf assessment tools, and smaller supervised models, which need large amounts of training data. For datasets where concept-based rubrics allow LLMs to achieve strong performance, we show that the concept-based rubrics help the same LLMs generate high quality synthetic data for training lightweight, high-performance supervised models. Our experiments span diverse STEM student response datasets with labels of varying quality, including a new real-world dataset that contains some AI-assisted responses, which introduces additional considerations.
Something Just Like TRuST : Toxicity Recognition of Span and Target
ArXiv.org · 2025-06-02
preprint · Open access · Senior author
Toxic language includes content that is offensive, abusive, or that promotes harm. Progress in preventing toxic output from large language models (LLMs) is hampered by inconsistent definitions of toxicity. We introduce TRuST, a large-scale dataset that unifies and expands prior resources through a carefully synthesized definition of toxicity, and corresponding annotation scheme. It consists of ~300k annotations, with high-quality human annotation on ~11k. To ensure high quality, we designed a rigorous, multi-stage human annotation process, and evaluated the diversity of the annotators. Then we benchmarked state-of-the-art LLMs and pre-trained models on three tasks: toxicity detection, identification of the target group, and of toxic words. Our results indicate that fine-tuned PLMs outperform LLMs on the three tasks, and that current reasoning models do not reliably improve performance. TRuST constitutes one of the most comprehensive resources for evaluating and mitigating LLM toxicity, and other research in socially-aware and safer language technologies.
British Journal of Educational Technology · 2025-05-24 · 3 citations
article
Abstract
As use of artificial intelligence (AI) has increased, concerns about AI bias and discrimination have been growing. This paper discusses an application called PyrEval in which natural language processing (NLP) was used to automate assessment and provide feedback on middle school science writing without linguistic discrimination. Linguistic discrimination in this study was operationalized as unfair assessment of scientific essays based on writing features that are not considered normative such as subject-verb disagreement. Such unfair assessment is especially problematic when the purpose of assessment is not assessing English writing but rather assessing the content of scientific explanations. PyrEval was implemented in middle school science classrooms. Students explained their roller coaster design by stating relationships among such science concepts as potential energy, kinetic energy and law of conservation of energy. Initial and revised versions of scientific essays written by 307 eighth-grade students were analyzed. Our manual and NLP assessment comparison analysis showed that PyrEval did not penalize student essays that contained non-normative writing features. Repeated measures ANOVAs and GLMM analysis results revealed that essay quality significantly improved from initial to revised essays after receiving the NLP feedback, regardless of non-normative writing features. Findings and implications are discussed.
Practitioner notes
What is already known about this topic
- Advancement in AI has created a variety of opportunities in education, including automated assessment, but AI is not bias-free.
- Automated writing assessment designed to improve students' scientific explanations has been studied.
- While limited, some studies reported biased performance of automated writing assessment tools, but without looking into actual linguistic features about which the tools may have discriminated.
What this paper adds
- This study conducted an actual examination of non-normative linguistic features in essays written by middle school students to uncover how our NLP tool called PyrEval worked to assess them.
- PyrEval did not penalize essays containing non-normative linguistic features.
- Regardless of non-normative linguistic features, students' essay quality scores significantly improved from initial to revised essays after receiving feedback from PyrEval.
- Essay quality improvement was observed regardless of students' prior knowledge, school district and teacher variables.
Implications for practice and/or policy
- This paper inspires practitioners to attend to linguistic discrimination (re)produced by AI.
- This paper offers possibilities of using PyrEval as a reflection tool, to which human assessors compare their assessment and discover implicit bias against non-normative linguistic features.
- PyrEval is available for use on github.com/psunlpgroup/PyrEvalv2.
Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet
ArXiv.org · 2025-02-07
preprint · Open access · Senior author
Large language models (LLMs) have become ubiquitous, thus it is important to understand their risks and limitations. Smaller LLMs can be deployed where compute resources are constrained, such as edge devices, but with different propensity to generate harmful output. Mitigation of LLM harm typically depends on annotating the harmfulness of LLM output, which is expensive to collect from humans. This work studies two questions: How do smaller LLMs rank regarding generation of harmful content? How well can larger LLMs annotate harmfulness? We prompt three small LLMs to elicit harmful content of various types, such as discriminatory language, offensive content, privacy invasion, or negative influence, and collect human rankings of their outputs. Then, we evaluate three state-of-the-art large LLMs on their ability to annotate the harmfulness of these responses. We find that the smaller models differ with respect to harmfulness. We also find that large LLMs show low to moderate agreement with humans. These findings underline the need for further work on harm mitigation in LLMs.
Factors Influencing Students' Perceptions of Automated Feedback and Their Impact on Revision
Proceedings. · 2025-06-10 · 1 citation
article · Open access
Automated feedback can provide students with timely information about their writing, but students' willingness to engage meaningfully with the feedback to revise their writing may be influenced by their perceptions of its usefulness. We explored the factors that may have influenced 339 eighth-grade students' perceptions of receiving automated feedback on their writing and whether their perceptions impacted their revisions and writing improvement. Using HLM and logistic regression analyses, we found that: 1) students with more positive perceptions of the automated feedback made revisions that resulted in significant improvements in their writing, and 2) students who received feedback indicating they included more important ideas in their essays had significantly higher perceptions of the usefulness of the feedback, but were significantly less likely to engage in substantive revisions. Implications and the importance of helping students evaluate and reflect on the feedback to make substantive revisions, no matter their initial feedback, are discussed.
Recent grants
NSF · $23k · 2016–2018
RUI: CRI: CI-ADDO-EN: Collaborative Research: MASC: A Community Resource For and By the People
NSF · $86k · 2011–2014
EAGER: Collaborative Research: Automated Instruction Assistant for Argumentative Essays
NSF · $173k · 2018–2021
Frequent coauthors
- 25 shared
Susan L. Epstein
City University of New York
- 22 shared
Patricia Davies
Michigan State University
- 21 shared
Smaranda Muresan
- 18 shared
Yanjun Gao
Prince Mohammad bin Fahd University
- 17 shared
Boyi Xie
- 15 shared
Tiziana Ligorio
City University of New York
- 14 shared
Kathleen McKeown
- 12 shared
Axinia Radeva
Columbia University
Labs
Social Data Analytics · PI