João Sedoc

· Assistant Professor of Technology, Operations, and Statistics · Verified

New York University · Technology, Operations, and Statistics Department

Active 2016–2026

h-index: 20

Citations: 1.6k

Papers: 152 (108 in the last 5 years)

Funding—

About

The page provides information about the New York University Stern Center for Research Computing (SCRC), which is devoted to providing world-class computational facilities and services to researchers at the Stern School of Business. These services include a moderately sized Slurm HPC cluster, cloud computing (virtual machines), data acquisition and storage, research software, and access to WRDS (Wharton Research Data Services). The center offers a comprehensive suite of software services designed to facilitate advanced computational research and data analysis, as well as access to datasets from diverse disciplines through collaborations with data repositories, platforms, and academic institutions. Additionally, the center provides a wide range of computing services and resources to support faculty and researchers' projects, along with high-speed, robust, and scalable storage systems to meet diverse computational and storage needs.

Research topics

Computer Science
Natural Language Processing
Artificial Intelligence
Psychology
Linguistics
Medicine
Speech recognition
Nursing
Geology
Internet privacy
World Wide Web

Selected publications

Prompt-Counterfactual Explanations for Generative AI System Behavior
ArXiv.org · 2026-01-06
article · Open access · Senior author
As generative AI systems become integrated into real-world applications, organizations increasingly need to be able to understand and interpret their behavior. In particular, decision-makers need to understand what causes generative AI systems to exhibit specific output characteristics. Within this general topic, this paper examines a key question: what is it about the input -- the prompt -- that causes an LLM-based generative AI system to produce output that exhibits specific characteristics, such as toxicity, negative sentiment, or political bias. To examine this question, we adapt a common technique from the Explainable AI literature: counterfactual explanations. We explain why traditional counterfactual explanations cannot be applied directly to generative AI systems, due to several differences in how generative AI systems function. We then propose a flexible framework that adapts counterfactual explanations to non-deterministic, generative AI systems in scenarios where downstream classifiers can reveal key characteristics of their outputs. Based on this framework, we introduce an algorithm for generating prompt-counterfactual explanations (PCEs). Finally, we demonstrate the production of counterfactual explanations for generative AI systems with three case studies, examining different output characteristics (viz., political leaning, toxicity, and sentiment). The case studies further show that PCEs can streamline prompt engineering to suppress undesirable output characteristics and can enhance red-teaming efforts to uncover additional prompts that elicit undesirable outputs. Ultimately, this work lays a foundation for prompt-focused interpretability in generative AI: a capability that will become indispensable as these models are entrusted with higher-stakes tasks and subject to emerging regulatory requirements for transparency and accountability.
Prompt-Counterfactual Explanations for Generative AI System Behavior
Open MIND · 2026-01-06
preprint · Senior author
What can chatbot conversations reveal about vaccine concerns? An observational topic modelling study for public health infoveillance
BMJ Digital Health & AI · 2026-05-01
article · Open access
Objective: The COVID-19 pandemic was marked by a surge of online information, including misinformation about vaccines. Health agencies recommend infoveillance to track public attitudes on immunisation, particularly during health emergencies. We sought to investigate how chatbots may serve as a novel data source for digital monitoring of vaccine concerns. Chatbots are an increasingly popular two-way health communication tool, and an analysis of anonymised chatbot inputs could identify emerging misinformation and public concerns. Methods and analysis: We used a topic modeller and large language model (LLM)-based few-shot learner to understand the themes and emotional tone of chats users sent to the Vaccine Information Resource Assistant (VIRA), a non-generative chatbot created by Johns Hopkins to answer questions about COVID-19 vaccines that was used or shared by health departments in 12 US states. We employed BERTopic to conduct topic modelling on user textual data and conversations. We also used OpenAI’s LLM, GPT-4o, employing a human-labelled set of chats and instructions to help fine-tune the model and to classify a random subset of 8760 chats sent to VIRA into pro-vaccine, neutral and anti-vaccine. Results: Analysing 30 336 chats users sent to VIRA over a 2-year period, in English and Spanish, we found most focused on vaccine recommendations and safety (71%), with far fewer chats addressing conspiracies and questioning the need for vaccines (3%). Sentiment analysis of a randomly selected subset of chats (n=8760) indicated that English-language chats were more likely than Spanish-language chats to express negative emotions about vaccines (12% vs 4%), but the majority of messages sent by users in either language were classified as neutral.
Conclusion: This observational analysis offers a detailed case study on how anonymised chatbot dialogues may expand on insights from social media for health agencies and others seeking to monitor opinions and work to support strong vaccine confidence. As chatbots become increasingly fluid conversational tools, the findings suggest significant potential for chatbot-facilitated health interventions, both for public engagement as well as understanding.
Front Matter
2025-01-01
paratext · Open access
What If the Prompt Were Different? Counterfactual Explanations for the Characteristics of Generative Outputs
2025-06-12
article · Open access
As generative AI systems become increasingly integrated into real-world applications, the need to analyze and interpret their outputs grows in importance. This paper addresses the challenge of assessing whether generative outputs exhibit specific characteristics, such as toxicity, a certain sentiment, or bias. We borrow a concept from the traditional Explainable AI literature, counterfactual explanations, but argue that it needs to be significantly rethought. We propose a flexible framework that extends counterfactual explanations to non-deterministic generative AI systems, specifically in scenarios where downstream classifiers can reveal characteristics of their outputs.
Knowing When Not to Answer: Lightweight KB-Aligned OOD Detection for Safe RAG
ArXiv.org · 2025-08-04 · 1 citation
preprint · Open access · Senior author
Retrieval-Augmented Generation (RAG) systems are increasingly deployed in high-stakes domains, where safety depends not only on how a system answers, but also on whether a query should be answered given a knowledge base (KB). Out-of-domain (OOD) queries can cause dense retrieval to surface weakly related context and lead the generator to produce fluent but unjustified responses. We study lightweight, KB-aligned OOD detection as an always-on gate for RAG systems. Our approach applies PCA to KB embeddings and scores queries in a compact subspace selected either by explained-variance retention (EVR) or by a separability-driven t-test ranking. We evaluate geometric semantic-search rules and lightweight classifiers across 16 domains, including high-stakes COVID-19 and Substance Use KBs, and stress-test robustness using both LLM-generated attacks and an in-the-wild 4chan attack. We find that low-dimensional detectors achieve competitive OOD performance while being faster, cheaper, and more interpretable than prompted LLM-based judges. Finally, human and LLM-based evaluations show that OOD queries primarily degrade the relevance of RAG outputs, showing the need for efficient external OOD detection to maintain safe, in-scope behavior.
Reasoning and the Trusting Behavior of DeepSeek and GPT: An Experiment Revealing Hidden Fault Lines in Large Language Models
ArXiv.org · 2025-02-18
preprint · Open access
When encountering increasingly frequent performance improvements or cost reductions from a new large language model (LLM), developers of applications leveraging LLMs must decide whether to take advantage of these improvements or stay with older tried-and-tested models. Low perceived switching frictions can lead to choices that do not consider more subtle behavior changes that the transition may induce. Our experiments use a popular game-theoretic behavioral economics model of trust to show stark differences in the trusting behavior of OpenAI's and DeepSeek's models. We highlight a collapse in the economic trust behavior of the o1-mini and o3-mini models as they reconcile profit-maximizing and risk-seeking with future returns from trust, and contrast it with DeepSeek's more sophisticated and profitable trusting behavior that stems from an ability to incorporate deeper concepts like forward planning and theory-of-mind. As LLMs form the basis for high-stakes commercial systems, our results highlight the perils of relying on LLM performance benchmarks that are too narrowly defined and suggest that careful analysis of their hidden fault lines should be part of any organization's AI strategy.
Generative AI Governance
2025-08-06
book-chapter
Given the significant barriers to creating high-quality foundation models (cost of collection of training data, need for access to immense computing power), a small number of primarily closed-source foundation models are establishing leadership in the generative AI market. Applications based on these foundation models are being deployed by a number of firms across multiple sectors.
The Illusion of Empathy: How AI Chatbots Shape Conversation Perception
Proceedings of the AAAI Conference on Artificial Intelligence · 2025-04-11 · 8 citations
article · Open access · Senior author
As AI chatbots increasingly incorporate empathy, understanding user-centered perceptions of chatbot empathy and its impact on conversation quality remains essential yet under-explored. This study examines how chatbot identity and perceived empathy influence users' overall conversation experience. Analyzing 155 conversations from two datasets, we found that while GPT-based chatbots were rated significantly higher in conversational quality, they were consistently perceived as less empathetic than human conversational partners. Empathy ratings from GPT-4o annotations aligned with user ratings, reinforcing the perception of lower empathy in chatbots compared to humans. Our findings underscore the critical role of perceived empathy in shaping conversation quality, revealing that achieving high-quality human-AI interactions requires more than simply embedding empathetic language; it necessitates addressing the nuanced ways users interpret and experience empathy in conversations with chatbots.
Overview of Dialog System Evaluation Track: Dimensionality, Language, Culture and Safety at DSTC 12
ArXiv.org · 2025-09-16
preprint · Open access · Senior author
The rapid advancement of Large Language Models (LLMs) has intensified the need for robust dialogue system evaluation, yet comprehensive assessment remains challenging. Traditional metrics often prove insufficient, and safety considerations are frequently narrowly defined or culturally biased. The DSTC12 Track 1, "Dialog System Evaluation: Dimensionality, Language, Culture and Safety," is part of the ongoing effort to address these critical gaps. The track comprised two subtasks: (1) Dialogue-level, Multi-dimensional Automatic Evaluation Metrics, and (2) Multilingual and Multicultural Safety Detection. For Task 1, focused on 10 dialogue dimensions, a Llama-3-8B baseline achieved the highest average Spearman's correlation (0.1681), indicating substantial room for improvement. In Task 2, while participating teams significantly outperformed a Llama-Guard-3-1B baseline on the multilingual safety subset (top ROC-AUC 0.9648), the baseline proved superior on the cultural subset (0.5126 ROC-AUC), highlighting critical needs in culturally-aware safety. This paper describes the datasets and baselines provided to participants, as well as submission evaluation results for each of the two proposed subtasks.

Frequent coauthors

Lyle Ungar
University of Pennsylvania
60 shared
Reno Kriz
21 shared
Chris Callison-Burch
University of Pennsylvania
21 shared
Seolhwa Lee
15 shared
Alexandra DeLucia
12 shared
Sven Buechel
12 shared
Salvatore Giorgi
University of Pennsylvania
12 shared
Marianna Apidianaki
University of Pennsylvania
11 shared
