Kenji Sagae
Professor · University of California, Davis · Linguistics
Active 2001–2026
About
Kenji Sagae is a Professor and Chair of the Department of Linguistics at the University of California, Davis. He holds additional affiliations with the Computer Science graduate program, the Cognitive Science program, and the Computational Social Science designated emphasis. At UC Davis, he runs the Computational Linguistics Laboratory, where recent projects have focused on topics such as toxicity and bias in language models, computational models of syntax, automatic measurement of child language development, and language technology applications in healthcare. Prior to his current position, he was a co-founder of KITT.AI, a startup acquired by Baidu. From 2008 to 2015, he was a Research Assistant Professor at the USC Computer Science Department and a Research Scientist and Project Leader at the USC Institute for Creative Technologies, where he taught Applied Natural Language Processing and supervised a research group on computational models of natural language structure. Before joining USC, he was a member of Tsujii Laboratory at the University of Tokyo, working on combining discriminative dependency parsing with HPSG and applying syntactic parsing in bioinformatics. Kenji Sagae earned his PhD from Carnegie Mellon University in 2006, where his research involved data-driven syntactic parsing and analysis of the development of syntax in child language, under the guidance of advisors Alon Lavie and Brian MacWhinney.
Research topics
- Computer Science
- Natural Language Processing
- Artificial Intelligence
- Linguistics
- Sociology
- Psychology
- Computer Security
- Archaeology
- World Wide Web
- Mathematics
- History
- Data science
- Philosophy
Selected publications
2026-05-03
article · Open access
BACKGROUND: Social media has become a primary infrastructure for vaccine communication. Personal vaccination disclosures, or first-person announcements of vaccination decisions shared publicly online, represent an underexamined channel through which peer influence may shape vaccine attitudes and behavior. Understanding what individuals disclose when they announce vaccination, and how those disclosures evolve over time, offers actionable insights for designing peer-based communication strategies in future health emergencies. OBJECTIVE: This study aimed to identify the themes present in COVID-19 vaccination disclosures on Twitter, examine how those themes changed across four phases of the U.S. vaccine rollout, and determine which themes were associated with greater public engagement. METHODS: Guided by the Disclosure Decision Model (DDM), we analyzed 207,647 vaccination disclosure posts on Twitter from December 14, 2020, to January 1, 2022. Posts were identified using a RoBERTa binary classifier (F1 = 0.85) trained on 6,348 human-coded posts. A mixed deductive-inductive coding approach, grounded in the DDM's motivational goals, yielded 11 disclosure themes (Krippendorff's α = 0.72–0.93). Eleven additional RoBERTa classifiers were trained to label themes across the full dataset. Network analysis examined co-occurrence of themes within each rollout phase. Negative binomial regression assessed the association between disclosure themes and an engagement metric (sum of likes, reposts, and quotes), controlling for follower count, media presence, word count, sentiment, and other post-level factors. RESULTS: Eleven distinct disclosure themes were identified, including community identity (n=31,258; 15.05%), gratitude (n=28,560; 13.75%), and social benefits (n=27,261; 13.13%).
Themes shifted meaningfully across rollout phases: gratitude dominated phase one, community identity rose in phases two and four, and encouraging vaccination became prominent in phase three. Gratitude was associated with a 31% increase in engagement (incidence rate ratio [IRR] = 1.31; 95% CI 1.28-1.34; P<.001), followed by relief (IRR = 1.25; 95% CI 1.20-1.30; P<.001) and community identity (IRR = 1.17; 95% CI 1.14-1.20; P<.001). Political disclosures (IRR = 0.89; 95% CI 0.86-0.93; P<.001) and social benefits disclosures (IRR = 0.88; 95% CI 0.85-0.91; P<.001) were associated with decreased engagement. Each additional theme in a post was associated with a 5.6% increase in engagement (IRR = 1.06; 95% CI 1.05-1.06; P<.001). CONCLUSIONS: Vaccination disclosures on Twitter were thematically patterned, theoretically interpretable via the DDM, and responsive to changing public health contexts. Identity-based and emotionally resonant disclosures, particularly gratitude and community identity, were most strongly associated with public engagement. These findings provide an empirically grounded framework for designing peer-based vaccine communication strategies that leverage voluntary self-disclosure in future immunization campaigns.
JMIR AI · 2025-11-21 · 1 citation
article · Open access
BACKGROUND: Artificial intelligence (AI) chatbots have become prominent tools in health care to enhance health knowledge and promote healthy behaviors across diverse populations. However, factors influencing the perception of AI chatbots and human-AI interaction are largely unknown. OBJECTIVE: This study aimed to identify interaction characteristics associated with the perception of an AI chatbot identity as a human versus an artificial agent, adjusting for sociodemographic status and previous chatbot use in a diverse sample of women. METHODS: This study was a secondary analysis of data from the HeartBot trial in women aged 25 years or older who were recruited through social media from October 2023 to January 2024. The original goal of the HeartBot trial was to evaluate the change in awareness and knowledge of heart attack after interacting with a fully automated AI HeartBot chatbot. All participants interacted with HeartBot once. At the beginning of the conversation, the chatbot introduced itself as HeartBot. However, it did not explicitly indicate that participants would be interacting with an AI system. The perceived chatbot identity (human vs artificial agent), conversation length with HeartBot, message humanness, message effectiveness, and attitude toward AI were measured at the postchatbot survey. Multivariable logistic regression was conducted to explore factors predicting women's perception of a chatbot's identity as a human, adjusting for age, race or ethnicity, education, previous AI chatbot use, message humanness, message effectiveness, and attitude toward AI. RESULTS: Among 92 women (mean age 45.9, SD 11.9; range 26-70 y), the chatbot identity was correctly identified by two-thirds (n=61, 66%) of the sample, while one-third (n=31, 34%) misidentified the chatbot as a human. Over half (n=53, 58%) had previous AI chatbot experience.
On average, participants interacted with the HeartBot for 13.0 (SD 7.8) minutes and entered 82.5 (SD 61.9) words. In multivariable analysis, only message humanness was significantly associated with the perception of chatbot identity as a human compared with an artificial agent (adjusted odds ratio 2.37, 95% CI 1.26-4.48; P=.007). CONCLUSIONS: To the best of our knowledge, this is the first study to explicitly ask participants whether they perceive an interaction as human or from a chatbot (HeartBot) in the health care field. This study's findings (role and importance of message humanness) provide new insights into designing chatbots. However, the current evidence remains preliminary. Future research is warranted to understand the relationship between chatbot identity, message humanness, and health outcomes in a larger-scale study.
Revisiting Orthographic Effects in Spoken Word Recognition: Insights from Pretrained Language Models
2025-10-01
article · Open access · Senior author
The paired lexical decision task is a common task used for studying online speech processing. Several key priming effects underlying the composition of the mental lexicon have been found using this priming paradigm, including (non-exhaustively) orthographic, semantic, and phonological priming effects. This study revisits the effects of orthographic priming in an auditory lexical decision task with English heteronymic pairs: word pairs that share the same orthographic form but have distinct phonological codes. Although heteronymic pairs present an ideal condition for orthographic priming effects to surface, heteronyms are often related to each other semantically, making it difficult to isolate possible orthographic effects from semantic priming effects. To this end, we present a novel methodology for using language models to generate semantically matched prime-target controls against which to compare reaction times. Using these semantically matched controls, we gather reaction time results from a sample of 29 English-speaking university students and conduct Bayesian regression analysis on 153 heteronymic prime-target pairs and 343 control pairs. We find no significant difference in reaction times between heteronymic pairs and semantically matched pairs.
What data should I include in my POS tagging training set?
2025-01-01
article · Open access
JMIR Cardio · 2025-08-31 · 1 citation
article · Open access · Senior author
Background: Heart disease remains a leading cause of death for women in the United States, but awareness and knowledge about it are declining. Artificial intelligence (AI) chatbots have great potential to educate women. Objective: This study aimed to evaluate the potential efficacy of HeartBot to increase women's awareness and knowledge of heart attack symptoms and care-seeking behavior. Methods: In this nonrandomized pilot, quasi-experimental study, 92 women aged ≥25 years without a history of heart disease completed the HeartBot interaction via SMS text messaging. The study was remotely conducted from October 2023 to January 2024. HeartBot, a fully automated AI chatbot, covered 15 topics of heart attack awareness, knowledge, symptoms, and care seeking in a single session. The mean length of the HeartBot interaction was 13.0 (SD 7.80) minutes. The primary outcomes consist of four questions: (1) recognizing signs and symptoms of a heart attack, (2) telling the difference between the signs and symptoms of a heart attack, (3) calling an ambulance or dialing 911 when experiencing heart attack symptoms, and (4) getting to an emergency room within 60 minutes after the onset of symptoms of a heart attack. Women were asked to answer the 4 questions before and after the HeartBot interaction on a scale of 1 to 4, with a higher score indicating higher levels of awareness and knowledge of heart attack risks and symptoms. Results: The mean age of the sample was 45.9 (SD 11.9) years. In total, 59.8% (55/92) of the sample identified as belonging to racial or ethnic minority groups. The mean length of the HeartBot interaction was 13.0 (SD 7.80) minutes.
In ordinal logistic regression models, women showed significant improvements across the 4 self-reported outcomes (ie, heart attack symptoms and calling 911) even after controlling for potential confounding factors (outcome 1: adjusted odds ratio [aOR] 7.10, 95% CI 3.52-13.16; outcome 2: aOR 5.47, 95% CI 2.77-10.78; outcome 3: aOR 5.75, 95% CI 2.86-11.59; and outcome 4: aOR 2.85, 95% CI 1.54-5.25; P<.001 for all 4 outcomes). Conclusions: HeartBot led to a substantial increase in awareness and knowledge of heart attack risks and symptoms in women. These findings suggest that HeartBot is a promising approach to improving heart health education. A randomized controlled trial of HeartBot is warranted to establish its efficacy and safety for the clinical setting.
Journal of Medical Internet Research · 2025-09-22
article · Open access
Background: Artificial intelligence (AI) chatbots, driven by advances in natural language processing, can analyze and generate human language through computational linguistics and machine learning. Despite the rapid development of large language models, little investigation has been conducted to assess whether AI chatbot-delivered educational conversations can achieve a similar level of efficacy as human-delivered conversations. Objective: This study aims to evaluate and explore the potential efficacy of human-delivered conversations versus AI chatbot conversations in increasing women's knowledge and awareness of symptoms and response to a heart attack in the United States. Methods: This is a secondary analysis of 2 datasets collected from the AI Chatbot Development Project. Women aged 25 years or older were recruited through flyers and social media. The first dataset contained conversational data where a research interventionist engaged in educational conversations with participants (human dataset), whereas the second dataset contained conversational data where an AI chatbot named HeartBot engaged in the same educational conversations with participants (HeartBot dataset). Knowledge and awareness of symptoms and response to a heart attack were measured at the pre- and post-interaction with either the human or HeartBot. Perceived message effectiveness and conversational quality were measured at the post-survey. Ordinal logistic regression analyses were conducted to explore factors predicting participants' knowledge, adjusting for age, race or ethnicity, intervention group type, education, word count, message effectiveness, and message humanness. Results: A total of 171 participants (mean age=41.06 y, SD=12.08) in the Human dataset and 92 participants (mean age=45.85 y, SD=11.94) in the HeartBot dataset completed the study.
Both human-delivered conversations and HeartBot conversations were associated with significant improvements in participants' ability to recognize heart attack symptoms (adjusted odds ratio [AOR] 15.19, 95% CI 8.46-27.25, P<.001; AOR 7.18, 95% CI 3.59-14.36, P<.001), differentiate between symptoms (AOR 9.44, 95% CI 5.60-15.91, P<.001; AOR 5.44, 95% CI 2.76-10.74, P<.001), call emergency services (AOR 6.87, 95% CI 4.09-11.55, P<.001; AOR 5.74, 95% CI 2.84-11.60, P<.001), and seek emergency care within 60 minutes of symptom onset (AOR 8.68, 95% CI 4.98-15.15, P<.001; AOR 2.86, 95% CI 1.55-5.28, P<.001), even after adjusting for covariates. Comparing the 2 datasets via interaction tests showed a statistically significant improvement in human-delivered conversations versus HeartBot conversation for all but the calling an ambulance question (P=.09). Conclusions: The study's findings provide new insights into the fully automated AI HeartBot, compared to the human-driven text message conversations, and suggest that it has potential in improving women's knowledge and awareness of heart attack symptoms and appropriate response behaviors. Nevertheless, the current evidence remains preliminary. A randomized controlled trial is warranted to validate this study's findings.
2025-02-26
preprint · Open access
BACKGROUND: Artificial intelligence (AI) chatbots, driven by advances in natural language processing (NLP), can analyze and generate human language through computational linguistics and machine learning. Despite the rapid development of large language models, little investigation has been conducted to assess whether AI chatbot-delivered educational conversation can achieve a similar level of efficacy as human-delivered conversation. OBJECTIVE: To evaluate and compare the potential efficacy of human-delivered conversation versus AI chatbot conversation in increasing women’s knowledge and awareness of symptoms and response to heart attack in the United States. METHODS: This is a secondary analysis of two data sets collected from the AI Chatbot Development Project. Women aged 25 years or older were recruited through flyers and social media. The first dataset contained conversational data where a human researcher engaged in educational conversations with women (Human dataset), whereas the second dataset contained conversational data where an AI chatbot named HeartBot engaged in the same educational conversations with women (HeartBot dataset). Knowledge and awareness of symptoms and response to heart attack were measured at the pre- and post-interaction with either the human or HeartBot. Perceived message effectiveness and conversational quality were measured at the post-survey. Ordinal logistic regression analyses were conducted to explore factors predicting women’s knowledge, adjusting for age, race/ethnicity, interaction group type, education, word count, message effectiveness, and message humanness. RESULTS: A total of 171 women (mean age=41 years, SD=12.08) in the Human dataset and 104 women (mean age=46 years, SD=11.86) in the HeartBot dataset completed the baseline survey.
Both human-delivered conversations and HeartBot conversation significantly improved participants’ ability to recognize heart attack symptoms (AOR 15.19, 95% CI 8.46-27.25, p=0.000000000000000000075; AOR 7.18, 95% CI 3.59-14.36, p=0.000000025), differentiate between symptoms (AOR 9.44, 95% CI 5.60-15.91, p=0.000000000000000034; AOR 5.44, 95% CI 2.76-10.74, p=0.0000011), call emergency services (AOR 6.87, 95% CI 4.09-11.05, p=0.000000000000035; AOR 5.74, 95% CI 2.84-11.60, p=0.0000011), and seek emergency care within 60 minutes of symptom onset (AOR 8.68, 95% CI 4.98-15.15, p=0.000000000000027; AOR 2.86, 95% CI 1.55-5.28, p=0.00078) respectively, even after adjusting for covariates. Comparing the two via interaction tests showed a statistically significant improvement of human-delivered conversation vs. HeartBot conversation for all but the calling an ambulance question (p=0.089). CONCLUSIONS: Our findings indicate that HeartBot holds promise in increasing heart attack knowledge and awareness among women in a cost-effective manner. Future research should employ rigorous experimental designs, such as randomized controlled trials, and evaluate their effectiveness in improving heart health knowledge and subsequent behavior changes.
2025-01-01
article · Open access
To what extent do large language models learn abstract representations as opposed to more superficial aspects of their very large training corpora? We examine this question in the context of binomial ordering preferences involving two conjoined nouns in English. When choosing a binomial ordering (radio and television vs television and radio), humans rely on more than simply the observed frequency of each option. Humans also rely on abstract ordering preferences (e.g., preferences for short words before long words). We investigate whether large language models simply rely on the observed preference in their training data, or whether they are capable of learning the abstract ordering preferences (i.e., abstract representations) that humans rely on. Our results suggest that both smaller and larger models' ordering preferences are driven exclusively by their experience with that item in the training data. Our study provides further insights into differences between how large language models represent and use language and how humans do it, particularly with respect to the use of abstract representations versus observed preferences.
Heart & Lung · 2025-03-01
article · Open access
BACKGROUND: Heart disease is the leading cause of death (LCOD) for women in the United States. However, despite decades of public health campaigns, awareness of heart disease among women, especially those with racial/ethnic minority backgrounds and young women, significantly declined from 2009 to 2019. OBJECTIVES: The aim of this study was to compare the differences in heart disease awareness as the LCOD among Black, Hispanic, White, and Asian/Other women groups. METHODS: In this cross-sectional, online survey study, 422 community-dwelling women were analyzed. Heart disease as the LCOD was categorized as the correct answer. We implemented log-linear models via a Poisson regression to estimate unadjusted and adjusted relative risks [RRs] of race in predicting correct knowledge of LCOD. RESULTS: The mean age was 41.2 (±12.9) years. The sample represents 39.8 % Hispanic, 28.4 % White, 19.9 % Black, 11.9 % Asian/others. After adjusting for age and cardiovascular disease risks, Black and Hispanic women, as compared to White women, had significantly lower awareness of heart disease as the LCOD [(Adjusted RR=0.69, 95 % CI: 0.52, 0.92); (Adjusted RR= 0.78, 95 % CI: 0.78 -0.94), respectively]. Additionally, physical inactivity and hypertension medication intake were significantly associated with this level of awareness (P < 0.5). CONCLUSION: Lower heart disease awareness in Black and Hispanic women persists. It is crucial to develop more effective approaches to close this disparity. Testing new methods, such as applying artificial intelligence to send more culturally appropriate and personalized messages, is urgently needed to raise women's awareness of their heart disease risk.
AI HeartBot to Increase Women's Awareness and Knowledge of Heart Attack: A Pilot Study (Preprint)
2025-07-09
preprint · Senior author
BACKGROUND: Heart disease remains a leading cause of death for women in the United States, yet awareness and knowledge are declining. Artificial intelligence (AI) chatbots have great potential to educate women. OBJECTIVE: To evaluate the potential efficacy of HeartBot to increase women’s awareness and knowledge of heart attack symptoms and care-seeking behavior. METHODS: In this pilot, quasi-experimental study, 92 women aged 25 years or older without history of heart disease completed the HeartBot interaction via Short Message Service. The study was remotely conducted from October 2023 to January 2024. HeartBot, a fully automated AI chatbot, covered 15 topics of heart attack awareness, knowledge, symptoms, and care-seeking in a single session. The mean (SD) length of the HeartBot interaction was 13.0 (7.80) minutes. The primary outcomes consist of 4 questions on (1) recognizing signs and symptoms of a heart attack, (2) telling the difference between the signs and symptoms of a heart attack, (3) calling an ambulance or dialing 911 when experiencing heart attack symptoms, and (4) getting to an emergency room within 60 minutes after the onset of symptoms of a heart attack. Women were asked to answer the 4 questions before and after the HeartBot interaction on a scale of 1-4, with higher score indicating higher levels of awareness and knowledge of heart attack risks and symptoms. RESULTS: The sample mean (SD) age was 45.9 (11.9) years. 55 (59.8%) of the sample represented racial/ethnic minorities. The mean (SD) length of the HeartBot interaction was 13.0 (7.80) minutes.
In ordinal logistic regression models, women significantly increased in the 4 self-reported outcomes (i.e., heart attack symptoms, calling 911) even after controlling for potential confounding factors (adjusted odds ratio (AOR) = 7.10, 95% CI: 3.52-13.16 for outcome 1; AOR=5.47, 95% CI: 2.77-10.78 for outcome 2; AOR=5.75, 95% CI: 2.86-11.59 for outcome 3; AOR=2.85, 95% CI: 1.54-5.25 for outcome 4; p < 0.001 for all 4 outcomes). CONCLUSIONS: HeartBot led to a significant increase in awareness and knowledge of heart attack risks and symptoms in women. These findings suggest that HeartBot is a promising approach to improve heart health education. A randomized controlled trial of HeartBot is warranted to establish its efficacy and safety for the clinical setting.
Frequent coauthors
- 49 shared
David Traum
- 39 shared
David DeVault
- 27 shared
Jun’ichi Tsujii
National Institute of Advanced Industrial Science and Technology
- 26 shared
Andrew S. Gordon
University of Southern California
- 21 shared
Eric Forbell
Creative Technologies (United States)
- 18 shared
Anton Leuski
University of Southern California
- 17 shared
Yusuke Miyao
- 17 shared
Fabrizio Morbini
Awards & honors
- UC Office of the President Research Grant to UC-wide Team -…