Richard Futrell

Associate Professor and Graduate Director

University of California, Irvine · Communication

Active 1972–2025

h-index: 34

Citations: 4.6k

Papers: 178 (74 in the last 5 years)

Funding: $173k



About

Richard Futrell is an Associate Professor in the UC Irvine Department of Language Science where he leads the Language Processing Group. His research focuses on studying language processing in humans and machines through the application of information theory and Bayesian cognitive modeling. He also works on natural language processing (NLP) and AI interpretability, contributing to understanding the computational principles underlying language comprehension, production, and structure. His work explores how linguistic structures and processing efficiency are shaped by cognitive constraints and information-theoretic principles, often employing computational models to analyze language phenomena across different languages and modalities. Futrell's research has been recognized with awards such as the ACL Best Paper Award and the Best Paper Award for Computational Modeling of Language, reflecting his significant contributions to cognitive science and computational linguistics.

Research topics

Natural Language Processing
Artificial Intelligence
Computer Science
Linguistics
Programming language
Mathematics
Cognitive psychology
Psychology

Selected publications

Back to the Future: The Role of Past and Future Context Predictability in Incremental Language Production
arXiv (Cornell University) · 2025-11-11
preprint · Open access · Senior author
Contextual predictability shapes how we choose and encode words in production. The effects of a word's predictability given preceding or past context are generally well-understood in both production and comprehension, but studies of naturalistic production have also revealed a poorly-understood yet robust backward predictability effect of a word given only its future context, which may be linked to future planning. Across two studies of naturalistic speech, we revisit backward predictability using improved operationalizations, introducing a conceptually motivated information-theoretic measure that quantifies the information shared between a word and future context under the constraints imposed by the past context. Study 1 shows that this measure produces effects qualitatively similar to backward predictability while explaining unique variance in phonetic reduction. Study 2 examines substitution errors within a generative framework that models lexical, contextual, and communicative influences on word choice to predict the identity of the word that surfaces as an error. Within this framework, we find that past-conditioned predictability increases error likelihood, whereas future-conditioned predictability reduces it. Further, our proposed measure emerges as the strongest contextual predictor of error identity, subsuming backward predictability. Analysis of error types further reveals graded trade-offs in how speakers prioritize form-, meaning-, and context-based information during lexical planning. Together, these findings illuminate how past and future context shape word choice and encoding, linking contextual predictability to mechanisms of incremental planning in sentence production.
Informativity enhances memory robustness against interference in sentence comprehension
Journal of Memory and Language · 2025-01-18 · 5 citations
article · Open access · Senior author
Language comprehension has been argued to be expectation-based, with more predictable linguistic units being easier to process. However, as a communicative tool, language is often used to deliver messages that are novel and informative, suggesting the necessity of some cognitive mechanisms handling less predictable but more informative content. This paper proposes strategic memory allocation as one such mechanism. Although less predictable linguistic units require greater processing effort for memory encoding, recognizing the inconsistency between top-down predictions and bottom-up perceptual input may signal the working memory system to prioritize these units, enhancing the robustness of their representation against interference. We examine this hypothesis through the lens of the agreement attraction effect in two self-paced reading experiments. In Experiment 1, we find that less predictable but more informative target nouns exhibit weaker agreement attraction in online reading times, especially with more fine-grained measures of predictability such as the surprisal from large language models. This weaker agreement attraction effect for less predictable target nouns confirms our hypothesis that informative linguistic units are prioritized and receive more robust memory representation. In Experiment 2, however, no modulation of agreement attraction emerges when we manipulate the predictability of distractor nouns, suggesting the need for a more nuanced characterization of how information is structured and operated in memory. Our findings highlight an interplay of memory, predictive processing, and implicit learning. We also discuss the implications of our result for memory efficiency and memory compression. More broadly, by demonstrating that the limited memory resources are dynamically optimized for the relevant processing task, the current study highlights a connection to the resource-rational analysis of human cognition in general. 
• Limited working memory resources are strategically allocated based on informativity.
• Less predictable information is prioritized for working memory resources.
• Less predictable linguistic units are encoded with more robust memory representation.
• Less predictable target nouns exhibit weaker agreement attraction effect.
SPACER: A Parallel Dataset of Speech Production And Comprehension of Error Repairs
ArXiv.org · 2025-03-20
preprint · Open access · Senior author
Speech errors are a natural part of communication, yet they rarely lead to complete communicative failure because both speakers and comprehenders can detect and correct errors. Although prior research has examined error monitoring and correction in production and comprehension separately, integrated investigation of both systems has been impeded by the scarcity of parallel data. In this study, we present SPACER, a parallel dataset that captures how naturalistic speech errors are corrected by both speakers and comprehenders. We focus on single-word substitution errors extracted from the Switchboard corpus, accompanied by speakers' self-repairs and comprehenders' responses from an offline text-editing experiment. Our exploratory analysis suggests asymmetries in error correction strategies: speakers are more likely to repair errors that introduce greater semantic and phonemic deviations, whereas comprehenders tend to correct errors that are phonemically similar to more plausible alternatives or do not fit into prior contexts. Our dataset enables future research on integrated approaches toward studying language production and comprehension.
Strategic resource allocation in memory encoding: An efficiency principle shaping language processing
Journal of Memory and Language · 2025-11-04
article · Open access · Senior author
How is the limited capacity of working memory efficiently used to support human linguistic behaviors? In this paper, we propose Strategic Resource Allocation (SRA) as an efficiency principle for memory encoding in sentence processing. The idea is that working memory resources are dynamically and strategically allocated to prioritize novel and unexpected information. From a resource-rational perspective, we argue that SRA is the principled solution to a computational problem posed by two functional assumptions about working memory, namely its limited capacity and its noisy representation. Specifically, working memory needs to minimize the retrieval error of past inputs under the constraint of limited memory resources, an optimization problem whose solution is to allocate more resources to encode more surprising inputs with higher precision. One of the critical consequences of SRA is that surprising inputs are encoded with enhanced representations, and therefore are less susceptible to memory decay and interference. Empirically, through naturalistic corpus data, we find converging evidence for SRA in the context of dependency locality from both production and comprehension, where non-local dependencies with less predictable antecedents are associated with reduced locality effect. However, our results also reveal considerable cross-linguistic variability, suggesting the need for a closer examination of how SRA, as a domain-general memory efficiency principle, interacts with language-specific phrase structures. SRA highlights the critical role of representational uncertainty in understanding memory encoding. It also provides a reinterpretation for the effects of surprisal and entropy on processing difficulty from the perspective of efficient memory encoding.
• WM resources are strategically allocated to prioritize unexpected information.
• Theoretically, SRA arises as an efficient solution to a computational problem of WM.
• SRA predicts that high-surprisal inputs are encoded with higher precision.
• Higher precision in turn suggests more robust representation against interference.
• Empirically, we found that high-surprisal inputs exhibit reduced locality effect.
Creolization versus code-switching: An agent-based cognitive model for bilingual strategies in language contact
2025-01-01
article · Open access · Senior author
Creolization and code-switching are closely related contact-induced linguistic phenomena, yet little attention has been paid to the connection between them. In this paper, we propose an agent-based cognitive model which provides a linkage between these two phenomena focusing on the statistical regularization of language use. That is, we identify that creolization as a conventionalization process and code-switching as flexible language choice can be optimal solutions for the same cognitive model in different social environments. Our model postulates a social structure of bilingual and monolingual populations, in which a set of agents seek for optimal communicative strategy shaped by multiple cognitive constraints. The simulation results show that our model successfully captures both phenomena as two ends of a continuum, characterized by varying degrees of regularization in the use of linguistic constructions from multiple source languages. The model also reveals a subtle dynamic between social structure and individual-level cognitive constraints.
Clarifying orthography: Orthographic transparency as compressibility
ArXiv.org · 2025-05-19 · 1 citation
preprint · Open access · Senior author
Orthographic transparency -- how directly spelling is related to sound -- lacks a unified, script-agnostic metric. Using ideas from algorithmic information theory, we quantify orthographic transparency in terms of the mutual compressibility between orthographic and phonological strings. Our measure provides a principled way to combine two factors that decrease orthographic transparency, capturing both irregular spellings and rule complexity in one quantity. We estimate our transparency measure using prequential code-lengths derived from neural sequence models. Evaluating 22 languages across a broad range of script types (alphabetic, abjad, abugida, syllabic, logographic) confirms common intuitions about relative transparency of scripts. Mutual compressibility offers a simple, principled, and general yardstick for orthographic transparency.
How Linguistics Learned to Stop Worrying and Love the Language Models
Behavioral and Brain Sciences · 2025-07-24 · 5 citations
preprint · Open access · 1st author · Corresponding
Language models can produce fluent, grammatical text. Nonetheless, some maintain that language models don't really learn language and also that, even if they did, that would not be informative for the study of human learning and processing. On the other side, there have been claims that the success of LMs obviates the need for studying linguistic theory and structure. We argue that both extremes are wrong. LMs can contribute to fundamental questions about linguistic structure, language processing, and learning. They force us to rethink arguments and ways of thinking that have been foundational in linguistics. While they do not replace linguistic structure and theory, they serve as model systems and working proofs of concept for gradient, usage-based approaches to language. We offer an optimistic take on the relationship between language models and linguistics.
Linguistic structure from a bottleneck on sequential information processing
Nature Human Behaviour · 2025-11-24 · 2 citations
article · Open access · 1st author · Corresponding
Human language has a distinct systematic structure, where utterances break into individually meaningful words that are combined to form phrases. Here we show that natural-language-like systematicity arises in codes that are constrained by a statistical measure of complexity called predictive information, also known as excess entropy. Predictive information is the mutual information between the past and future of a stochastic process. In simulations, we find that codes that minimize predictive information break messages into groups of approximately independent features that are expressed systematically and locally, corresponding to words and phrases. Next, drawing on cross-linguistic text corpora, we find that actual human languages are structured in a way that yields low predictive information compared with baselines at the levels of phonology, morphology, syntax and lexical semantics. Our results establish a link between the statistical and algebraic structure of language and reinforce the idea that these structures are shaped by communication under general cognitive constraints.
Adaptation to noisy language input in real time: Evidence from ERPs
Underline Science Inc. · 2025-06-18
other · Open access
Language comprehension often deviates from the literal meaning of the input, particularly when errors resemble more plausible alternatives. Such non-literal interpretations have been associated with a reduced N400 and increased P600, but it remains debated whether these effects reflect perceptual misrepresentation of the input or error correction. One way to tease apart these accounts is to examine how comprehenders adapt to a noisy linguistic environment. A perceptual error account predicts that increased exposure to noise leads to habituation to errors and more misperception, resulting in reduced N400 and P600 responses. In contrast, an error correction account predicts that comprehenders perform more error correction in noisy environments, leading to increased P600s, and potentially modulated N400s depending on the timing of the correction. In this study, we manipulated the proportion of errors in non-critical exposure sentences and measured ERP responses to different types of anomalies. The results replicated prior findings of reduced N400s for recoverable errors. Results in the P600 window were not replicated and it remains an open question which framework (error correction vs. perceptual error) best accounts for the data. Further, the results revealed substantial individual differences in processing words which may contain errors with implications for how participants adapted to additional noise in the environment.

Recent grants

CRII: RI: Opening the black box of neural natural language processing models using machine-behavioral methods
NSF · $173k · 2020–2023

Frequent coauthors

Roger Levy
70 shared
Edward Gibson
68 shared
Evelina Fedorenko
Massachusetts Institute of Technology
51 shared
Kyle Mahowald
30 shared
Michael Hahn
28 shared
Ethan Wilcox
24 shared
Idan Blank
23 shared
Titus von der Malsburg
University of Stuttgart
18 shared

Labs

Language Processing Group · PI

Awards & honors

Winner of Best Paper Award for Computational Modeling of Lan…
Winner of ACL Best Paper Award (2024)
Winner of the Sayan Gul Award for Best Undergraduate Paper (…
Best Paper Award for Computational Modeling of Language (202…
