Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.


John T. Hale · Professor

Johns Hopkins University · Neuroscience

Active 1930–2025

h-index: 25
Citations: 4.4k
Papers: 92 (32 last 5y)
Funding: $1.4M

About

John T. Hale is a Professor in the Department of Cognitive Science at Johns Hopkins University. He received his PhD in Cognitive Science from Johns Hopkins University in 2003. His research uses computational modeling and neuroimaging to study human language processing. He has previously taught at Michigan State University, Cornell University, and the University of Georgia, and was a full-time research scientist at Google DeepMind from 2017 to 2018. His research interests include computational linguistics, corpus methods, psycholinguistics, and neurolinguistics. Professor Hale has contributed to the understanding of language processing through his work published in the journal Cognitive Science and the Annual Review of Linguistics.

Research topics

  • Natural Language Processing
  • Computer Science
  • Artificial Intelligence
  • Psychology
  • Machine Learning
  • Programming language
  • Cognitive science
  • Linguistics
  • Speech recognition

Selected publications

  • Evaluating the timecourses of morpho-orthographic, lexical, and grammatical processing following rapid parallel visual presentation: An EEG investigation in English

    Cognition · 2025-02-07 · 8 citations

    article
  • Simulated Hearing Loss on Speech Recognition, Flight Performance, and Workload in Aviators

    Aerospace Medicine and Human Performance · 2025-04-01

    article

    INTRODUCTION: Hearing loss can compromise U.S. Army aviators' performance, safety, and situational awareness, resulting in increasing mental workload and listening effort. This study evaluated simulated hearing loss on performance and cognitive workload among Army aviators. METHODS: A mixed-effects linear regression study design was used. A total of 21 aviators underwent clinical audiological testing and simulated flight performance assessments. Simulated hearing loss and workload were manipulated to investigate their effects on speech recognition, flight performance, and subjective workload. Flight simulator routes included normal hearing and simulated hearing loss conditions for both high and low workloads. Task load questionnaires were administered for subjective workload assessments and compared across conditions. RESULTS: Speech recognition scores decreased with increasing levels of hearing loss. In-flight speech intelligibility declined in high workload conditions, with a 26% decrease for mild hearing loss and a 40% decrease for severe hearing loss. High workload conditions degraded flight performance and response times to a secondary task which was exacerbated by simulated hearing loss. Workload scores validated increased workload with simulated hearing loss. No significant findings were observed on the hearing assessment. DISCUSSION: Findings suggest hearing loss negatively impacts speech recognition and flight performance, especially under high workloads. These results support the importance of addressing hearing loss in aviators. Further research is needed to determine if the clinically adapted Modified Rhyme Test can reflect the impact of hearing loss on aviator performance. Noetzel J, Henry P, Mackie R, Cave K, Stefanson JR, Hale JK, Andres K, Jones H. Simulated hearing loss on speech recognition, flight performance, and workload in aviators. Aerosp Med Hum Perform. 2025; 96(4):269-278.

  • Hierarchical syntactic structure in human-like language models

    2024-01-01 · 1 citation

    article · Open access · Senior author

    Language models (LMs) are a meeting point for cognitive modeling and computational linguistics. How should they be designed to serve as adequate cognitive models? To address this question, this study contrasts two Transformer-based LMs that share the same architecture. Only one of them analyzes sentences in terms of explicit hierarchical structure. Evaluating the two LMs against fMRI time series via the surprisal complexity metric, the results implicate the superior temporal gyrus. This underlines the need for hierarchical sentence structure in word-by-word models of human language comprehension.

  • Le Petit Prince Hong Kong (LPPHK): Naturalistic fMRI and EEG data from older Cantonese speakers

    Scientific Data · 2024-09-11 · 9 citations

    article · Open access

    Currently, the field of neurobiology of language is based on data from only a few Indo-European languages. The majority of this data comes from younger adults neglecting other age groups. Here we present a multimodal database which consists of task-based and resting state fMRI, structural MRI, and EEG data while participants over 65 years old listened to sections of the story The Little Prince in Cantonese. We also provide data on participants' language history, lifetime experiences, linguistic and cognitive skills. Audio and text annotations, including time-aligned speech segmentation and prosodic information, as well as word-by-word predictors such as frequency and part-of-speech tagging derived from natural language processing (NLP) tools are included in this database. Both MRI and EEG data diagnostics revealed that the data has good quality. This multimodal database could advance our understanding of spatiotemporal dynamics of language comprehension in the older population and help us study the effects of healthy aging on the relationship between brain and behaviour.

  • Multipath parsing in the brain

    arXiv (Cornell University) · 2024-01-31 · 1 citation

    preprint · Open access · Senior author

    Humans understand sentences word-by-word, in the order that they hear them. This incrementality entails resolving temporary ambiguities about syntactic relationships. We investigate how humans process these syntactic ambiguities by correlating predictions from incremental generative dependency parsers with timecourse data from people undergoing functional neuroimaging while listening to an audiobook. In particular, we compare competing hypotheses regarding the number of developing syntactic analyses in play during word-by-word comprehension: one vs more than one. This comparison involves evaluating syntactic surprisal from a state-of-the-art dependency parser with LLM-adapted encodings against an existing fMRI dataset. In both English and Chinese data, we find evidence for multipath parsing. Brain regions associated with this multipath effect include bilateral superior temporal gyrus.

  • Evaluating the timecourses of morpho-orthographic, lexical, and grammatical processing following rapid parallel visual presentation: an EEG investigation in English

    bioRxiv (Cold Spring Harbor Laboratory) · 2024-04-11 · 4 citations

    preprint · Open access

    Theories of language processing – and typical experimental methodologies – emphasize the word-by-word processing of sentences. This paradigm is good for approximating speech or careful text reading, but arguably, not for the common, cursory glances used while reading short sentences (e.g., cellphone notifications, social media posts). How can we interpret a sentence in a single glance? In an electroencephalography (EEG) study, brain responses to grammatical sentences (the dogs chase a ball) presented for 200ms diverged from non-lexical consonant strings (thj rjxb zkhtb w lhct) ∼160ms post-sentence onset and from scrambled constructions (a dogs chase ball the) ∼250ms post-sentence onset, demonstrating – at different time points – rapid recognition and cursory analysis of linguistic stimuli. In the grammatical sentences, unigram probability correlated with EEG data ∼150–300ms post-sentence onset, and probability of the word given its context estimated by BERT correlated with EEG data after ∼700–800ms. EEG responses did not diverge between grammatical sentences and their counterparts with ungrammatical agreement (the dogs chases a ball), although EEG responses did diverge for plural vs. singular morphology at ∼200ms. These results suggest that ‘at-a-glance’ reading is possible, based on coactivation of individual lexical items, morphological structures, and constituent structure at ∼200-300ms, but that words are not integrated into a coherent syntactic/semantic analysis, as evidenced by the substantially later responses to BERT probability and the absence of sensitivity to agreement errors.

  • Le Petit Prince Hong Kong (LPPHK): Naturalistic fMRI and EEG data from older Cantonese speakers

    bioRxiv (Cold Spring Harbor Laboratory) · 2024-04-28

    preprint · Open access

    Currently, the field of neurobiology of language is based on data from only a few Indo-European languages. The majority of this data comes from younger adults neglecting other age groups. Here we present a multimodal database which consists of task-based and resting state fMRI, structural MRI, and EEG data while participants over 65 years old listened to sections of the story The Little Prince in Cantonese. We also provide data on participants’ language history, lifetime experiences, linguistic and cognitive skills. Audio and text annotations, including time-aligned speech segmentation and prosodic information, as well as word-by-word predictors such as frequency and part-of-speech tagging derived from natural language processing (NLP) tools are included in this database. Both MRI and EEG data diagnostics revealed that the data has good quality. This multimodal database could advance our understanding of spatiotemporal dynamics of language comprehension in the older population and help us study the effects of healthy aging on the relationship between brain and behaviour.

  • Do LLMs learn a true syntactic universal?

    2024-01-01 · 1 citation

    article · Open access · 1st author · Corresponding

    Do large multilingual language models learn language universals? We consider a much discussed candidate universal, the Final-over-Final Condition (Sheehan et al., 2017b). This Condition is syntactic in the sense that it can only be stated by reference to abstract sentence properties such as nested phrases and head direction. A study of typologically diverse "mixed head direction" languages confirms that the Condition holds in corpora. But in a targeted syntactic evaluation, Gemini Pro only seems to respect the Condition in German, Russian, Hungarian and Serbian. These relatively high-resource languages contrast with Basque, where Gemini Pro does not seem to have learned the Condition at all. This result suggests that modern language models may need additional sources of bias in order to become truly human-like, within a developmentally realistic budget of training data.

  • Neural Correlates of Object-Extracted Relative Clause Processing Across English and Chinese

    Neurobiology of Language · 2023-01-01 · 2 citations

    article · Open access · Senior author

    Are the brain bases of language comprehension the same across all human languages, or do these bases vary in a way that corresponds to differences in linguistic typology? English and Mandarin Chinese attest such a typological difference in the domain of relative clauses. Using functional magnetic resonance imaging with English and Chinese participants, who listened to the same translation-equivalent story, we analyzed neuroimages time aligned to object-extracted relative clauses in both languages. In a general linear model analysis of these naturalistic data, comprehension was selectively associated with increased hemodynamic activity in left posterior temporal lobe, angular gyrus, inferior frontal gyrus, precuneus, and posterior cingulate cortex in both languages. This result suggests the processing of object-extracted relative clauses is subserved by a common collection of brain regions, regardless of typology. However, there were also regions that were activated uniquely in our Chinese participants albeit not to a significantly greater degree. These were in the temporal lobe. These Chinese-specific results could reflect structural ambiguity-resolution work that must be done in Chinese but not English object-extracted relative clauses.

  • Modeling Structure‐Building in the Brain With CCG Parsing and Large Language Models

    Cognitive Science · 2023-07-01 · 32 citations

    article · Open access · Senior author

    To model behavioral and neural correlates of language comprehension in naturalistic environments, researchers have turned to broad-coverage tools from natural-language processing and machine learning. Where syntactic structure is explicitly modeled, prior work has relied predominantly on context-free grammars (CFGs), yet such formalisms are not sufficiently expressive for human languages. Combinatory categorial grammars (CCGs) are sufficiently expressive directly compositional models of grammar with flexible constituency that affords incremental interpretation. In this work, we evaluate whether a more expressive CCG provides a better model than a CFG for human neural signals collected with functional magnetic resonance imaging (fMRI) while participants listen to an audiobook story. We further test between variants of CCG that differ in how they handle optional adjuncts. These evaluations are carried out against a baseline that includes estimates of next-word predictability from a transformer neural network language model. Such a comparison reveals unique contributions of CCG structure-building predominantly in the left posterior temporal lobe: CCG-derived measures offer a superior fit to neural signals compared to those derived from a CFG. These effects are spatially distinct from bilateral superior temporal effects that are unique to predictability. Neural effects for structure-building are thus separable from predictability during naturalistic listening, and those effects are best characterized by a grammar whose expressive power is motivated on independent linguistic grounds.
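
Several of the abstracts above compare language models with brain data through word-by-word surprisal, the negative log probability of a word given its context. As a rough illustration only, the Python sketch below computes surprisal from a toy bigram model with add-one smoothing. The function name `bigram_surprisals` and the two-sentence corpus are hypothetical, and the studies listed here use parsers and Transformer language models rather than bigram counts.

```python
import math
from collections import Counter

# Hypothetical toy corpus, for illustration only.
TOY_CORPUS = [
    ["the", "dogs", "chase", "a", "ball"],
    ["the", "dogs", "sleep"],
]


def bigram_surprisals(corpus, sentence):
    """Return (word, surprisal in bits) pairs for `sentence`.

    Surprisal is -log2 P(word | previous word), where P comes from a
    bigram model with add-one smoothing estimated on `corpus`.
    """
    start = "<s>"  # sentence-start symbol
    vocab = {w for sent in corpus for w in sent} | {start}
    context_counts = Counter()  # how often each word occurs as a context
    bigram_counts = Counter()   # how often each (context, word) pair occurs
    for sent in corpus:
        prev = start
        for word in sent:
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
            prev = word

    vocab_size = len(vocab)
    results = []
    prev = start
    for word in sentence:
        # Add-one smoothing keeps unseen bigrams from getting probability 0.
        prob = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
        results.append((word, -math.log2(prob)))
        prev = word
    return results


if __name__ == "__main__":
    for word, bits in bigram_surprisals(TOY_CORPUS, ["the", "dogs", "chase"]):
        print(f"{word}: {bits:.2f} bits")
```

Less predictable words receive higher surprisal: in this toy corpus, "chase" is less expected after "dogs" than "dogs" is after "the". In the neuroimaging work described above, per-word values of this kind, taken from much richer models, serve as predictors of the recorded brain signal.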

Frequent coauthors

  • Christophe Pallier

    Cognitive Neuroimaging Lab

    46 shared
  • Jonathan Brennan

    University of Michigan–Ann Arbor

    33 shared
  • Matthew J. Nelson

    University of Alabama at Birmingham

    27 shared
  • Shohini Bhattasali

    University of Toronto

    27 shared
  • Jixing Li

    20 shared
  • Miloš Stanojević

    18 shared
  • Donald Dunagan

    University of Georgia

    16 shared
  • Maximin Coavoux

    Laboratoire d'Informatique de Grenoble

    10 shared
