Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime

Louis Goldstein

Professor of Linguistics

University of Southern California · Linguistics

Active 1933–2026

h-index 45
Citations 9.4k
Papers 416 · 50 in the last 5 years
Funding $3.9M

Research topics

  • Computer Science
  • Artificial Intelligence
  • Speech recognition
  • Psychology
  • Audiology
  • Medicine
  • Pathology
  • Biology
  • Linguistics
  • Acoustics

Selected publications

  • A Long-Form Single-Speaker Real-Time MRI Speech Dataset and Benchmark

    2026-04-21

    article · Open access

    We release the USC Long Single-Speaker (LSS) dataset containing real-time MRI video of the vocal tract dynamics and simultaneous audio obtained during speech production. This unique dataset contains roughly one hour of video and audio data from a single native speaker of American English, making it one of the longer publicly available single-speaker datasets of real-time MRI speech data. Along with the articulatory and acoustic raw data, we release derived representations of the data that are suitable for a range of downstream tasks. This includes video cropped to the vocal tract region, sentence-level splits of the data, restored and denoised audio, and regions-of-interest timeseries. We also benchmark this dataset on articulatory synthesis and phoneme recognition tasks, providing baseline performance for these tasks on this dataset which future research can aim to improve upon. Dataset website: https://sail.usc.edu/span/single_spk

  • Arti-6: Towards six-dimensional Articulatory Speech Encoding

    2026-04-21

    article

    We propose ARTI-6, a compact six-dimensional articulatory speech encoding framework derived from real-time MRI data that captures crucial vocal tract regions including the velum, tongue root, and larynx. ARTI-6 consists of three components: (1) a six-dimensional articulatory feature set representing key regions of the vocal tract; (2) an articulatory inversion model, which predicts articulatory features from speech acoustics leveraging speech foundation models, achieving a prediction correlation of 0.87; and (3) an articulatory synthesis model, which reconstructs intelligible speech directly from articulatory features, showing that even a low-dimensional representation can generate natural-sounding speech. Together, ARTI-6 provides an interpretable, computationally efficient, and physiologically grounded framework for advancing articulatory inversion, synthesis, and broader speech technology applications. The source code and speech samples are publicly available.

  • Workshop 13 May 2024 : SPEECH PRODUCTION MODELS AND EMPIRICAL EVIDENCE FROM TYPICAL AND PATHOLOGICAL SPEECH

    HAL (Le Centre pour la Communication Scientifique Directe) · 2025-01-01

    dataset · Open access

    This unpublished document can be cited as: Fougeron C., Goldstein L., Guenther F., Lœvenbruck H., Mefferd A., Mücke D., Niziolek C., Parrel B., Perrier P., Ziegler W., Laganaro M. (unpublished manuscript) Transcription of the workshop Speech Production Models and Empirical Evidence from Typical and Pathological Speech. 13 May 2024, Grenoble, France. doi: 10.26037/yareta:cvlb5qujzzc3ti3pgxq62y76ui.

  • The stability of articulatory and acoustic oscillatory signals derived from speech

    JASA Express Letters · 2025-04-01

    article · Open access · Senior author

    Articulatory underpinnings of periodicities in the speech signal are unclear beyond a general alternation of vocal tract opening and closing. This study evaluates a modulatory articulatory signal that captures instantaneous change in vocal tract posture and its relation with two acoustic oscillatory signals, comparing stabilities to the progression of vowel and stressed vowel onsets. Modulatory signals can be calculated more efficiently than labeling linguistic events. These signals were more stable in periodicity than acoustic vowel onsets and not different from stressed vowel onsets, suggesting that an articulatory modulation function can provide a useful method for indexing foundational periodicities in speech without tedious annotation.

  • 75-Speaker Annot-16: A benchmark dataset for speech articulatory rt-MRI annotation with articulator contours and phonetic alignment

    2025-08-17

    article · Open access

    High-quality speech articulatory databases are essential for advancing speech science and technology research. However, the lack of standardized annotations limits their full potential use and broad accessibility. In this context, we introduce 75-Speaker Annot-16, a comprehensive annotation dataset derived from the 75-Speaker vocal tract MRI database. Annot-16 provides phonetic alignments, articulator contour annotations, and handmade ground-truth articulator contours. Our annotation process integrates automated algorithms with expert verification to ensure accuracy and efficiency. To demonstrate its utility, we establish three benchmark tasks: speech phoneme recognition, articulatory contour segmentation, and articulatory phoneme recognition. Annot-16 can serve as a valuable resource for speech modeling, computer vision, and cross-modal learning, bridging engineering applications, speech science, and linguistic research.

  • Co-registration of real-time MRI and respiration for speech research

    2025-08-17 · 1 citation

    article · Senior author

  • Towards disentangling the contributions of articulation and acoustics in multimodal phoneme recognition

    ArXiv.org · 2025-05-29

    preprint · Open access

    Although many previous studies have carried out multimodal learning with real-time MRI data that captures the audio-visual kinematics of the vocal tract during speech, these studies have been limited by their reliance on multi-speaker corpora. This prevents such models from learning a detailed relationship between acoustics and articulation due to considerable cross-speaker variability. In this study, we develop unimodal audio and video models as well as multimodal models for phoneme recognition using a long-form single-speaker MRI corpus, with the goal of disentangling and interpreting the contributions of each modality. Audio and multimodal models show similar performance on different phonetic manner classes but diverge on places of articulation. Interpretation of the models' latent space shows similar encoding of the phonetic space across audio and multimodal models, while the models' attention weights highlight differences in acoustic and articulatory timing for certain phonemes.

  • Articulatory Feature Prediction from Surface EMG during Speech Production

    2025-08-17 · 3 citations

    article

  • Articulatory Feature Prediction from Surface EMG during Speech Production

    ArXiv.org · 2025-05-20

    preprint · Open access

    We present a model for predicting articulatory features from surface electromyography (EMG) signals during speech production. The proposed model integrates convolutional layers and a Transformer block, followed by separate predictors for articulatory features. Our approach achieves a high prediction correlation of approximately 0.9 for most articulatory features. Furthermore, we demonstrate that these predicted articulatory features can be decoded into intelligible speech waveforms. To our knowledge, this is the first method to decode speech waveforms from surface EMG via articulatory features, offering a novel approach to EMG-based speech synthesis. Additionally, we analyze the relationship between EMG electrode placement and articulatory feature predictability, providing knowledge-driven insights for optimizing EMG electrode configurations. The source code and decoded speech samples are publicly available.

  • Instantaneous changes in acoustic signals reflect syllable progression and cross-linguistic syllable variation

    2025-08-17

    article · Senior author

Frequent coauthors

  • Shrikanth Narayanan

    109 shared
  • Dani Byrd

    77 shared
  • Elliot Saltzman

    Boston University

    75 shared
  • Hosung Nam

    57 shared
  • Michael Proctor

    43 shared
  • Marianne Pouplier

    Klinikum Saarbrücken

    42 shared
  • Vikram Ramanarayanan

    40 shared
  • Christine Mooshammer

    Humboldt-Universität zu Berlin

    39 shared
