Mark Hasegawa-Johnson

· Professor, Electrical and Computer Engineering

University of Illinois Urbana-Champaign · Computer Science

Active 2000–2024

h-index 6

Citations 81

Papers 123 last 5y

Funding—

Faculty page

OpenAlex

About

Mark Hasegawa-Johnson has been on the faculty at the University of Illinois since 1999, where he is currently a Professor of Electrical and Computer Engineering. He received his Ph.D. in 1996 at MIT, with a thesis titled "Formant and Burst Spectral Measures with Quantitative Error Models for Speech Sound Classification," and was a post-doctoral researcher at UCLA from 1996 to 1999. His research is focused on automatic speech recognition, with an emphasis on the mathematization of linguistic concepts. His group has developed mathematical models of linguistic concepts such as a rudimentary model of pre-conscious speech perception, models interpreting pronunciation variability through tracking speech movements, and models utilizing prosody to disambiguate sentences. His recent application successes include speech recognition for talkers with cerebral palsy, retrieval of broadcast television segments in multiple languages based on phonetic queries, automatic detection and labeling of non-speech audio events, and testing of software and methods for teaching Chinese in Mandarin language classrooms. Prof. Hasegawa-Johnson has published over 450 peer-reviewed articles, patents, and conference papers in areas including machine learning models of articulatory and acoustic phonetics, prosody, dysarthria, non-speech acoustic events, audio source separation, and under-resourced languages. He is a Fellow of the Acoustical Society of America, IEEE, and the International Speech Communication Association, and currently serves as Editor-in-Chief of the IEEE Transactions on Audio, Speech and Language Processing.

Research topics

Artificial Intelligence
Computer Science
Speech recognition
Audiology
Mathematics
Engineering
Computer vision

Selected publications

Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility
Interspeech 2022 · 2024 · 13 citations
Senior author · Corresponding
- Computer Science
- Speech recognition
- Artificial Intelligence
This paper enhances dysarthric and dysphonic speech recognition by fine-tuning pretrained automatic speech recognition (ASR) models on the 2023-10-05 data package of the Speech Accessibility Project (SAP), which contains the speech of 253 people with Parkinson's disease. Experiments tested methods that have been effective for Cerebral Palsy, including the use of speaker clustering and severity-dependent models, weighted fine-tuning, and multi-task learning. Best results were obtained using a multi-task learning model, in which the ASR is trained to produce an estimate of the speaker's impairment severity as an auxiliary output. The resulting word error rates are considerably improved relative to a baseline model fine-tuned using only LibriSpeech data, with word error rate improvements of 37.62% and 26.97% compared to fine-tuning on 100h and 960h of LibriSpeech data, respectively.
DOI
InfantMotion2Vec: Unlabeled Data-Driven Infant Pose Estimation Using a Single Chest IMU
2024 · 1 citation
- Artificial Intelligence
- Computer Science
Early identification of neuro-developmental risks in infants is crucial for timely intervention and improved quality of life. Current screening methods are costly, intrusive, and limited by artificial environments or require the infant to wear multiple sensors. To address these challenges, we propose a novel approach leveraging inertial measurement units (IMUs) to monitor infants' spontaneous motor abilities in natural settings. Our method introduces a hierarchical semi-supervised classifier and the InfantMotion2Vec embedding to capture detailed motion patterns, accommodating a wide age range (up to 36 months) while minimizing reliance on labeled data and cumbersome sensor setups. We collected labeled IMU data from 25 families and unlabeled data from 42 families using a single wearable sensor. Pretraining an embedding network using unlabeled data with a hierarchical pose estimator resulted in a 26% increase in F1-score and a 77.7% increase in Cohen's Kappa score compared to using only labeled data. The InfantMotion2Vec embedding adequately handles highly unbalanced labeled data, demonstrating its effectiveness in infant posture classification.
DOI
Sources of Hallucination by Large Language Models on Inference Tasks
2023 · 117 citations
- Computer Science
- Artificial Intelligence
- Natural Language Processing
Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level of sentences: we show that, regardless of the premise, models falsely label NLI test samples as entailing when the hypothesis is attested in training data, and that entities are used as “indices” to access the memorized data. Second, statistical patterns of usage learned at the level of corpora: we further show a similar effect when the premise predicate is less frequent than that of the hypothesis in the training data, a bias following from previous studies. We demonstrate that LLMs perform significantly worse on NLI test samples which do not conform to these biases than those which do, and we offer these as valuable controls for future LLM evaluation.
Publisher OA PDF DOI
Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio
arXiv (Cornell University) · 2023 · 1 citation
- Computer Science
- Artificial Intelligence
To perform automatic family audio analysis, past studies have collected recordings using phone, video, or audio-only recording devices like LENA, investigated supervised learning methods, and used or fine-tuned general-purpose embeddings learned from large pretrained models. In this study, we advance the audio component of a new infant wearable multi-modal device called LittleBeats (LB) by learning family audio representation via wav2vec 2.0 (W2V2) pretraining. We show that, given a limited number of labeled LB home recordings, W2V2 pretrained using 1k-hour of unlabeled home recordings outperforms oracle W2V2 pretrained on 960-hour unlabeled LibriSpeech in terms of parent/infant speaker diarization (SD) and vocalization classifications (VC) at home. Extra relevant external unlabeled and labeled data further benefit W2V2 pretraining and fine-tuning. With SpecAug and environmental speech corruptions, we obtain 12% relative gain on SD and moderate boost on VC. Code and model weights are available.
DOI
Mitigation of SARS-CoV-2 transmission at a large public university
Nature Communications · 2022 · 33 citations
- Computer Science
- Medicine
- Environmental health
In Fall 2020, universities saw extensive transmission of SARS-CoV-2 among their populations, threatening health of the university and surrounding communities, and viability of in-person instruction. Here we report a case study at the University of Illinois at Urbana-Champaign, where a multimodal "SHIELD: Target, Test, and Tell" program, with other non-pharmaceutical interventions, was employed to keep classrooms and laboratories open. The program included epidemiological modeling and surveillance, fast/frequent testing using a novel low-cost and scalable saliva-based RT-qPCR assay for SARS-CoV-2 that bypasses RNA extraction, called covidSHIELD, and digital tools for communication and compliance. In Fall 2020, we performed >1,000,000 covidSHIELD tests, positivity rates remained low, we had zero COVID-19-related hospitalizations or deaths amongst our university community, and mortality in the surrounding Champaign County was reduced more than 4-fold relative to expected. This case study shows that fast/frequent testing and other interventions mitigated transmission of SARS-CoV-2 at a large public university.
Publisher OA PDF DOI
Antiarrhythmic Hit to Lead Refinement in a Dish Using Patient-Derived iPSC Cardiomyocytes
Journal of Medicinal Chemistry · 2021 · 11 citations
- Chemistry
- Paleontology
selectivity, and decreased avidity for the potassium channel. This study highlights using hiPSC-CMs to guide medicinal chemistry and "drug development in a dish".
Publisher DOI
Emergency ventilator for COVID-19
PLoS ONE · 2020 · 45 citations
- Computer Science
- Medicine
The COVID-19 pandemic disrupted the world in 2020 by spreading at unprecedented rates and causing tens of thousands of fatalities within a few months. The number of deaths dramatically increased in regions where the number of patients in need of hospital care exceeded the availability of care. Many COVID-19 patients experience Acute Respiratory Distress Syndrome (ARDS), a condition that can be treated with mechanical ventilation. In response to the need for mechanical ventilators, we designed and tested an emergency ventilator (EV) that can control a patient's peak inspiratory pressure (PIP) and breathing rate, while keeping a positive end expiratory pressure (PEEP). This article describes the rapid design, prototyping, and testing of the EV. The development process was enabled by rapid design iterations using additive manufacturing (AM). In the initial design phase, iterations between design, AM, and testing enabled a working prototype within one week. The designs of the 16 different components of the ventilator were locked by additively manufacturing and testing a total of 283 parts having parametrically varied dimensions. In the second stage, AM was used to produce 75 functional prototypes to support engineering evaluation and animal testing. The devices were tested over more than two million cycles. We also developed an electronic monitoring system with an automatic alarm to provide for safe operation, along with training materials and user guides. The final designs are available online under a free license. The designs have been transferred to more than 70 organizations in 15 countries. This project demonstrates the potential for ultra-fast product design, engineering, and testing of medical devices needed for COVID-19 emergency response.
Publisher OA PDF DOI

Frequent coauthors

Nancy L. McElwain
University of Illinois Urbana-Champaign
2 shared
Zhengyou Zhang
Tencent (China)
2 shared
Ken Chen
The University of Texas MD Anderson Cancer Center
2 shared
Ming Liu
1 shared
Jont B. Allen
University of Illinois Urbana-Champaign
1 shared
Mübeccel Demirekler
Middle East Technical University
1 shared
Thomas S. Huang
University of Illinois Urbana-Champaign
1 shared

Awards & honors

Fellow of the IEEE (2020)
Fellow of the Acoustical Society of America (2011)
Fellow of the International Speech Communication Association
Individual National Research Service Award, National Institu…
Frederic Vinton Hunt Post-Doctoral Fellowship, Acoustical So…
