Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Steven Bethard

Assistant Professor, School of Information · Verified

University of Arizona · Computer Science

Active 2002–2025

h-index: 48
Citations: 15.5k
Papers: 208 (70 in the last 5 years)
Funding: $5.0M

About

Steven Bethard is a professor whose research focuses on natural language processing and information extraction. His work develops algorithms and models for extracting and understanding structured information from unstructured text, drawing on large language models and transformer-based architectures. He has supervised students at the undergraduate, master's, doctoral, and postdoctoral levels, with theses and dissertations on structured information extraction, geocoding, linguistic knowledge probing, bias detection, and neural network algorithms for ontology-informed information extraction. His contributions aim to advance machine understanding of language and to improve applications in complex reasoning, data quality, and information retrieval.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Natural Language Processing
  • Machine Learning
  • Data Mining
  • World Wide Web
  • Speech recognition
  • Algorithm
  • Psychology
  • Statistics
  • Mathematics
  • Programming language
  • Engineering

Selected publications

  • A Semantic Parsing Framework for End-to-End Time Normalization

    ArXiv.org · 2025-07-08

    preprint · Open access · Senior author

    Time normalization is the task of converting natural language temporal expressions into machine-readable representations. It underpins many downstream applications in information retrieval, question answering, and clinical decision-making. Traditional systems based on the ISO-TimeML schema limit expressivity and struggle with complex constructs such as compositional, event-relative, and multi-span time expressions. In this work, we introduce a novel formulation of time normalization as a code generation task grounded in the SCATE framework, which defines temporal semantics through symbolic and compositional operators. We implement a fully executable SCATE Python library and demonstrate that large language models (LLMs) can generate executable SCATE code. Leveraging this capability, we develop an automatic data augmentation pipeline using LLMs to synthesize large-scale annotated data with code-level validation. Our experiments show that small, locally deployable models trained on this augmented data can achieve strong performance, outperforming even their LLM parents and enabling practical, accurate, and interpretable time normalization.

  • Author response for "Quantifying the substantive influence of public comment on United States federal environmental decisions under NEPA"

    2025-05-20

    peer-review
  • Author response for "Quantifying the substantive influence of public comment on United States federal environmental decisions under NEPA"

    2025-03-03

    peer-review
  • Applying Transformer Architectures to Detect Cynical Comments in Spanish Social Media

    2025-01-01

    article · Open access

    Detecting cynical comments in online communication poses a significant challenge in human-computer interaction, especially given the massive proliferation of discussions on platforms like YouTube. These comments often include offensive or disruptive patterns, such as sarcasm, negative feelings, specific reasons, and an attitude of being right. To address this problem, we present a web platform for the Spanish language that has been developed and leverages natural language processing and machine learning techniques. The platform detects comments and provides valuable information to users by focusing on analyzing comments. The core models are based on pre-trained architectures, including BETO, SpanBERTa, Multilingual BERT, RoBERTuito, and BERT, enabling robust detection of cynical comments. Our platform was trained and tested with Spanish comments from car analysis channels on YouTube. The results show that models achieve performance above 0.8 F1 for all types of cynical comments in the text classification task but achieve lower performance (around 0.6-0.7 F1) for the more arduous token classification task.

  • Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information

    SSRN Electronic Journal · 2025-01-01 · 1 citation

    preprint · Open access
  • Quantifying the substantive influence of public comment on United States federal environmental decisions under NEPA

    Environmental Research Letters · 2025-05-30

    article · Open access

    A citizen’s right to comment on, and criticize, government decisions makes a difference. The U.S. National Environmental Policy Act of 1969 (NEPA) institutionalized public engagement in environmental review in the belief it would lead to better decisions and more sustainable outcomes. But, 50 years later, NEPA’s public comment process has been criticized as costly and slow, while doing little to change outcomes. Data science now makes it possible to track progress and evaluate the influence of public participation. We examined 108 environmental impact statement (EIS) processes spanning 22 years. Our analysis revealed that public comments resulted in substantive decision alterations in 62% of cases, with 64% showing modifications to alternatives, 42% showing modifications to mitigation plans and 11% leading to the selection of an entirely new preferred alternative. When federal agencies changed project alternatives (78 EISs), 88% of the time (69 of the 78 EISs) they credited public comments as the reason. In 45 of the 108 EISs, agencies modified mitigation plans and credited public comments as the reason 100% of the time. Agencies only occasionally selected a new preferred alternative (21 out of 104 EISs), but when they did, they credited public comments as the reason 100% of the time. As the United States and the 190+ states and countries that have adopted NEPA’s example consider how to address environmental change, it is important to assess the role of public participation in environmental decision making. Our data say public comments matter.

  • Transformer-Based Temporal Information Extraction and Application: A Review

    2025-01-01 · 2 citations

    article · Open access · Senior author
  • Identifying task groupings for multi-task learning using pointwise V-usable information

    Journal of Biomedical Informatics · 2025-07-16 · 2 citations

    article
  • Improving Toponym Resolution by Predicting Attributes to Constrain Geographical Ontology Entries

    2024-01-01 · 2 citations

    article · Open access · Senior author

    Zeyu Zhang, Egoitz Laparra, Steven Bethard. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers). 2024.

  • Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning

    2024-01-01 · 2 citations

    article · Open access

    Xin Su, Tiep Le, Steven Bethard, Phillip Howard. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2024.

Frequent coauthors

  • James Pustejovsky

    60 shared
  • Guergana Savova

    Harvard University

    55 shared
  • Leon Derczynski

    43 shared
  • Marc Verhagen

    Brandeis University

    43 shared
  • Timothy A. Miller

    35 shared
  • Wei-Te Chen

    25 shared
  • Dmitriy Dligach

    25 shared
  • Chen Lin

    Shanghai Artificial Intelligence Laboratory

    24 shared

Education

  • Ph.D., Computer Science and Cognitive Science

    University of Colorado Boulder
