Suma Bhat

ADJ ASST PROF · Verified

University of Illinois Urbana-Champaign · Computer Science

Active 2007–2026

h-index: 19
Citations: 1.2k
Papers: 125 (61 in the last 5 years)
Funding: $200k
About

Suma Bhat is an Assistant Professor at the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. Her research interests include Natural Language Processing, Human-AI Interaction, and Computational Social Science. She has been recognized for her teaching excellence, receiving the ECE Ronald W. Pratt Faculty Outstanding Teaching Award in 2021. Her work focuses on advancing understanding and development in artificial intelligence, particularly in language processing and human-AI collaboration.

Research topics

  • Computer science
  • Artificial intelligence
  • Natural language processing
  • Linguistics
  • Psychology

Selected publications

  • Examining Students' Code Comprehension with LLMs in Block- and Text-Based Programming

    2026-02-13

    article · Open access

    Understanding how students reason about code is essential for providing tailored scaffolding in computer science (CS) education. Prior work has used think-aloud protocols with the Structure of the Observed Learning Outcomes (SOLO) taxonomy to examine students' code comprehension and programming levels. However, analyzing such data is labor-intensive and requires expert judgment. Recent advances in large language models (LLMs) offer a promising avenue for scaling this analysis, though their reliability for fine-grained coding remains uncertain. To address this gap, our study investigates the extent to which GPT-5 and 4o can classify SOLO levels and identify code-comprehension strategies from think-aloud transcripts of 27 high-school students working on block-based and text-based tasks. Results show modest alignment with human ratings for SOLO, with one-shot prompting improving agreement over zero-shot, though distinctions between adjacent lower levels (e.g., Prestructural 1 vs. 2) remained difficult. Strategy detection demonstrated stronger performance, achieving accuracies of 75–77% (block) and 62–67% (text), particularly for surface-visible strategies such as 'walkthroughs', 'control-structure identification', and 'pattern recognition', but weaker for less frequent, abstract, meta-cognitive strategies such as 'strategizing' (planning an approach) or 'thoroughness' (systematically checking work). These findings highlight both the potential and the limitations of using GPT-5 and 4o to analyze think-aloud data. While this work represents an initial step, with plans to examine more models, our preliminary results indicate that a human-in-the-loop approach is essential to ensure reliability and interpretive depth. Future work will extend this evaluation to other LLMs to better understand their role in supporting instructional decision-making.

  • Investigating High School Students' Code Comprehension and Strategy Use Across Block-Based and Text-Based Programming

    2026-02-13

    article · Open access

    Understanding how students comprehend code is essential for designing effective instructional support in computer science (CS). While prior studies have often relied on written responses, few have examined students' reasoning processes through think-aloud data. In this study, we analyzed the verbal reasoning of 27 high school students as they completed block-based and text-based code comprehension tasks targeting loops and conditional statements. Using an adapted SOLO taxonomy framework, we found that most students were classified at lower levels, with performance declining as they transitioned from block-based to text-based code. Students' strategy use, informed by prior work on code comprehension, showed that walkthroughs and identifying program structures were the most common approaches. Text-based tasks more often led students to use pattern-recognition strategies, such as interpreting operators or identifying numerical patterns, whereas block-based tasks occasionally prompted them to articulate broader problem-solving approaches. Overall, these findings demonstrate the value of applying the SOLO taxonomy to evaluate students' programming levels and highlight how programming modality impacts both the depth of understanding and the strategies students employ during code comprehension.

  • Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models

    2026-04-30

    preprint · Open access

    While large models pre-trained on high-quality data exhibit excellent performance on mathematical reasoning (e.g., GSM8k, MultiArith), it remains challenging to specialize smaller models for these tasks. Common approaches to address this challenge include knowledge distillation from large teacher models and data augmentation (e.g., rephrasing questions and generating synthetic solutions). Despite these efforts, smaller models struggle with arithmetic computations, leading to errors in mathematical reasoning. In this work, we leverage a synthetic arithmetic dataset generated programmatically to enhance the reasoning capabilities of smaller models. We investigate two key approaches to incorporate this dataset: (1) intermediate fine-tuning, in which a model is fine-tuned on the arithmetic dataset before training it on a reasoning dataset, and (2) integrating the arithmetic dataset into an instruction-tuning mixture, allowing the model to learn arithmetic skills alongside general instruction-following abilities. Our experiments on multiple reasoning benchmarks demonstrate that incorporating an arithmetic dataset, whether through targeted fine-tuning or within an instruction-tuning mixture, enhances models' arithmetic capabilities, thereby improving their mathematical reasoning performance.

  • A hybrid decision tree with rule-based and deep learning nodes for automated medical coding

    2025-01-01

    dissertation · Senior author

    With the growing digitization of healthcare data, automating the medical coding process has become increasingly important. Recent advances in machine learning and natural language processing (NLP) have led to promising approaches for automated medical coding using clinical notes and discharge summaries. Among these, deep learning excels at extracting complex patterns from unstructured text. However, it often requires large annotated datasets, significant computational resources, and lacks interpretability, which are key concerns in clinical settings. In this thesis, we adopt a hierarchical classification structure that mirrors the tree-like organization of the ICD coding system. To offer a scalable and efficient solution, we propose a hybrid decision tree (HDT) framework for automated ICD coding, which combines the efficiency of rule-based methods with the predictive power of deep learning models. Rather than relying on a single paradigm, the HDT approach determines, at each decision node, whether a lightweight rule-based classifier is sufficient or whether a more complex deep learning model is needed. For simpler nodes, where distinguishing features such as specific symptoms or keywords are easily identifiable, we classify medical codes using rule-based methods that apply statistical feature scoring based on term frequency and class-specific relevance. For more complex cases, where textual overlap between conditions makes rule-based classification unreliable, we employ deep learning models, particularly Long Short-Term Memory (LSTM) networks, to capture subtle semantic patterns in clinical text. We evaluate our approach using clinical notes and discharge summaries from the MIMIC-IV dataset. The results demonstrate that HDT offers a favorable trade-off by maintaining high prediction accuracy while significantly reducing inference time and resource consumption. 
    Furthermore, its modular design facilitates system scalability and adaptation to updates in the ICD coding system, making it well-suited for real-world deployment.

  • An LLM-Based Framework for Simulating, Classifying, and Correcting Students' Programming Knowledge with the SOLO Taxonomy

    2025-02-18 · 1 citation

    article · Senior author

    Novice programmers often face challenges in designing computational artifacts and fixing code errors, which can lead to task abandonment and over-reliance on external support. While research has explored effective meta-cognitive strategies to scaffold novice programmers' learning, it is essential to first understand and assess students' conceptual, procedural, and strategic/conditional programming knowledge at scale. To address this issue, we propose a three-model framework that leverages Large Language Models (LLMs) to simulate, classify, and correct student responses to programming questions based on the SOLO Taxonomy. The SOLO Taxonomy provides a structured approach for categorizing student understanding into four levels: Pre-structural, Uni-structural, Multi-structural, and Relational. Our results showed that GPT-4o achieved high accuracy in generating and classifying responses for the Relational category, with moderate accuracy in the Uni-structural and Pre-structural categories, but struggled with the Multi-structural category. The model successfully corrected responses to the Relational level. Although further refinement is needed, these findings suggest that LLMs hold significant potential for supporting computer science education by assessing programming knowledge and guiding students toward deeper cognitive engagement.

  • Medical Students' Perception of Automated Note Feedback After Simulated Encounters

    The Clinical Teacher · 2025-11-17

    article · Open access · Senior author

    BACKGROUND: Grading medical student patient notes (PNs) is resource-intensive. Natural language processing (NLP) offers a promising solution to automatically grade PNs. We deployed an automated grading system that uses NLP and explored the perceived value of PN feedback. APPROACH: The automated system graded written notes after two standardized patient encounters by third-year medical students. The system generated an individualized report on 'items found' and 'items not found' in the history, physical examination, and diagnosis sections, which was shared with students for feedback via a web-based interface. By rotation, block students received either the automated case feedback first or the faculty-written model note feedback first (the pre-intervention baseline). EVALUATION: After reviewing feedback, students completed surveys for both automated feedback and model note feedback and participated in follow-up focus groups. In total, 44 students received feedback, 37 completed surveys, and 28 participated in focus groups. Qualitative themes that emerged suggested the automated feedback was visually appealing and allowed for easy comparison of items found vs. missing, which would help improve students' documentation skills. Model note appeared trustworthy. IMPLICATIONS: We found automated systems can be a potential tool for formative feedback on note writing activity although in terms of quality it does not surpass the pre-existing feedback methods, such as model note feedback used in our study. Order effects may have influenced these perceptions and the small sample size limits generalizability. Tested software had occasional errors in recognizing a phrase or showing a false positive.

  • Study Partners Matter: Impacts on Inclusion and Outcomes

    2021 ASEE Virtual Annual Conference Content Access Proceedings · 2024-02-20 · 1 citation

    article · Open access · Senior author

    Her research contributes to the understanding of how young students learn mathematics, and the classroom contexts for learning. Her detailed work on teaching practices, teacher learning, and discourse practices in elementary mathematics classrooms has yielded important insights on teaching practices that are linked to student understanding.

  • Long-Form Analogy Evaluation Challenge

    2024-01-01 · 2 citations

    article · Open access

    Given the practical applications of analogies, recent work has studied analogy generation to explain concepts. However, not all generated analogies are of high quality and it is unclear how to measure the quality of this new kind of generated text. To address this challenge, we propose a shared task on automatically evaluating the quality of generated analogies based on seven comprehensive criteria. For this, we will set up a leaderboard based on our dataset annotated with manual ratings along the seven criteria, and provide a baseline solution leveraging GPT-4. We hope that this task would advance the progress in development of new evaluation metrics and methods for analogy generation in natural language, particularly for education.

  • The Relation Among Gender, Language, and Posting Type in Online Chemistry Course Discussion Forums

    2024-03-05 · 1 citation

    article · Senior author

    This study explored gendered language used in an online chemistry course’s discussion forums, to understand how using gendered language might help or hinder learning outcomes, while considering the goal of various posting structures required in the course. Findings revealed that although gendered-language use did not differ between men and women, gendered forms of language were widely used throughout the forums. The use of gendered language appeared strategic, however, and reliably varied by the goal of the discussion post (i.e., posting a solution to a homework problem, asking a question, or answering a question). Ultimately, gender, language and posting type were found to be related to final grade.

  • ElectroVizQA: How well do Multi-modal LLMs perform in Electronics Visual Question Answering?

    arXiv (Cornell University) · 2024-11-27 · 1 citation

    preprint · Open access · Senior author

    Multi-modal Large Language Models (MLLMs) are gaining significant attention for their ability to process multi-modal data, providing enhanced contextual understanding of complex problems. MLLMs have demonstrated exceptional capabilities in tasks such as Visual Question Answering (VQA); however, they often struggle with fundamental engineering problems, and there is a scarcity of specialized datasets for training on topics like digital electronics. To address this gap, we propose a benchmark dataset called ElectroVizQA specifically designed to evaluate MLLMs' performance on digital electronic circuit problems commonly found in undergraduate curricula. This dataset, the first of its kind tailored for the VQA task in digital electronics, comprises approximately 626 visual questions, offering a comprehensive overview of digital electronics topics. This paper rigorously assesses the extent to which MLLMs can understand and solve digital electronic circuit questions, providing insights into their capabilities and limitations within this specialized domain. By introducing this benchmark dataset, we aim to motivate further research and development in the application of MLLMs to engineering education, ultimately bridging the performance gap and enhancing the efficacy of these models in technical fields.

Frequent coauthors

  • Hongyu Gong

    31 shared
  • Pramod Viswanath

    22 shared
  • Michelle Perry

    University of Illinois Urbana-Champaign

    17 shared
  • Ziheng Zeng

    15 shared
  • Jianing Zhou

    15 shared
  • Tarek Sakakini

    University of Illinois Urbana-Champaign

    14 shared
  • Wanzheng Zhu

    University of Leeds

    12 shared
  • Jiaqi Mu

    11 shared

Education

  • Ph.D., Electrical and Computer Engineering

    University of Illinois Urbana-Champaign

    2010
  • MA, South and Southeast Asian Studies

    University of California, Berkeley

    2000
  • M.E., Electrical Engineering

    Indian Institute of Science Bangalore

    1996
  • BS, Statistics

    Mangalore University

    1992

Awards & honors

  • ECE Ronald W. Pratt Faculty Outstanding Teaching Award (07/0…