ChengXiang Zhai

Donald Biggar Willett Professor in Engineering · Verified

University of Illinois Urbana-Champaign · Computer Science

Active 1990–2026

h-index: 83
Citations: 29.8k
Papers: 553 (126 in the last 5 years)
Funding: $1.9M

About

ChengXiang Zhai is the Donald Biggar Willett Professor in Engineering at the University of Illinois Urbana-Champaign, affiliated with the Siebel School of Computing and Data Science. His research areas include Artificial Intelligence, Bioinformatics and Computational Biology, Computers and Education, and Data and Information Systems. He has received numerous awards for his research and teaching, including the ACM SIGIR Gerard Salton Award in 2021, the ACM SIGIR Academy Membership in 2020, and the Presidential Early Career Award for Scientists and Engineers in 2004. Zhai has also been recognized for excellence in graduate student mentoring and undergraduate advising, and has received multiple teaching awards. His professional contributions are distinguished by his leadership in research and education within the field of computing and data science.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Engineering
  • Data Mining
  • Natural Language Processing
  • Computational biology
  • Biology
  • Epistemology
  • Philosophy
  • Biochemical engineering
  • Linguistics
  • Data science

Selected publications

  • Globally Optimal Training of Spiking Neural Networks via Parameter Reconstruction

    ArXiv.org · 2026-05-08

    article · Open access · Senior author

    Spiking Neural Networks (SNNs) have been proposed as biologically plausible and energy-efficient alternatives to conventional Artificial Neural Networks (ANNs). However, the training of SNNs usually relies on surrogate gradients due to the non-differentiability of the spike function, introducing approximation errors that accumulate across layers. To address this challenge, we extend the work on convexification of parallel feedforward threshold networks to parallel recurrent threshold networks, which subsume parallel SNNs as a structured special case. Building on this theoretical framework, we propose a parameter reconstruction algorithm for SNN training that demonstrates consistent and significant advantages across various tasks, both as a standalone method and in combination with surrogate-gradient training. The ablations further demonstrate the data scalability and robustness to model configurations of our training algorithm, pointing toward its potential in large-scale SNN training.

  • SimLab: A Platform for Simulation-based Evaluation of Conversational Information Access Systems

    ArXiv.org · 2025-07-07

    preprint · Open access · Senior author

    Progress in conversational information access (CIA) systems has been hindered by the difficulty of evaluating such systems with reproducible experiments. While user simulation offers a promising solution, the lack of infrastructure and tooling to support this evaluation paradigm remains a significant barrier. To address this gap, we introduce SimLab, the first cloud-based platform providing a centralized solution for the community to benchmark both conversational systems and user simulators in a controlled and reproducible setting. We articulate the requirements for such a platform and propose a general infrastructure to meet them. We then present the design and implementation of an initial version of SimLab and showcase its features through an initial simulation-based evaluation task in conversational movie recommendation. Furthermore, we discuss the platform's sustainability and future opportunities for development, inviting the community to drive further progress in the fields of CIA and user simulation.

  • InstInfo: A Just-in-Time Literature Recommendation System for Presentations

    2025-07-13

    article · Open access · Senior author

    The efficient discovery of academic literature is critical for research progress, yet many researchers have difficulties in finding literature. This work proposes InstInfo: a novel just-in-time literature recommendation system for presentations. InstInfo transcribes audio in real-time and recommends literature according to the ideas being discussed, thereby helping researchers ground presentations in academic literature while saving them the time of having to manually search. Informal usability studies show that InstInfo is easy to use and that researchers find value in the recommendations. InstInfo can be accessed at https://instinfo.com.

  • Knowledge-Centered Dual-Process Reasoning for Math Word Problems With Large Language Models

    IEEE Transactions on Knowledge and Data Engineering · 2025-04-01 · 6 citations

    article

    Math word problem (MWP) serves as a critical milestone for assessing the text mining ability and knowledge mastery level of models. Recent advancements have witnessed large language models (LLMs) showcasing remarkable performance on MWP. However, current LLMs still frequently exhibit logical errors, which highlights their inability to fully grasp the knowledge required for genuine step-by-step mathematical reasoning. To this end, in this paper, we propose a novel Knowledge-guided Solver (KNOS) framework that empowers LLMs to simulate human mathematical reasoning, whose core idea is to Invoke-Verify-Inject necessary knowledge to solve MWP. We draw inspiration from the dual-process theory to construct two cooperative systems: a Knowledge System and an Inference System. Specifically, the Knowledge System employs LLMs as the knowledge base and develops a novel knowledge invoker that can elicit their relevant knowledge to support the strict step-level mathematical reasoning. In the Inference System, we propose a knowledge verifier and a knowledge injector to evaluate the knowledge rationality and further guide the step-wise symbolic deduction in an interpretable manner based on human cognitive mechanism, respectively. Moreover, to tackle the potential scarcity issue of mathematics-specific knowledge in LLMs, we consider an open-book exam scenario and propose an improved version of KNOS called EKNOS. In EKNOS, we meticulously design knowledge selectors to extract the most relevant commonsense and math formulas from external knowledge sources for each reasoning step. This knowledge is utilized to assist the knowledge invoker in better stimulating LLMs’ reasoning abilities. Both KNOS and EKNOS are flexible to empower different LLMs. Our experiments with GPT3, ChatGPT, and GPT4 not only demonstrate their reasoning accuracy improvement but also show how they bring the strict step-wise interpretability of mathematical thinking.

  • The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination

    ArXiv.org · 2025-02-22 · 1 citation

    preprint · Open access

    Hallucination is a persistent challenge in large language models (LLMs), where even with rigorous quality control, models often generate distorted facts. This paradox, in which error generation continues despite high-quality training data, calls for a deeper understanding of the underlying LLM mechanisms. To address it, we propose a novel concept: knowledge overshadowing, where a model's dominant knowledge can obscure less prominent knowledge during text generation, causing the model to fabricate inaccurate details. Building on this idea, we introduce a novel framework to quantify factual hallucinations by modeling knowledge overshadowing. Central to our approach is the log-linear law, which predicts that the rate of factual hallucination increases linearly with the logarithmic scale of (1) Knowledge Popularity, (2) Knowledge Length, and (3) Model Size. The law provides a means to preemptively quantify hallucinations, offering foresight into their occurrence even before model training or inference. Built on the overshadowing effect, we propose a new decoding strategy, CoDa, to mitigate hallucinations, which notably enhances model factuality on Overshadow (27.9%), MemoTrap (13.1%) and NQ-Swap (18.3%). Our findings not only deepen the understanding of the underlying mechanisms behind hallucinations but also provide actionable insights for developing more predictable and controllable language models.

  • Learning to Slice: Self-Supervised Interpretable Hierarchical Representation Learning with Graph Auto-Encoder Tree

    2025-08-01

    article · Open access

    The perceptions and decisions of individuals on social networks are deeply rooted in their intrinsic beliefs, which makes it possible to infer social beliefs from user behavior and message interactions. While existing research models these interactions as graphs and learns their representations, interpretability remains a significant challenge. In real-world scenarios, the interpretation of beliefs is nested within subject scopes of different granularity (such as topics and locations), posing additional challenges for belief discovery. In this paper, we introduce the Interpretable Graph Auto-Encoder Tree (IGAT), a novel end-to-end framework that jointly encodes hierarchical subject scopes and corresponding beliefs as a unified, interpretable hierarchical representation. IGAT integrates the interpretable hierarchy of Model Trees with disentangled representation learning models. We propose a differentiable Slice Mechanism to dynamically optimize internal node splitting and jointly train a leaf model to learn disentangled belief subspaces. The aggregation of these subspaces yields a unified representation, offering interpretations for both subjects and beliefs. Experimental evaluations on three real-world Twitter datasets show that IGAT achieves a consistent improvement of 1.49%-5.61% in F1-score, accuracy, and purity in the belief discovery task, and demonstrate its effectiveness in various downstream analytical applications.

  • Interactive Information Need Prediction with Intent and Context

    arXiv (Cornell University) · 2025-01-05

    preprint · Open access · Senior author

    The ability to predict a user's information need would have wide-ranging implications, from saving time and effort to mitigating vocabulary gaps. We study how to interactively predict a user's information need by letting them select a pre-search context (e.g., a paragraph, sentence, or single word) and specify an optional partial search intent (e.g., "how", "why", "applications", etc.). We examine how various generative language models can explicitly make this prediction by generating a question as well as how retrieval models can implicitly make this prediction by retrieving an answer. We find that this prediction process is possible in many cases and that user-provided partial search intent can help mitigate large pre-search contexts. We conclude that this framework is promising and suitable for real-world applications.

  • Cache-of-Thought: Master-Apprentice Framework for Cost-Effective Vision Language Model Reasoning

    2025-01-01

    article · Open access

    Mingyuan Wu, Jize Jiang, Haozhen Zheng, Meitang Li, Zhaoheng Li, Beitong Tian, Bo Chen, Yongjoo Park, Minjia Zhang, ChengXiang Zhai, Klara Nahrstedt. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.

  • Information Retrieval for Artificial General Intelligence: A New Perspective of Information Retrieval Research

    2025-07-13

    article · 1st author · Corresponding

Labs

  • Siebel School of Computing and Data Science · PI

Education

  • Ph.D., Computer Science

    University of Illinois at Urbana-Champaign

    2003
  • M.S., Computer Science

    University of Illinois at Urbana-Champaign

    1999
  • B.S., Computer Science

    University of Science and Technology of China

    1996

Awards & honors

  • Campus Award for Excellence in Graduate Student Mentoring, U…
  • Rose Award for Teaching Excellence, College of Engineering,…
  • ACM SIGIR Gerard Salton Award (2021)
  • ACM SIGIR Academy Member (2020)
  • Donald Biggar Willett Professor in Engineering (2018)