ChengXiang Zhai
Donald Biggar Willett Professor in Engineering · University of Illinois Urbana-Champaign · Computer Science
Active 1990–2026
About
ChengXiang Zhai is the Donald Biggar Willett Professor in Engineering at the University of Illinois Urbana-Champaign, affiliated with the Siebel School of Computing and Data Science. His research areas include Artificial Intelligence, Bioinformatics and Computational Biology, Computers and Education, and Data and Information Systems. He has received numerous awards for his research and teaching, including the ACM SIGIR Gerard Salton Award in 2021, the ACM SIGIR Academy Membership in 2020, and the Presidential Early Career Award for Scientists and Engineers in 2004. Zhai has also been recognized for excellence in graduate student mentoring and undergraduate advising, and has received multiple teaching awards. His professional contributions are distinguished by his leadership in research and education within the field of computing and data science.
Research topics
- Computer Science
- Artificial Intelligence
- Machine Learning
- Engineering
- Data Mining
- Natural Language Processing
- Computational biology
- Biology
- Epistemology
- Philosophy
- Biochemical engineering
- Linguistics
- Data science
Selected publications
Globally Optimal Training of Spiking Neural Networks via Parameter Reconstruction
ArXiv.org · 2026-05-08
article · Open access · Senior author
Spiking Neural Networks (SNNs) have been proposed as biologically plausible and energy-efficient alternatives to conventional Artificial Neural Networks (ANNs). However, the training of SNNs usually relies on surrogate gradients due to the non-differentiability of the spike function, introducing approximation errors that accumulate across layers. To address this challenge, we extend the work on convexification of parallel feedforward threshold networks to parallel recurrent threshold networks, which subsume parallel SNNs as a structured special case. Building on this theoretical framework, we propose a parameter reconstruction algorithm for SNN training that demonstrates consistent and significant advantages across various tasks, both as a standalone method and in combination with surrogate-gradient training. The ablations further demonstrate the data scalability and robustness to model configurations of our training algorithm, pointing toward its potential in large-scale SNN training.
SimLab: A Platform for Simulation-based Evaluation of Conversational Information Access Systems
ArXiv.org · 2025-07-07
preprint · Open access · Senior author
Progress in conversational information access (CIA) systems has been hindered by the difficulty of evaluating such systems with reproducible experiments. While user simulation offers a promising solution, the lack of infrastructure and tooling to support this evaluation paradigm remains a significant barrier. To address this gap, we introduce SimLab, the first cloud-based platform providing a centralized solution for the community to benchmark both conversational systems and user simulators in a controlled and reproducible setting. We articulate the requirements for such a platform and propose a general infrastructure to meet them. We then present the design and implementation of an initial version of SimLab and showcase its features through an initial simulation-based evaluation task in conversational movie recommendation. Furthermore, we discuss the platform's sustainability and future opportunities for development, inviting the community to drive further progress in the fields of CIA and user simulation.
InstInfo: A Just-in-Time Literature Recommendation System for Presentations
2025-07-13
article · Open access · Senior author
The efficient discovery of academic literature is critical for research progress, yet many researchers have difficulties in finding literature. This work proposes InstInfo: a novel just-in-time literature recommendation system for presentations. InstInfo transcribes audio in real-time and recommends literature according to the ideas being discussed, thereby helping researchers ground presentations in academic literature while saving them the time of having to manually search. Informal usability studies show that InstInfo is easy to use and that researchers find value in the recommendations. InstInfo can be accessed at https://instinfo.com.
Knowledge-Centered Dual-Process Reasoning for Math Word Problems With Large Language Models
IEEE Transactions on Knowledge and Data Engineering · 2025-04-01 · 6 citations
article
Math word problem (MWP) serves as a critical milestone for assessing the text mining ability and knowledge mastery level of models. Recent advancements have witnessed large language models (LLMs) showcasing remarkable performance on MWP. However, current LLMs still frequently exhibit logical errors, which highlights their inability to fully grasp the knowledge required for genuine step-by-step mathematical reasoning. To this end, in this paper, we propose a novel Knowledge-guided Solver (KNOS) framework that empowers LLMs to simulate human mathematical reasoning, whose core idea is to Invoke-Verify-Inject necessary knowledge to solve MWP. We draw inspiration from the dual-process theory to construct two cooperative systems: a Knowledge System and an Inference System. Specifically, the Knowledge System employs LLMs as the knowledge base and develops a novel knowledge invoker that can elicit their relevant knowledge to support the strict step-level mathematical reasoning. In the Inference System, we propose a knowledge verifier and a knowledge injector to evaluate the knowledge rationality and further guide the step-wise symbolic deduction in an interpretable manner based on human cognitive mechanism, respectively. Moreover, to tackle the potential scarcity issue of mathematics-specific knowledge in LLMs, we consider an open-book exam scenario and propose an improved version of KNOS called EKNOS. In EKNOS, we meticulously design knowledge selectors to extract the most relevant commonsense and math formulas from external knowledge sources for each reasoning step. This knowledge is utilized to assist the knowledge invoker in better stimulating LLMs’ reasoning abilities. Both KNOS and EKNOS are flexible to empower different LLMs. Our experiments with GPT3, ChatGPT, and GPT4 not only demonstrate their reasoning accuracy improvement but also show how they bring the strict step-wise interpretability of mathematical thinking.
ArXiv.org · 2025-02-22 · 1 citation
preprint · Open access
Hallucination is a persistent challenge in large language models (LLMs), where even with rigorous quality control, models often generate distorted facts. This paradox, in which error generation continues despite high-quality training data, calls for a deeper understanding of the underlying LLM mechanisms. To address it, we propose a novel concept: knowledge overshadowing, where a model's dominant knowledge can obscure less prominent knowledge during text generation, causing the model to fabricate inaccurate details. Building on this idea, we introduce a novel framework to quantify factual hallucinations by modeling knowledge overshadowing. Central to our approach is the log-linear law, which predicts that the rate of factual hallucination increases linearly with the logarithmic scale of (1) Knowledge Popularity, (2) Knowledge Length, and (3) Model Size. The law provides a means to preemptively quantify hallucinations, offering foresight into their occurrence even before model training or inference. Building on the overshadowing effect, we propose a new decoding strategy, CoDa, to mitigate hallucinations, which notably enhances model factuality on Overshadow (27.9%), MemoTrap (13.1%) and NQ-Swap (18.3%). Our findings not only deepen understanding of the underlying mechanisms behind hallucinations but also provide actionable insights for developing more predictable and controllable language models.
2025-08-01
article · Open access
The perceptions and decisions of individuals on social networks are deeply rooted in their intrinsic beliefs, which makes it possible to infer social beliefs from user behavior and message interactions. While existing research models these interactions as graphs and learns their representations, interpretability remains a significant challenge. In real-world scenarios, the interpretation of beliefs is nested within subject scopes of different granularity (such as topics and locations), posing additional challenges for belief discovery. In this paper, we introduce the Interpretable Graph Auto-Encoder Tree (IGAT), a novel end-to-end framework that jointly encodes hierarchical subject scopes and corresponding beliefs as a unified, interpretable hierarchical representation. IGAT integrates the interpretable hierarchy of Model Trees with disentangled representation learning models. We propose a differentiable Slice Mechanism to dynamically optimize internal node splitting and jointly train a leaf model to learn disentangled belief subspaces. The aggregation of these subspaces yields a unified representation, offering interpretations for both subjects and beliefs. Experimental evaluations on three real-world Twitter datasets show that IGAT achieves a consistent improvement of 1.49%-5.61% in F1-score, accuracy, and purity in the belief discovery task, and demonstrate its effectiveness in various downstream analytical applications.
Interactive Information Need Prediction with Intent and Context
arXiv (Cornell University) · 2025-01-05
preprint · Open access · Senior author
The ability to predict a user's information need would have wide-ranging implications, from saving time and effort to mitigating vocabulary gaps. We study how to interactively predict a user's information need by letting them select a pre-search context (e.g., a paragraph, sentence, or single word) and specify an optional partial search intent (e.g., "how", "why", "applications", etc.). We examine how various generative language models can explicitly make this prediction by generating a question as well as how retrieval models can implicitly make this prediction by retrieving an answer. We find that this prediction process is possible in many cases and that user-provided partial search intent can help mitigate large pre-search contexts. We conclude that this framework is promising and suitable for real-world applications.
Cache-of-Thought: Master-Apprentice Framework for Cost-Effective Vision Language Model Reasoning
2025-01-01
article · Open access
Mingyuan Wu, Jize Jiang, Haozhen Zheng, Meitang Li, Zhaoheng Li, Beitong Tian, Bo Chen, Yongjoo Park, Minjia Zhang, ChengXiang Zhai, Klara Nahrstedt. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
2025-07-13
article · 1st author · Corresponding author
Recent grants
III-COR: QueryClinic: Improve Search Accuracy for Difficult Queries
NSF · $397k · 2007–2011
NSF · $500k · 2010–2015
CAREER: User-centered Adaptive Information Retrieval
NSF · $522k · 2004–2010
NSF · $300k · 2018–2023
RI: Multi-Faceted Comparative Text Summarization
NSF · $200k · 2007–2010
Frequent coauthors
- 26 shared
Jiawei Han
University of Illinois Urbana-Champaign
- 26 shared
Hui Fang
First Affiliated Hospital of Jiangxi Medical College
- 25 shared
Sean Massung
- 24 shared
Qiaozhu Mei
- 22 shared
Heng Ji
- 18 shared
Shanfeng Zhu
Fudan University
- 18 shared
Shengwen Peng
Fudan University
- 18 shared
Hiroshi Mamitsuka
Kyoto University
Labs
Siebel School of Computing and Data Science · PI
Education
- 2003
Ph.D., Computer Science
University of Illinois at Urbana-Champaign
- 1999
M.S., Computer Science
University of Illinois at Urbana-Champaign
- 1996
B.S., Computer Science
University of Science and Technology of China
Awards & honors
- Campus Award for Excellence in Graduate Student Mentoring, U…
- Rose Award for Teaching Excellence, College of Engineering,…
- ACM SIGIR Gerard Salton Award (2021)
- ACM SIGIR Academy Member (2020)
- Donald Biggar Willett Professor in Engineering (2018)