Hari Sundaram

Professor · Verified

University of Illinois Urbana-Champaign · Computer Science

Active 1997–2026

h-index: 35

Citations: 5.5k

Papers: 247 (43 in last 5 years)

Funding: $174k

About

Hari Sundaram is a professor at the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. He received his Ph.D. in Electrical Engineering from Columbia University in 2002, under the supervision of Shih-Fu Chang. He co-founded the School of Arts, Media and Engineering at Arizona State University, where he served as Associate Director from 2009 to 2012. He joined the University of Illinois in 2014 with a joint appointment in the Department of Computer Science and the Charles H. Sandage Department of Advertising.

Sundaram's research spans applied machine learning, network science, and human-computer interaction. He develops algorithms and builds systems that help individuals understand and act, with a focus on constraining system and algorithm design to account for the limits of human cognition. His major contributions in multimedia computing include a feedback-control algorithm for a real-time mixed-reality system aimed at stroke patient rehabilitation. In network science, he has developed algorithms for identifying homogeneous communities within large heterogeneous networks, detecting collective behavior, and efficiently sampling large graphs.

His current research is motivated by large-scale collective-action problems such as public health and climate change, where he develops algorithms for robust inference of behavior, mechanisms to improve social welfare, message synthesis, and computational infrastructure for large-scale field experiments.

Research topics

Sociology
Computer Science
Computer Security
Business
Transport engineering
Accounting
Advertising
Environmental health
Gerontology
Psychology
Marketing
Psychiatry
Medicine
Demography
Finance
Engineering
Internet privacy

Selected publications

Masking or Mitigating? Deconstructing the Impact of Query Rewriting on Retriever Biases in RAG
ArXiv.org · 2026-04-07
article · Open access · Senior author
Dense retrievers in retrieval-augmented generation (RAG) systems exhibit systematic biases (including brevity, position, literal matching, and repetition biases) that can compromise retrieval quality. Query rewriting techniques are now standard in RAG pipelines, yet their impact on these biases remains unexplored. We present the first systematic study of how query enhancement techniques affect dense retrieval biases, evaluating five methods across six retrievers. Our findings reveal that simple LLM-based rewriting achieves the strongest aggregate bias reduction (54%), yet fails under adversarial conditions where multiple biases combine. Mechanistic analysis uncovers two distinct mechanisms: simple rewriting reduces bias through increased score variance, while pseudo-document generation methods achieve reduction through genuine decorrelation from bias-inducing features. However, no technique uniformly addresses all biases, and effects vary substantially across retrievers. Our results provide practical guidance for selecting query enhancement strategies based on specific bias vulnerabilities. More broadly, we establish a taxonomy distinguishing query-document interaction biases from document encoding biases, clarifying the limits of query-side interventions for debiasing RAG systems.
Living Contracts: Beyond Document-Centric Interaction with Legal Agreements
Open MIND · 2026-02-01
preprint
User interaction with legal contracts has been limited to document reading, which is often complicated by complex, ambiguous legal language. We explore possible futures where contract interfaces go beyond single document interfaces to (1) educate users with legal rights not stated in the contract, (2) transform legal language into alternative representations to aid information tasks before, during, and after signing, and (3) proactively supply contractual information at relevant moments. We refer to these future interfaces collectively as Living Contracts. Using residential leases as a case study, we created three design probes representing different possible Living Contracts. A three-part qualitative study (N=18) revealed participants' barriers to interacting with contracts, including interpreting complex language, uncertainty about legal rights, and the pressure to sign quickly. Participants' feedback on the probes highlighted how Living Contracts have the potential to address these challenges and open new design opportunities for human-contract interactions beyond document reading.
AI Psychosis: Does Conversational AI Amplify Delusion-Related Language?
arXiv (Cornell University) · 2026-03-20
preprint · Open access
Conversational AI systems are increasingly used for personal reflection and emotional disclosure, raising concerns about their effects on vulnerable users. Recent anecdotal reports suggest that prolonged interactions with AI may reinforce delusional thinking, a phenomenon sometimes described as AI Psychosis. However, empirical evidence on this phenomenon remains limited. In this work, we examine how delusion-related language evolves during multi-turn interactions with conversational AI. We construct simulated users (SimUsers) from Reddit users' longitudinal posting histories and generate extended conversations with three model families (GPT, LLaMA, and Qwen). We develop DelusionScore, a linguistic measure that quantifies the intensity of delusion-related language across conversational turns. We find that SimUsers derived from users with prior delusion-related discourse (Treatment) exhibit progressively increasing DelusionScore trajectories, whereas those derived from users without such discourse (Control) remain stable or decline. We further find that this amplification varies across themes, with reality skepticism and compulsive reasoning showing the strongest increases. Finally, conditioning AI responses on current DelusionScore substantially reduces these trajectories. These findings provide empirical evidence that conversational AI interactions can amplify delusion-related language over extended use and highlight the importance of state-aware safety mechanisms for mitigating such risks.
Detecting Early and Implicit Suicidal Ideation via Longitudinal and Information Environment Signals on Social Media
2026-05-20
article · Open access
On social media, almost 50-60% of individuals experiencing suicidal ideation (SI) do not disclose their distress explicitly [32]. Instead, signs may surface indirectly through everyday posts or peer interactions. Detecting such implicit signals early is critical but remains challenging. We frame early and implicit SI as a forward-looking prediction task and develop a computational framework that models a user’s information environment, consisting of both their longitudinal posting histories as well as the discourse of their socially proximal peers. We adopted a composite network centrality measure to identify top neighbors of a user, and temporally aligned the user’s and neighbors’ interactions—integrating the multi-layered signals in a fine-tuned DeBERTa-v3 model. In a Reddit study of 1,000 (500 Case and 500 Control) users, our approach improves early and implicit SI detection by an average of 10% over all other baselines. These findings highlight that peer interactions offer valuable predictive signals and carry broader implications for designing early detection systems that capture indirect as well as masked expressions of risk in online environments.
Supporting Learners' Use of Imperfect Generative Pedagogical Chatbots: The Role of Chatbot Response Uncertainty and Reduced Verbosity
2026-04-13 · 1 citation
article · Open access
Generative chatbots promise to scale personalized learning. Most publicly available generative chatbots are designed to provide confident and eloquent responses by default, even when hallucinating. Prior work has observed that learners using such chatbots often engage shallowly and fail to detect chatbot errors due to overtrust, cognitive overload, and prioritization of short-term gains. To address these challenges, this work examines two chatbot design options in a STEM learning context: introducing verbal uncertainty and reducing response verbosity. Using Bayesian causal inference and thematic analysis in a quasi-experimental setting, we found that a less verbose chatbot improved detection of errors with logical fallacies, but did not increase the use of alternative resources. A chatbot that always expressed uncertainty reduced the adoption of incorrect chatbot responses, but had mixed effects on learning outcomes, suggesting the need to increase signal credibility and maintain learners’ engagement in the learning process despite chatbot disuse.
From Plausible to Causal: Counterfactual Semantics for Policy Evaluation in Simulated Online Communities
arXiv (Cornell University) · 2026-04-05
preprint · Open access · Senior author
LLM-based social simulations can generate believable community interactions, enabling "policy wind tunnels" where governance interventions are tested before deployment. But believability is not causality. Claims like "intervention A reduces escalation" require causal semantics that current simulation work typically does not specify. We propose adopting the causal counterfactual framework, distinguishing necessary causation (would the outcome have occurred without the intervention?) from sufficient causation (does the intervention reliably produce the outcome?). This distinction maps onto different stakeholder needs: moderators diagnosing incidents require evidence about necessity, while platform designers choosing policies require evidence about sufficiency. We formalize this mapping, show how simulation design can support estimation under explicit assumptions, and argue that the resulting quantities should be interpreted as simulator-conditional causal estimates whose policy relevance depends on simulator fidelity. Establishing this framework now is essential: it helps define what adequate fidelity means and moves the field from simulations that look realistic toward simulations that can support policy changes.

Recent grants

Collaborative Research: Design of Dense RFID Systems for Indexing in the Physical World across Space, Time, and Human Experience
NSF · $174k · 2007–2012

Frequent coauthors

Yu‐Ru Lin
27 shared
Yinpeng Chen
25 shared
Thanassis Rikakis
University of Southern California
23 shared
Aisling Kelliher
University of Southern California
20 shared
Adit Krishnan
18 shared
Lexing Xie
16 shared
Munmun De Choudhury
16 shared
Shih‐Fu Chang
15 shared

Labs

Siebel School of Computing and Data Science · PI

Education

Ph.D., Computer Science
University of Illinois at Urbana-Champaign
2000
M.S., Computer Science
University of Illinois at Urbana-Champaign
1996
B.S., Electrical and Electronics Engineering
University of Madras
1992

Awards & honors

Eliahu Jury award for best dissertation (2002)
IBM faculty awards (2007, 2008)
Best-paper awards and best-paper runner-up honors from IEEE…
ACM distinguished scientist (2019)
IEEE senior member (2019)
