Hari Sundaram

Professor · Verified

University of Illinois Urbana-Champaign · Computer Science

Active 1997–2026

h-index: 35
Citations: 5.5k
Papers: 247 · 43 in last 5y
Funding: $174k

About

Hari Sundaram is a professor at the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. He received his Ph.D. in Electrical Engineering from Columbia University in 2002, under the supervision of Shih-Fu Chang. He co-founded the School of Arts, Media and Engineering at Arizona State University, where he served as Associate Director from 2009 to 2012. He joined the University of Illinois in 2014 with a joint appointment in the Department of Computer Science and the Charles H. Sandage Department of Advertising.

Sundaram's research spans applied machine learning, network science, and human-computer interaction. He develops algorithms and builds systems that help individuals understand and act, with a focus on constraining system and algorithm design to account for the limits of human cognition. His major contributions in multimedia computing include a feedback-control algorithm for a real-time mixed-reality system aimed at stroke patient rehabilitation. In network science, he has developed algorithms for identifying homogeneous communities within large heterogeneous networks, detecting collective behavior, and efficiently sampling large graphs.

His current research is motivated by large-scale collective-action problems such as public health and climate change, where he works on developing algorithms for robust inference of behavior, mechanisms to improve social welfare, message synthesis, and computational infrastructure for large-scale field experiments.

Research topics

  • Sociology
  • Computer Science
  • Computer Security
  • Business
  • Transport engineering
  • Accounting
  • Advertising
  • Environmental health
  • Gerontology
  • Psychology
  • Marketing
  • Psychiatry
  • Medicine
  • Demography
  • Finance
  • Engineering
  • Internet privacy

Selected publications

  • Masking or Mitigating? Deconstructing the Impact of Query Rewriting on Retriever Biases in RAG

    ArXiv.org · 2026-04-07

    article · Open access · Senior author

    Dense retrievers in retrieval-augmented generation (RAG) systems exhibit systematic biases (including brevity, position, literal matching, and repetition biases) that can compromise retrieval quality. Query rewriting techniques are now standard in RAG pipelines, yet their impact on these biases remains unexplored. We present the first systematic study of how query enhancement techniques affect dense retrieval biases, evaluating five methods across six retrievers. Our findings reveal that simple LLM-based rewriting achieves the strongest aggregate bias reduction (54%), yet fails under adversarial conditions where multiple biases combine. Mechanistic analysis uncovers two distinct mechanisms: simple rewriting reduces bias through increased score variance, while pseudo-document generation methods achieve reduction through genuine decorrelation from bias-inducing features. However, no technique uniformly addresses all biases, and effects vary substantially across retrievers. Our results provide practical guidance for selecting query enhancement strategies based on specific bias vulnerabilities. More broadly, we establish a taxonomy distinguishing query-document interaction biases from document encoding biases, clarifying the limits of query-side interventions for debiasing RAG systems.

  • Living Contracts: Beyond Document-Centric Interaction with Legal Agreements

    Open MIND · 2026-02-01

    preprint

    User interaction with legal contracts has been limited to document reading, which is often complicated by complex, ambiguous legal language. We explore possible futures where contract interfaces go beyond single document interfaces to (1) educate users with legal rights not stated in the contract, (2) transform legal language into alternative representations to aid information tasks before, during, and after signing, and (3) proactively supply contractual information at relevant moments. We refer to these future interfaces collectively as Living Contracts. Using residential leases as a case study, we created three design probes representing different possible Living Contracts. A three-part qualitative study (N=18) revealed participants' barriers to interacting with contracts, including interpreting complex language, uncertainty about legal rights, and the pressure to sign quickly. Participants' feedback on the probes highlighted how Living Contracts have the potential to address these challenges and open new design opportunities for human-contract interactions beyond document reading.

  • AI Psychosis: Does Conversational AI Amplify Delusion-Related Language?

    arXiv (Cornell University) · 2026-03-20

    preprint · Open access

    Conversational AI systems are increasingly used for personal reflection and emotional disclosure, raising concerns about their effects on vulnerable users. Recent anecdotal reports suggest that prolonged interactions with AI may reinforce delusional thinking -- a phenomenon sometimes described as AI Psychosis. However, empirical evidence on this phenomenon remains limited. In this work, we examine how delusion-related language evolves during multi-turn interactions with conversational AI. We construct simulated users (SimUsers) from Reddit users' longitudinal posting histories and generate extended conversations with three model families (GPT, LLaMA, and Qwen). We develop DelusionScore, a linguistic measure that quantifies the intensity of delusion-related language across conversational turns. We find that SimUsers derived from users with prior delusion-related discourse (Treatment) exhibit progressively increasing DelusionScore trajectories, whereas those derived from users without such discourse (Control) remain stable or decline. We further find that this amplification varies across themes, with reality skepticism and compulsive reasoning showing the strongest increases. Finally, conditioning AI responses on current DelusionScore substantially reduces these trajectories. These findings provide empirical evidence that conversational AI interactions can amplify delusion-related language over extended use and highlight the importance of state-aware safety mechanisms for mitigating such risks.

  • Living Contracts: Beyond Document-Centric Interaction with Legal Agreements

    2026-04-13 · 1 citation

    article · Open access

    User interaction with legal contracts has been limited to document reading, which is often complicated by complex, ambiguous legal language. We explore possible futures where contract interfaces go beyond single document interfaces to (1) educate users with legal rights not stated in the contract, (2) transform legal language into alternative representations to aid information tasks before, during, and after signing, and (3) proactively supply contractual information at relevant moments. We refer to these future interfaces collectively as Living Contracts. Using residential leases as a case study, we created three design probes representing different possible Living Contracts. A three-part qualitative study (N=18) revealed participants’ barriers to interacting with contracts, including interpreting complex language, uncertainty about legal rights, and the pressure to sign quickly. Participants’ feedback on the probes highlighted how Living Contracts have the potential to address these challenges and open new design opportunities for human-contract interactions beyond document reading.

  • Detecting Early and Implicit Suicidal Ideation via Longitudinal and Information Environment Signals on Social Media

    2026-05-20

    article · Open access

    On social media, almost 50-60% of individuals experiencing suicidal ideation (SI) do not disclose their distress explicitly [32]. Instead, signs may surface indirectly through everyday posts or peer interactions. Detecting such implicit signals early is critical but remains challenging. We frame early and implicit SI as a forward-looking prediction task and develop a computational framework that models a user’s information environment, consisting of both their longitudinal posting histories as well as the discourse of their socially proximal peers. We adopted a composite network centrality measure to identify top neighbors of a user, and temporally aligned the user’s and neighbors’ interactions—integrating the multi-layered signals in a fine-tuned DeBERTa-v3 model. In a Reddit study of 1,000 (500 Case and 500 Control) users, our approach improves early and implicit SI detection by an average of 10% over all other baselines. These findings highlight that peer interactions offer valuable predictive signals and carry broader implications for designing early detection systems that capture indirect as well as masked expressions of risk in online environments.

  • Supporting Learners' Use of Imperfect Generative Pedagogical Chatbots: The Role of Chatbot Response Uncertainty and Reduced Verbosity

    2026-04-13 · 1 citation

    article · Open access

    Generative chatbots promise to scale personalized learning. Most publicly available generative chatbots are designed to provide confident and eloquent responses by default, even when hallucinating. Prior work has observed that learners using such chatbots often engage shallowly and fail to detect chatbot errors due to overtrust, cognitive overload, and prioritization of short-term gains. To address these challenges, this work examines two chatbot design options in a STEM learning context: introducing verbal uncertainty and reducing response verbosity. Using Bayesian causal inference and thematic analysis in a quasi-experimental setting, we found that a less verbose chatbot improved detection of errors with logical fallacies, but did not increase the use of alternative resources. A chatbot that always expressed uncertainty reduced the adoption of incorrect chatbot responses, but had mixed effects on learning outcomes, suggesting the need to increase signal credibility and maintain learners’ engagement in the learning process despite chatbot disuse.

  • From Plausible to Causal: Counterfactual Semantics for Policy Evaluation in Simulated Online Communities

    arXiv (Cornell University) · 2026-04-05

    preprint · Open access · Senior author

    LLM-based social simulations can generate believable community interactions, enabling "policy wind tunnels" where governance interventions are tested before deployment. But believability is not causality. Claims like "intervention A reduces escalation" require causal semantics that current simulation work typically does not specify. We propose adopting the causal counterfactual framework, distinguishing necessary causation (would the outcome have occurred without the intervention?) from sufficient causation (does the intervention reliably produce the outcome?). This distinction maps onto different stakeholder needs: moderators diagnosing incidents require evidence about necessity, while platform designers choosing policies require evidence about sufficiency. We formalize this mapping, show how simulation design can support estimation under explicit assumptions, and argue that the resulting quantities should be interpreted as simulator-conditional causal estimates whose policy relevance depends on simulator fidelity. Establishing this framework now is essential: it helps define what adequate fidelity means and moves the field from simulations that look realistic toward simulations that can support policy changes.

Labs

  • Siebel School of Computing and Data Science · PI

Education

  • Ph.D., Electrical Engineering

    Columbia University

    2002
  • M.S., Computer Science

    University of Illinois at Urbana-Champaign

    1996
  • B.S., Electrical and Electronics Engineering

    University of Madras

    1992

Awards & honors

  • Eliahu Jury award for best dissertation (2002)
  • IBM faculty awards (2007, 2008)
  • Best-paper awards and best-paper runner-up honors from IEEE…
  • ACM distinguished scientist (2019)
  • IEEE senior member (2019)