Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Emilio Ferrara

Emilio Ferrara

· Professor of Computer Science and CommunicationVerified

University of Southern California · Thomas Lord Department of Computer Science

Active 1981–2026

h-index71
Citations24.0k
Papers477245 last 5y
Funding
See your match with Emilio Ferrara — sign in to PhdFit.Sign in

About

Emilio Ferrara is a Professor of Computer Science at the Thomas Lord Department of Computer Science (USC Viterbi), Communication (USC Annenberg), and Preventive Medicine (Keck School of Medicine of USC). He’s also a Principal Scientist at the University of Southern California’s Information Sciences Institute, and Principal Investigator of the Machine Intelligence and Data Science (MINDS) group. The HUMANS LAB’s research core is Humans & Machines + Networks & Social Systems.

Research topics

  • Sociology
  • Political Science
  • Art
  • Law
  • Art history

Selected publications

  • Stop Drawing Scientific Claims from LLM Social Simulations Without Robustness Audits

    ArXiv.org · 2026-05-17

    articleOpen accessSenior author

    The scientific claims drawn from LLM social simulations should be no stronger than the robustness audits that support them. Generative agents bring new expressive power to agent-based modeling, enabling simulations of collective social processes like cooperation, polarization, and norm formation. Yet they also introduce complexity through additional architectural choices, such as agent specification, memory representation, interaction protocols, and environment design. Small perturbations that appear minor to researchers can cascade into macro-level outcomes through repeated interaction, creating a "butterfly effect." Consequently, scientific claims drawn from LLM social simulations may reflect implementation artifacts rather than the social mechanisms being modeled. We support this position with two case studies: a repeated Prisoner's Dilemma and a social media echo chamber simulation. Across multiple models, minor perturbations in persona format and game-instruction framing shift cooperation rates by up to 76 percentage points, while network homophily and hub assignment produce significant and consistent shifts in polarization metrics. We also find that sensitivity is unevenly distributed across both architectural choices and model families: the same perturbation that produces the 76 pp shift in one frontier model only shifts another by 1 pp. Robustness is therefore a property that should be measured per claim and per model, not assumed. To address this validation gap, we introduce TRAILS (Taxonomy for Robustness Audits In LLM Simulations), a robustness-audit taxonomy spanning three levels of simulation design: agent (micro-level), interaction (meso-level), and system (macro-level). We call for robustness to become a first-order validation requirement before LLM social simulations are used to explain mechanisms, evaluate interventions, or inform decisions.

  • Change is Hard: Consistent Player Behavior Across Games with Conflicting Incentives

    ArXiv.org · 2026-03-17

    articleOpen accessSenior author

    This paper examines how player flexibility -- a player's willingness to engage in a breadth of options or specialize -- manifests across two gaming environments: League of Legends (League) and Teamfight Tactics (TFT). We analyze the gameplay decisions of 4,830 players who have played at least 50 competitive games in both titles and explore cross-game dynamics of behavior retention and consistency. Our work introduces a novel cross-game analysis that tracks the same players' behavior across two different environments, reducing self-selection bias. Our findings reveal that while games incentivize different behaviors (specialization in League versus flexibility in TFT) for performance-based success, players exhibit consistent behavior across platforms. This study contributes to long-standing debate about agency versus structure, showing individual agency may be more predictive of cross-platform behavior than game-imposed structure in competitive settings. These insights offer implications for game developers, designers and researchers interested in building systems to promote behavior change.

  • Change is Hard: Consistent Player Behavior Across Games with Conflicting Incentives

    2026-04-13 · 1 citations

    articleOpen accessSenior author

    This paper examines how player flexibility – a player’s willingness to engage in a breadth of options or specialize – manifests across two gaming environments: League of Legends (League) and Teamfight Tactics (TFT). We analyze the gameplay decisions of 4,830 players who have played at least 50 competitive games in both titles and explore cross-game dynamics of behavior retention and consistency. Our work introduces a novel cross-game analysis that tracks the same players’ behavior across two different environments, reducing self-selection bias. Our findings reveal that while games incentivize different behaviors (specialization in League versus flexibility in TFT) for performance-based success, players exhibit consistent behavior across platforms. This study contributes to long-standing debate about agency versus structure, showing individual agency may be more predictive of cross-platform behavior than game-imposed structure in competitive settings. These insights offer implications for game developers, designers and researchers interested in building systems to promote behavior change.

  • Measuring Human Preferences in RLHF is a Social Science Problem

    arXiv (Cornell University) · 2026-01-31

    articleOpen accessSenior author

    RLHF assumes that annotation responses reflect genuine human preferences. We argue this assumption warrants systematic examination, and that behavioral science offers frameworks that bring clarity to when it holds and when it breaks down. Behavioral scientists have documented for sixty years that people routinely produce responses without holding genuine opinions, construct preferences on the spot based on contextual cues, and interpret identical questions differently. These phenomena are pervasive for precisely the value-laden judgments that matter most for alignment, yet this literature has not yet been systematically integrated into ML practice. We argue that the ML community must treat measurement validity as logically prior to preference aggregation. Specifically, we contend that measuring human preferences in RLHF is a social science problem. We present a taxonomy distinguishing genuine preferences from non-attitudes, constructed preferences, and measurement artifacts, along with diagnostic approaches for detecting each. This framework has two important implications. First, it raises the question of whether current RLHF practice may be systematically modeling noise as signal and elicitation artifacts as human values. Second, it provides a path forward by suggesting diagnostic tools that can distinguish valid preferences from artifacts before they enter the training pipeline.

  • Lost Before Translation: Social Information Transmission and Survival in AI-AI Communication

    Open MIND · 2026-01-01

    articleSenior author

    When AI systems summarize and relay information, they inevitably transform it. But how? We introduce an experimental paradigm based on the telephone game to study what happens when AI talks to AI. Across five studies tracking content through AI transmission chains, we find three consistent patterns. The first is convergence, where texts differing in certainty, emotional intensity, and perspectival balance collapse toward a shared default of moderate confidence, muted affect, and analytical structure. The second is selective survival, where narrative anchors persist while the texture of evidence, hedges, quotes, and attributions is stripped away. The third is competitive filtering, where strong arguments survive while weaker but valid considerations disappear when multiple viewpoints coexist. In downstream experiments, human participants rated AI-transmitted content as more credible and polished. Importantly, however, humans also showed degraded factual recall, reduced perception of balance, and diminished emotional resonance. We show that the properties that make AI-mediated content appear authoritative may systematically erode the cognitive and affective diversity on which informed judgment depends.

  • Stop Drawing Scientific Claims from LLM Social Simulations Without Robustness Audits

    arXiv (Cornell University) · 2026-05-17

    preprintOpen accessSenior author

    The scientific claims drawn from LLM social simulations should be no stronger than the robustness audits that support them. Generative agents bring new expressive power to agent-based modeling, enabling simulations of collective social processes like cooperation, polarization, and norm formation. Yet they also introduce complexity through additional architectural choices, such as agent specification, memory representation, interaction protocols, and environment design. Small perturbations that appear minor to researchers can cascade into macro-level outcomes through repeated interaction, creating a "butterfly effect." Consequently, scientific claims drawn from LLM social simulations may reflect implementation artifacts rather than the social mechanisms being modeled. We support this position with two case studies: a repeated Prisoner's Dilemma and a social media echo chamber simulation. Across multiple models, minor perturbations in persona format and game-instruction framing shift cooperation rates by up to 76 percentage points, while network homophily and hub assignment produce significant and consistent shifts in polarization metrics. We also find that sensitivity is unevenly distributed across both architectural choices and model families: the same perturbation that produces the 76 pp shift in one frontier model only shifts another by 1 pp. Robustness is therefore a property that should be measured per claim and per model, not assumed. To address this validation gap, we introduce TRAILS (Taxonomy for Robustness Audits In LLM Simulations), a robustness-audit taxonomy spanning three levels of simulation design: agent (micro-level), interaction (meso-level), and system (macro-level). We call for robustness to become a first-order validation requirement before LLM social simulations are used to explain mechanisms, evaluate interventions, or inform decisions.

  • Change is Hard: Consistent Player Behavior Across Games with Conflicting Incentives

    arXiv (Cornell University) · 2026-03-17

    preprintOpen accessSenior author

    This paper examines how player flexibility -- a player's willingness to engage in a breadth of options or specialize -- manifests across two gaming environments: League of Legends (League) and Teamfight Tactics (TFT). We analyze the gameplay decisions of 4,830 players who have played at least 50 competitive games in both titles and explore cross-game dynamics of behavior retention and consistency. Our work introduces a novel cross-game analysis that tracks the same players' behavior across two different environments, reducing self-selection bias. Our findings reveal that while games incentivize different behaviors (specialization in League versus flexibility in TFT) for performance-based success, players exhibit consistent behavior across platforms. This study contributes to long-standing debate about agency versus structure, showing individual agency may be more predictive of cross-platform behavior than game-imposed structure in competitive settings. These insights offer implications for game developers, designers and researchers interested in building systems to promote behavior change.

  • Measuring Human Preferences in RLHF is a Social Science Problem

    arXiv (Cornell University) · 2026-01-31

    preprintOpen accessSenior author

    RLHF assumes that annotation responses reflect genuine human preferences. We argue this assumption warrants systematic examination, and that behavioral science offers frameworks that bring clarity to when it holds and when it breaks down. Behavioral scientists have documented for sixty years that people routinely produce responses without holding genuine opinions, construct preferences on the spot based on contextual cues, and interpret identical questions differently. These phenomena are pervasive for precisely the value-laden judgments that matter most for alignment, yet this literature has not yet been systematically integrated into ML practice. We argue that the ML community must treat measurement validity as logically prior to preference aggregation. Specifically, we contend that measuring human preferences in RLHF is a social science problem. We present a taxonomy distinguishing genuine preferences from non-attitudes, constructed preferences, and measurement artifacts, along with diagnostic approaches for detecting each. This framework has two important implications. First, it raises the question of whether current RLHF practice may be systematically modeling noise as signal and elicitation artifacts as human values. Second, it provides a path forward by suggesting diagnostic tools that can distinguish valid preferences from artifacts before they enter the training pipeline.

  • No Free Lunch in Language Model Bias Mitigation? Targeted Bias Reduction Can Exacerbate Unmitigated LLM Biases

    AI · 2026-01-13

    articleOpen accessSenior authorCorresponding

    Large Language Models (LLMs) inherit societal biases from their training data, potentially leading to harmful outputs. While various techniques aim to mitigate these biases, their effects are typically evaluated only along the targeted dimension, leaving cross-dimensional consequences unexplored. This work provides the first systematic quantification of cross-category spillover effects in LLM bias mitigation. We evaluate four bias mitigation techniques (Logit Steering, Activation Patching, BiasEdit, Prompt Debiasing) across ten models from seven families, measuring impact on racial, religious, profession-, and gender-related biases using the StereoSet benchmark. Across 160 experiments yielding 640 evaluations, we find that targeted interventions cause collateral degradations to model coherence and performance along debiasing objectives in 31.5% of untargeted dimension evaluations. These findings provide empirical evidence that debiasing improvements along one dimension can come at the cost of degradation in others. We introduce a multi-dimensional auditing framework and demonstrate that single-target evaluations mask potentially severe spillover effects, underscoring the need for robust, multi-dimensional evaluation tools when examining and developing bias mitigation strategies to avoid inadvertently shifting or worsening bias along untargeted axes.

  • Emergent Coordinated Behaviors in Networked LLM Agents: Modeling the Strategic Dynamics of Information Operations

    2026-04-09 · 1 citations

    articleOpen access

    Generative agents are rapidly advancing in sophistication, raising urgent questions about how they might coordinate when deployed in online ecosystems. This is particularly consequential in information operations (IOs), influence campaigns that aim to manipulate public opinion on social media. While traditional IOs have been orchestrated by human operators and relied on manually crafted tactics, agentic AI promises to make campaigns more automated, adaptive, and difficult to detect. This work presents the first systematic study of emergent coordination among generative agents in simulated IO campaigns. Using generative agent-based modeling, we instantiate IO and organic agents in a simulated environment and evaluate coordination across operational regimes, from simple goal alignment to team knowledge and collective decision-making. As operational regimes become more structured, IO networks become denser and more clustered, interactions more reciprocal and positive, narratives more homogeneous, amplification more synchronized, and hashtag adoption faster and more sustained. Remarkably, simply revealing to agents which other agents share their goals can produce coordination levels nearly equivalent to those achieved through explicit deliberation and collective voting. Overall, we show that generative agents, even without human guidance, can reproduce coordination strategies characteristic of real-world IOs, underscoring the societal risks posed by increasingly automated, self-organizing IOs.

Frequent coauthors

  • Kristina Lerman

    1493 shared
  • Emily Chen

    1418 shared
  • Luca Luceri

    76 shared
  • Alessandro Flammini

    46 shared
  • Filippo Menczer

    Indiana University

    42 shared
  • Julie Jiang

    University of Southern California

    33 shared
  • Anna Sapienza

    University of Copenhagen

    33 shared
  • Goran Murić

    31 shared

Labs

Education

  • Ph.D., Computer Science

    University of Southern California

  • M.S., Computer Science

    University of Southern California

  • B.S., Computer Science

    University of Southern California

Awards & honors

  • 2015 IBM Watson Big Data Influencer
  • 2016 Complex Systems Society Junior Scientific Award
  • 2016 DARPA Young Faculty Award
  • 2018 DARPA Director's Fellowship
  • 2019 Viterbi Research Award
  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Emilio Ferrara

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup