Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Abdullah Almaatouq

Douglas Drane Career Development Associate Professor in Information Technology and Management · Verified

Massachusetts Institute of Technology · Information Technology

Active 2013–2026

h-index: 13
Citations: 937
Papers: 56 (37 in last 5y)

About

Abdullah Almaatouq is the Douglas Drane Career Development Associate Professor in Information Technology and Management at the MIT Sloan School of Management. He is a computational social scientist whose research focuses on improving cooperation, coordination, and collective intelligence in decision-making systems such as teams, committees, crowds, markets, and elections. Abdullah explores ways to advance social and behavioral research methodology through innovative research designs and theory-building strategies, with the goal of developing a deeper understanding of collective decision systems and how to design them effectively in various contexts. He is affiliated with the MIT Center for Computational Engineering, the MIT Center for Collective Intelligence, and the MIT Connection Science Research Initiative. Abdullah holds a PhD in computational science and engineering, along with dual master's degrees in media arts and sciences (MIT Media Lab) and computational science and engineering from MIT. Prior to joining MIT, he earned his undergraduate degree from Southampton University in the United Kingdom.

Research signals

Five dimensions sourced from public faculty / publication signals.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Psychology
  • Social psychology
  • Statistics
  • Mathematics
  • Engineering
  • Medicine
  • Theoretical computer science
  • Cognitive psychology
  • Communication
  • Algorithm
  • Geography

Selected publications

  • Integrative experiments identify how punishment affects welfare in public goods games

    Science · 2026-04-09 · 1 citation

    article · Senior author · Corresponding

    Despite decades of research, the conditions under which punishment promotes cooperation remain unclear. Through an integrative experiment varying 14 design parameters of public goods games across 360 experimental conditions (147,618 decisions from 7100 participants), we reveal substantial heterogeneity in punishment effectiveness: Its impact on welfare ranges from 43% improvement to 44% reduction depending on the game parameters. To characterize these patterns, we developed models that outperformed human forecasters in predicting punishment effectiveness in new experiments. Communication emerges as the most important factor, followed by contribution framing (opt out versus opt in), contribution type (variable versus all-or-nothing), game length, and outcome visibility, though these factors often interact. The results reframe the debate from whether punishment works to when it does, demonstrating how integrative experiments enable discovery of generalizable patterns in social phenomena.

  • Post-training makes large language models less human-like

    arXiv (Cornell University) · 2026-05-08

    preprint · Open access

    Large language models (LLMs) are increasingly used as surrogates for human participants, but it remains unclear which models best capture human behavior and why. To address this, we introduce Psych-201, a novel dataset that enables us to measure behavioral alignment at scale. We find that post-training -- the stage that turns base models into useful assistants -- consistently reduces alignment with human behavior across model families, sizes, and objectives. Moreover, this misalignment widens in newer model generations even as base models continue to improve. Finally, we find that persona-induction -- a popular technique for eliciting human-like behavior by conditioning models on participant-specific information -- does not improve predictions at the level of individuals. Taken together, our results suggest that the very processes that are currently employed to turn LLMs into useful assistants also make them less accurate models of human behavior.

  • Evaluating Human-AI Safety: A Framework for Measuring Harmful Capability Uplift

    arXiv (Cornell University) · 2026-03-06

    preprint · Open access

    Current frontier AI safety evaluations emphasize static benchmarks, third-party annotations, and red-teaming. In this position paper, we argue that AI safety research should focus on human-centered evaluations that measure harmful capability uplift: the marginal increase in a user's ability to cause harm with a frontier model beyond what conventional tools already enable. We frame harmful capability uplift as a core AI safety metric, ground it in prior social science research, and provide concrete methodological guidance for systematic measurement. We conclude with actionable steps for developers, researchers, funders, and regulators to make harmful capability uplift evaluation a standard practice.

  • The Integration of Explanation and Prediction in Behavioral Science

    Current Directions in Psychological Science · 2026-04-10

    article · 1st author · Corresponding

    Behavioral scientists aim to explain and predict behavior. In principle, these goals align; in practice, common approaches to pursuing them have become distinct traditions in tension with one another. The explanatory tradition often examines causal factors in isolation, establishing that they have some effect but not how much or how they combine. The predictive tradition learns how factors combine, but these patterns may not reflect a causal structure or hold when conditions change. Answering how much each factor matters, and how they combine across settings, requires both predictive accuracy and causal interpretation. This article examines three developments toward this integration: evaluation frameworks that emphasize generalization, systematic experimentation and flexible models, and interpretation tools. We present recent empirical examples that demonstrate how this integration enables the discovery of generalizable patterns and provides a path toward cumulative behavioral science.

  • The Task Space: An Integrative Framework for Team Research

    2025-12-01

    preprint · Open access · Senior author

    Research on teams spans many contexts, but integrating knowledge from heterogeneous sources is challenging because studies typically examine different tasks that cannot be directly compared. Most investigations involve teams working on just one or a handful of tasks, and researchers lack principled ways to quantify how similar or different these tasks are from one another. We address this challenge by introducing the “Task Space,” a multidimensional space in which tasks—and the distances between them—can be represented formally, and use it to create a “Task Map” of 102 crowd-annotated tasks from the published experimental literature. We then demonstrate the Task Space’s utility by performing an integrative experiment that addresses a fundamental question in team research: when do interacting groups outperform individuals? Our experiment samples 20 diverse tasks from the Task Map at three complexity levels and recruits 1,231 participants to work either individually or in groups of three or six (180 experimental conditions). We find striking heterogeneity in group advantage, with groups performing anywhere from three times worse to 60% better than the best individual working alone, depending on the task context. Critically, the Task Space makes this heterogeneity predictable: it significantly outperforms traditional typologies in predicting group advantage on unseen tasks. Our models also reveal theoretically meaningful interactions between task features; for example, group advantage on creative tasks depends on whether the answers are objectively verifiable. We conclude by arguing that the Task Space enables researchers to integrate findings across different experiments, thereby building cumulative knowledge about team performance.

  • Integrative Experiments Identify How Punishment Affects Welfare in Public Goods Games

    Open MIND · 2025-01-01

    other · Senior author

    Reproducibility package for "Integrative Experiments Identify How Punishment Affects Welfare in Public Goods Games" (https://www.science.org/doi/10.1126/science.aeb5280)

  • The Task Space: An Integrative Framework for Team Research

    PsyArXiv (OSF Preprints) · 2025-10-14

    other · Open access

  • Studying collective intelligence in the lab

    Edward Elgar Publishing Limited eBooks · 2025-12-11

    book-chapter · 1st author · Corresponding

Education

  • Master of Science, Media Lab

    Massachusetts Institute of Technology

  • Master of Science, Center for Computational Engineering

    Massachusetts Institute of Technology

  • PhD, Computational Science & Engineering, Computational Engineering

    Massachusetts Institute of Technology

    2019

  • Bachelor of Science, Electronics and Computer Science

    University of Southampton

    2012