Samuel J. Gershman

Professor of Psychology · Verified

Harvard University · Human Development and Psychology

Active 2009–2026

h-index: 82
Citations: 26.2k
Papers: 493 (245 in the last 5 years)
Funding: $39.6M

About

Samuel J. Gershman is a Professor of Psychology at Harvard University, based in the Department of Psychology and the Northwest Laboratory located at 52 Oxford Street, Cambridge, MA. His research aims to understand how richly structured knowledge about the environment is acquired and how this knowledge aids adaptive behavior. The Gershman lab employs a combination of behavioral, neuroimaging, and computational techniques to explore these questions. Gershman received his B.A. in Neuroscience and Behavior from Columbia University in 2007 and his Ph.D. in Psychology and Neuroscience from Princeton University in 2013. Following his doctoral studies, he was a postdoctoral fellow in the Department of Brain and Cognitive Sciences at MIT from 2013 to 2015. His research interests include learning, memory, decision making, and computational neuroscience.

Research topics

  • Computer Science
  • Psychology
  • Cognitive psychology
  • Neuroscience
  • Machine Learning
  • Artificial Intelligence
  • Cognitive science
  • Biology
  • Economics
  • Evolutionary biology
  • Computational biology
  • Social psychology
  • Engineering
  • Microeconomics
  • Management
  • Mathematics
  • Algorithm
  • Management science
  • Econometrics

Selected publications

  • Trading places: What happens when neuroscience turns into machine learning, and machine learning turns into neuroscience?

    The Transmitter · 2026-01-01

    article · 1st author · Corresponding
  • Artificial intelligence for science: The easy and hard problems

    Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences · 2026-05-14 · 2 citations

    preprint · Open access · Senior author

    A suite of impressive scientific discoveries has been driven by recent advances in artificial intelligence. These almost all result from training flexible algorithms to solve difficult optimization problems specified in advance by teams of domain scientists and engineers with access to large amounts of data. Although extremely useful, this kind of problem solving only corresponds to one part of science: the 'easy problem'. The other part of scientific research is coming up with the problem itself: the 'hard problem'. Solving the hard problem is beyond the capacities of current algorithms for scientific discovery because it requires continual conceptual revision based on poorly defined constraints. We can make progress on understanding how humans solve the hard problem by studying the cognitive science of scientists and then use the results to design new computational agents that automatically infer and update their scientific paradigms. This article is part of the theme issue 'World models in natural and artificial intelligence'.

  • Entorhinal Cortex Signals Dimensions of Past Experience That Can Be Generalized in a Novel Environment

    Journal of Neuroscience · 2026-01-20

    article · Open access

    No two situations are identical. They can be similar in some aspects but different in others. This poses a key challenge when attempting to generalize our experience from one situation to another. How do we distinguish the aspects that transfer across situations from those that do not? One hypothesis is that the entorhinal cortex (EC) meets this challenge by forming factorized representations that allow for increased neural similarity between events that share generalizable features. We tested this hypothesis using functional magnetic resonance imaging. Female and male participants (n = 40) were trained to report behavioral sequences based on an underlying graph structure. Participants then made decisions in a new environment where some but not all graph transitions from the previous structure could be generalized. Behavioral results showed that participants distinguished the generalizable transition information. Accuracy was significantly higher in blocks where sequence transitions were shared across environments than those in which transitions differed. This boost in accuracy was especially pronounced during early exposure to the novel environment. Throughout this early phase, neural patterns in EC showed a corresponding differentiation of the generalizable aspects. Neural patterns representing starting locations in familiar and novel environments were significantly more similar in EC on trials where sequences could be generalized from prior experience, compared to trials with new sequential transitions. This signaling was associated with improved performance when prior sequence knowledge could be reused. Our results suggest that during early exposure to novel environments, EC may signal dimensions of past experience that can be generalized.

  • Humans neglect complexity in predictive model selection

    PsyArXiv (OSF Preprints) · 2026-01-16

    preprint · Open access · 1st author · Corresponding

    People often face competing predictive models, such as different weather forecasts or music recommendation systems. How do they evaluate which model is better? Past research suggests that people follow Occam's razor, balancing fit and simplicity, but little is known about whether the same principle describes how people select predictive models. In a series of experiments, we gave participants choices between predictive models, allowing them to see the underlying data used to fit the models. Participants systematically neglected model complexity relative to the statistically optimal benchmark, often preferring models that overfit the data. While they partially compensated for complexity neglect by changing their decision thresholds, this strategy failed to appropriately account for the fact that simpler, misspecified models frequently outperform more complex models under noise and limited data. These findings challenge the view that simplicity is a general cognitive preference. When it comes to prediction, people appear to prefer a good fit.

  • Humans neglect complexity in predictive model selection

    2026-01-17

    article · Open access · Senior author

  • Training collaborators for effective division of labor

    2025-12-07

    article

    By dividing labor, collaboration brings together the complementary strengths of individuals. Importantly, competence is not static; it develops with training, and hence the optimal division of labor is also dynamic. Choosing the training protocol that achieves this optimum is non-trivial, requiring prospection about the long-term trajectory of each individual's competence. Existing research on collaboration, however, rarely considers this prospective dimension. To address this gap, we studied how humans make training decisions, while manipulating the long-term consequences of those decisions. Across three experiments (N = 600), participants trained two military defense teams to counter two types of attacks (land and air), where the goals and the teams' relative competences varied. Participants made a sequence of training decisions before assigning teams to roles in the final battle. Overall, participants divided labor according to task demands, relative competences, and expectations about how collaborators would develop, and their training decisions supported these anticipated roles. These patterns were best captured by a Planning model that trained collaborators based on the expected outcome at the time of deployment. The Planning model outperformed several heuristic alternatives that optimized training based on current competence, learning potential, fairness, or versatility. Together, these findings provide a first step toward understanding how training unlocks one of the central benefits of dividing labor—complementary competences. People do not simply match existing competences to tasks; rather, they actively cultivate individual competences to support future specialization.

  • Bayesian estimation yields anti-Weber variability

    PNAS Nexus · 2025-08-29 · 2 citations

    article · Open access · Senior author

    A classic result of psychophysics is that human perceptual estimates are more variable for larger magnitudes. This "Weber behavior," however, has typically not been the focus of the prominent Bayesian paradigm. Here, we examine the variability of a Bayesian observer in comparison with human subjects. In two preregistered experiments, we manipulate the prior distribution and the reward function in a numerosity-estimation task. When large numerosities are more frequent or more rewarding, the Bayesian observer exhibits an "anti-Weber behavior," in which larger magnitudes yield less variable responses. Human subjects exhibit a similar pattern, thus breaking a long-standing result of psychophysics. Nevertheless, subjects' responses are best reproduced by a logarithmic encoding of magnitudes, a proposal of Fechner often regarded as accounting for Weber behavior. We thus obtain an anti-Weber behavior together with a Fechner encoding. Our results suggest that the increasing variability may be primarily due to the skewness of natural priors.

  • Uncertainty-driven exploration during planning.

    Decision · 2025-10-01

    article · Open access · Senior author
  • The successor representation in high-risk drinking and alcohol-related contexts

    2025-08-25

    preprint · Open access

    The successor representation (SR) has been suggested to underlie nuanced forms of habitual behavior and a reduced SR variant (redSR) produces addiction-like behavior in simulations. Neither of these strategies can be detected in paradigms assessing habits in humans, which are usually conducted in disorder-irrelevant contexts, and this may explain inconsistent evidence for a goal-directed-to-habitual behavior shift in addiction. We tested whether individuals with high-risk drinking behavior rely more on (red)SR, particularly in alcohol-related contexts. Findings suggest that a (reduced) random-policy SR-like strategy contributes to human behavior, but that high-risk drinkers do not differ from low-risk drinkers in their use of this strategy. Instead, both groups rely less on (reduced) random-policy SR and more on model-free control in alcohol-related contexts. Results suggest that (reduced) random-policy SR supports adaptive, resource-efficient behavior and is selectively downregulated in substance-related contexts, highlighting the importance of contextual modulation in understanding decision strategies in mental health.

  • The successor representation in high-risk drinking and alcohol-related contexts

    2025-08-10

    preprint · Open access

Education

  • Ph.D., Psychology and Neuroscience

    Princeton University

    2013
  • B.A., Neuroscience and Behavior

    Columbia University

    2007