Harvey Lederman

Professor, Philosophy · Verified

University of Texas at Austin · Philosophy

Active 2006–2026

h-index: 11
Citations: 391
Papers: 34 (16 in the last 5 years)

About

Harvey Lederman is a professor of philosophy at the University of Texas at Austin. Prior to his current position, he served as an assistant professor and then a professor of philosophy at Princeton University until 2023. Beginning in July 2026, he will join New York University. His research encompasses a broad range of interests in contemporary philosophy as well as the history of philosophy, with a particular focus on Chinese neo-Confucianism. Recently, Lederman has concentrated on conceptual and empirical questions related to AI mentality and the implications of artificial intelligence for the meaning of human life. He is co-principal investigator of the AI and Human Objectives Initiative and is affiliated with the Population and Wellbeing Initiative and the School of Civic Leadership. His scholarly contributions have been recognized with awards such as the 2023 Sanders Prize in Epistemology for his paper "Of marbles and matchsticks," which addresses incomplete preferences in decision theory, and the Dao Best Essay Award in 2022 for his work on Wang Yangming titled "What is the 'Unity' in the 'Unity of Knowledge and Action'?"

Research topics

  • Epistemology
  • Philosophy
  • Computer Science
  • Linguistics
  • Programming language
  • Artificial Intelligence
  • Natural Language Processing
  • Theology

Selected publications

  • Emergent Introspection in AI is Content-Agnostic

    Open MIND · 2026-03-05

    preprint · 1st author · Corresponding

    Introspection is a foundational cognitive ability, but its mechanism is not well understood. Recent work has shown that AI models can introspect. We study the mechanism of this introspection. We first extensively replicate Lindsey (2025)'s thought injection detection paradigm in large open-source models. We show that introspection in these models is content-agnostic: models can detect that an anomaly occurred even when they cannot reliably identify its content. The models confabulate injected concepts that are high-frequency and concrete (e.g., "apple"). They also require fewer tokens to detect an injection than to guess the correct concept (with wrong guesses coming earlier). We argue that a content-agnostic introspective mechanism is consistent with leading theories in philosophy and psychology.

  • Dissociating Direct Access from Inference in AI Introspection

    arXiv (Cornell University) · 2026-03-05

    article · Open access · 1st author · Corresponding

    Introspection is a foundational cognitive ability, but its mechanism is not well understood. Recent work has shown that AI models can introspect. We study their mechanism of introspection, first extensively replicating Lindsey et al. (2025)'s thought injection detection paradigm in large open-source models. We show that these models detect injected representations via two separable mechanisms: (i) probability-matching (inferring from perceived anomaly of the prompt) and (ii) direct access to internal states. The direct access mechanism is content-agnostic: models detect that an anomaly occurred but cannot reliably identify its semantic content. The two model classes we study confabulate injected concepts that are high-frequency and concrete (e.g., "apple"); for them, correct concept guesses typically require significantly more tokens. This content-agnostic introspective mechanism is consistent with leading theories in philosophy and psychology.

  • Privileged Self-Access Matters for Introspection in AI

    ArXiv.org · 2025-08-20

    preprint · Open access

    Whether AI models can introspect is an increasingly important practical question. But there is no consensus on how introspection is to be defined. Beginning from a recently proposed "lightweight" definition, we argue instead for a thicker one. According to our proposal, introspection in AI is any process which yields information about internal states through a process more reliable than one with equal or lower computational cost available to a third party. Using experiments where LLMs reason about their internal temperature parameters, we show they can appear to have lightweight introspection while failing to meaningfully introspect per our proposed definition.

  • A Dominance Argument Against Incompleteness

    The Philosophical Review · 2025-10-01 · 1 citation

    article

    This article presents a new argument against many forms of moral and prudential value incompleteness. The argument relies on two central principles: (i) a weak “negative dominance” principle, to the effect that lottery 1 is better than lottery 2 only if some possible outcome of lottery 1 is better than some possible outcome of lottery 2, and (ii) a weak form of ex ante Pareto, to the effect that, if lottery 1 gives an unambiguously better (stochastically dominant) prospect to some individuals than lottery 2, and equally good prospects to everyone else, then lottery 1 is better than lottery 2. Given modest auxiliary assumptions, these two principles rule out incompleteness in the prudential ranking of individual lives, and many forms of incompleteness in the moral rankings of outcomes and lotteries.

  • On the Value of Irreplaceable Objects (forthcoming in The Journal of Philosophy)

    Durham Research Online (Durham University) · 2025-04-23

    article · Open access

    Bradford (2023) calls attention to the fact that the strength of our reasons to preserve distinctively valuable objects increases as the number of such objects decreases. Bradford develops an account of this phenomenon in terms of 'irreplaceable value', and in particular in terms of a notion of the degree of such value, which is distinct from its amount. We present an alternative explanation of this pattern in our reasons, which appeals to the value of diversity: the world is better, other things equal, insofar as it contains more kinds of value. We develop this view in two connected ways: one appeals to evidential probability under conditions of uncertainty, and the other appeals to the value of diversity. We conclude by discussing some explanatory advantages of our view over Bradford's.

  • Are Language Models More Like Libraries or Like Librarians? Bibliotechnism, the Novel Reference Problem, and the Attitudes of LLMs

    Transactions of the Association for Computational Linguistics · 2024-01-01 · 1 citation

    article · Open access · 1st author · Corresponding

    Are LLMs cultural technologies like photocopiers or printing presses, which transmit information but cannot create new content? A challenge for this idea, which we call bibliotechnism, is that LLMs generate novel text. We begin with a defense of bibliotechnism, showing how even novel text may inherit its meaning from original human-generated text. We then argue that bibliotechnism faces an independent challenge from examples in which LLMs generate novel reference, using new names to refer to new entities. Such examples could be explained if LLMs were not cultural technologies but had beliefs, desires, and intentions. According to interpretationism in the philosophy of mind, a system has such attitudes if and only if its behavior is well explained by the hypothesis that it does. Interpretationists may hold that LLMs have attitudes, and thus have a simple solution to the novel reference problem. We emphasize, however, that interpretationism is compatible with very simple creatures having attitudes and differs sharply from views that presuppose these attitudes require consciousness, sentience, or intelligence (topics about which we make no claims).

  • Maximal Social Welfare Relations on Infinite Populations Satisfying Permutation Invariance

    arXiv (Cornell University) · 2024-08-11

    preprint · Open access · Senior author

    We study social welfare relations (SWRs) on an infinite population. Our main result is a new characterization of a utilitarian SWR as the largest SWR (in terms of subset when the weak relation is viewed as a set of pairs) which satisfies Strong Pareto, Permutation Invariance (elsewhere called "Relative Anonymity" and "Isomorphism Invariance"), and a further "Quasi-Independence" axiom.

  • A Dominance Argument Against Incompleteness

    arXiv (Cornell University) · 2024-03-26

    preprint · Open access

    This article presents a new argument against many forms of moral and prudential value incompleteness. The argument relies on two central principles: (i) a weak "negative dominance" principle, to the effect that Lottery 1 is better than Lottery 2 only if some possible outcome of Lottery 1 is better than some possible outcome of Lottery 2, and (ii) a weak form of ex ante Pareto, to the effect that, if Lottery 1 gives an unambiguously better (stochastically dominant) prospect to some individuals than Lottery 2, and equally good prospects to everyone else, then Lottery 1 is better than Lottery 2. Given modest auxiliary assumptions, these two principles rule out incompleteness in the prudential ranking of individual lives, and many forms of incompleteness in the moral rankings of outcomes and lotteries.

  • Are Language Models More Like Libraries or Like Librarians? Bibliotechnism, the Novel Reference Problem, and the Attitudes of LLMs

    arXiv (Cornell University) · 2024-01-10 · 9 citations

    preprint · Open access · 1st author · Corresponding

    Are LLMs cultural technologies like photocopiers or printing presses, which transmit information but cannot create new content? A challenge for this idea, which we call bibliotechnism, is that LLMs generate novel text. We begin with a defense of bibliotechnism, showing how even novel text may inherit its meaning from original human-generated text. We then argue that bibliotechnism faces an independent challenge from examples in which LLMs generate novel reference, using new names to refer to new entities. Such examples could be explained if LLMs were not cultural technologies but had beliefs, desires, and intentions. According to interpretationism in the philosophy of mind, a system has such attitudes if and only if its behavior is well explained by the hypothesis that it does. Interpretationists may hold that LLMs have attitudes, and thus have a simple solution to the novel reference problem. We emphasize, however, that interpretationism is compatible with very simple creatures having attitudes and differs sharply from views that presuppose these attitudes require consciousness, sentience, or intelligence (topics about which we make no claims).

  • Trying without fail

    Philosophical Studies · 2024-08-20 · 11 citations

    article · Open access · Senior author · Corresponding

Frequent coauthors

  • Hartry Field

    New York University

    9 shared
  • Tore Fjetland Øgaard

    University of Bergen

    9 shared
  • Jeremy Goodman

    7 shared
  • P. Fritz

    University of Oslo

    6 shared
  • H. R. G. Greaves

    University of Oxford

    2 shared
  • Kyle Mahowald

    1 shared
  • Christian Tarsney

    The University of Texas at Austin

    1 shared
  • Dean Spears

    Indian Statistical Institute

    1 shared