Larry Hedges
Board of Trustees Professor of Statistics and Social Policy · Northwestern University · Social Policy Analysis and Evaluation
Active 1975–2026
Research topics
- Computer Science
- Statistics
- Econometrics
- Medicine
- Psychology
- Mathematics
- Biology
- Business
- Psychiatry
- Gerontology
- Data science
- Actuarial science
- Pediatrics
- Internal medicine
Selected publications
Designing Experiments in Education and the Social Sciences
Russell Sage Foundation eBooks · 2026-04-28
book · 1st author · Corresponding
This book focuses on the design of randomized field trials in education and related social sciences. The author begins by discussing the fundamental principles of research design, including a history of their use in science. He also discusses the logistics of planning and organizing trials, as well as statistical principles, measurement, and effect size for the purpose of showing how they are related to design sensitivity. The book concentrates on four fundamental designs that are most often used in education and the evaluation of social programs. These four designs lead to forty-four different designs, each of which has different sensitivity depending on different design parameters. These designs are particularly relevant to social science because they both illustrate and embody the fact that social experiments are embedded in social contexts that, through design necessities and design parameters, affect design sensitivity.
Exploring randomised controlled trials in education: key features and selected examples from Ireland
Irish Educational Studies · 2025-07-29 · 1 citation
article · Open access · Senior author
Redefine statistical significance
Artefactual Field Experiments · 2025-01-10 · 21 citations
article · Open access
Computing Statistical Power for the Difference in Differences Design
Evaluation Review · 2025-09-22
article · Senior author
The difference in differences design is widely used to assess treatment effects in natural experiments or other situations where random assignment cannot be, or is not, used (see, e.g., Angrist & Pischke, 2009). The researcher must make important decisions about which comparisons to make, the measurements to make, and perhaps the number of individuals whose data is included in each timepoint. Also, interpretation of any statistical results, particularly null results, is improved by understanding the sensitivity of the design. This paper describes methods for computing the statistical power for tests of treatment effects in the difference in differences design. We describe alternative approaches to the analysis of the design, show which are equivalent, and provide expressions for computing statistical power and determining minimum detectable effect sizes. We then discuss how these methods could be generalized to unbalanced designs, designs with covariates, and designs with more than two timepoints, including difference in difference in differences designs.
Effect sizes for experimental research
British Journal of Mathematical and Statistical Psychology · 2025-03-31 · 11 citations
article · Open access · 1st author · Corresponding
Good scientific practice requires that the reporting of the statistical analysis of experiments should include estimates of effect size as well as the results of tests of statistical significance. Good statistical practice requires that effect size estimates be reported along with some indication of their statistical uncertainty, such as a standard error. This article provides a review of effect sizes for experimental research, including expressions for the standard error of each effect size. It focuses on effect sizes for experiments with treatments having a single degree of freedom but also includes effect sizes for treatments with multiple degrees of freedom having either fixed or random effects.
Correcting the Variance of Effect Sizes Based on Binary Outcomes for Clustering
Educational and Psychological Measurement · 2025-10-23
article · 1st author · Corresponding
Researchers conducting systematic reviews and meta-analyses often encounter studies in which the research design is a well conducted cluster randomized trial, but the statistical analysis does not take clustering into account. For example, the study might assign treatments by clusters but the analysis may not take into account the clustered treatment assignment. Alternatively, the analysis of the primary outcome of the study might take clustering into account, but the reviewer might be interested in another outcome for which only summary data are available in a form that does not take clustering into account. This article provides expressions for the approximate variance of risk differences, log risk ratios, and log odds ratios computed from clustered binary data, using the intraclass correlations. An example illustrates the calculations. References to empirical estimates of intraclass correlations are provided.
Elsevier eBooks · 2025-01-01
book-chapter · 1st author · Corresponding
Educational and Psychological Measurement · 2024-10-06 · 16 citations
article · Open access · 1st author · Corresponding
The standardized mean difference (sometimes called Cohen's d) is an effect size measure widely used to describe the outcomes of experiments. It is mathematically natural to describe differences between groups of data that are normally distributed with different means but the same standard deviation. In that context, it can be interpreted as determining several indexes of overlap between the two distributions. If the data are not approximately normally distributed or if they have substantially unequal standard deviations, the relation between d and overlap between distributions can be very different, and interpretations of d that apply when the data are normal with equal variances are unreliable.
Educational Psychology Review · 2024-09-25 · 6 citations
article · Open access
Well-chosen covariates boost the design sensitivity of individually and cluster-randomized trials. We provide guidance on covariate selection generating an extensive compilation of single- and multilevel design parameters on student achievement. Embedded in psychometric heuristics, we analyzed (a) covariate types of varying bandwidth-fidelity, namely domain-identical (IP), cross-domain (CP), and fluid intelligence (Gf) pretests, as well as sociodemographic characteristics (SC); (b) covariate combinations quantifying incremental validities of CP, Gf, and/or SC beyond IP; and (c) covariate time lags of 1–7 years, testing validity degradation in IP, CP, and Gf. Estimates from six German samples (1868 ≤ N ≤ 10,543) covering various outcome domains across grades 1–12 were meta-analyzed and included in precision simulations. Results varied widely by grade level, domain, and hierarchical level. In general, IP outperformed CP, which slightly outperformed Gf and SC. Benefits from coupling IP with CP, Gf, and/or SC were small. IP appeared most affected by temporal validity decay. Findings are applied in illustrative scenarios of study planning and enriched by comprehensive Online Supplemental Material (OSM) accessible via the Open Science Framework (OSF; https://osf.io/nhx4w).
OSF Preprints (OSF Preprints) · 2024-05-21
other · 1st author · Corresponding
Recent grants
Improving the Generalizability of Findings from Educational Evaluations
NSF · $998k · 2011–2016
Environmental & Biological Variation and Language Growth
NIH · $34.1M · 2002–2020
Methods for assessing replication
NSF · $1.1M · 2018–2023
Frequent coauthors
- Ellen Sogolow · Lehigh University · 66 shared
- Salaam Semaan · Centers for Disease Control and Prevention · 62 shared
- Wayne D. Johnson · 62 shared
- Ingram Olkin · Stanford University · 55 shared
- Karl Y. Bilimoria · Indiana University School of Medicine · 41 shared
- Jeanette W. Chung · Northwestern University · 39 shared
- Clifford Y. Ko · University of California, Los Angeles · 39 shared
- Remi Love · Anesthesia Quality Institute · 38 shared
Education
- 1980 · PhD · Stanford University