John H. Reppy

· Associate Professor of Computer Science · Verified

University of Chicago · Computer Science

Active 1987–2026

h-index: 31

Citations: 3.5k

Papers: 249 · 6 last 5y

Funding: $1.8M


About

John H. Reppy is a Professor of Computer Science at the University of Chicago and the Director of the Masters Program in Computer Science. He has been a faculty member at UChicago since the Autumn of 2002, following eleven years as an MTS at Bell Labs in Murray Hill, New Jersey. His primary research focus is on the design and implementation of advanced programming languages, including functional, object-oriented, and concurrent languages, with an emphasis on high-level languages for parallel programming. Reppy's work aims to enhance software quality, reliability, and programmer productivity. He has contributed to the development of several advanced languages and language features, such as Diderot, a declarative domain-specific language for image-analysis algorithms on parallel systems, and the Manticore project, which develops language features and implementation techniques for multicore and small-scale SMP systems. Reppy has also worked on Concurrent ML, a concurrent programming language embedded in SML, and the Moby programming language, which supports object-oriented and concurrent programming. His contributions extend to the Standard ML of New Jersey system, and he has an interest in computer graphics, having designed the ray-tracer problem for the ICFP 2000 Programming Contest and being the primary implementor of the SML3d library. In addition to his research, Reppy has held roles such as a rotating Program Director at the NSF in the CCF division from 2011 to 2013. His work spans systems, programming languages, and software engineering, and he is involved in collaborative research communities and NSF initiatives like EPiQC, which focuses on practical-scale quantum computing.

Research topics

Computer Science
Programming language
Artificial Intelligence
Theoretical computer science
Mathematics
Parallel computing

Selected publications

Artifact for "Flow-analysis-based Closure Optimizations"
Zenodo (CERN European Organization for Nuclear Research) · 2026-03-17
other · Open access · 1st author · Corresponding
The artifact includes the source code of a modified Standard ML of New Jersey compiler (version 2025.2). It contains the implementation of the analysis and the new closure conversion algorithms. It is set up to run a set of benchmark programs to compare the runtimes, the memory allocation statistics, and the compile-time impact of the new closure converter. See the README for instructions.
Environment-Sharing Analysis and Caller-Provided Environments for Higher-Order Languages
Proceedings of the ACM on Programming Languages · 2025-08-05
article · Open access
The representation of functions in higher-order languages includes both the function’s code and an environment structure that captures the bindings of the function’s free variables. This paper explores caller-provided environments, where instead of packaging the entirety of a function’s environment in its closure, a function can be provided with a portion of its environment by its caller. In higher-order languages, it is difficult to determine where functions are called, let alone what pieces of the function’s environment are available to be provided by the caller; thus, we need a higher-order control-flow analysis to enable caller-provided environments. In this paper, we present a new abstract-interpretation-based analysis that discovers which pieces of a function’s environment are always shared between its definition and its callers. In such cases, the caller can provide the environment to the callee. Our analysis has been formalized in the Rocq proof assistant. We evaluate our analysis on a collection of programs demonstrating that it is both scalable and provides significantly better information than the common syntactic approach and better information than lightweight closure conversion. In fact, it yields the theoretical upper-bound for many programs. For caller-provided environments, deciding how to transform the program based on these revealed facts is also non-trivial and has the potential to incur extra runtime cost over standard strategies. We discuss how to make these decisions in a way that avoids the extra costs and how to transform a program accordingly. We also propose other uses of the analysis results beyond enabling caller-provided environments. We evaluate our transformation using an instrumented interpreter, showing that our approach is effective in reducing dynamic allocations for environments.
Webs and Flow-Directed Well-Typedness Preserving Program Transformations
Proceedings of the ACM on Programming Languages · 2025-06-10 · 1 citation
article · Open access
We define webs to be the collections of producers and consumers (e.g., functions and calls) in a program that are constrained: in higher-order languages, multiple functions can flow to the same call, all of which must agree on an interface (e.g., calling convention). We argue that webs are fundamentally the unit of transformation: a change to one member requires changes across the entire web. We introduce a web-centric intermediate language that exposes webs as annotations, and describe web-based (that is, flow-directed) transformations guided by these annotations. As they affect all members of a web, these transformations are interprocedural, operating over entire modules. Through the lens of webs we reframe and generalize a collection of transformations from the literature, including dead-parameter elimination, uncurrying, and defunctionalization, as well as describe novel transformations. We contrast this approach with rewriting strategies that rely on inlining and cascading rewrites. Webs are an over-approximation of the semantic function-call relationship produced by control-flow analyses (CFA). This information is inherently independent from the transformations; more precise analyses permit more transformations. A limitation of precise analyses is that the transformations may not maintain well-typedness, as the type system is a less-precise static analysis. Our solution is a simple and lightweight type-based analysis that causes the flow-directed transformations to preserve well-typedness, making flow-directed, type-preserving transformations easily accessible in many compilers. This analysis builds on unification, distinguishing types that look the same from types that have to be the same. Our experiments show that while our analysis is theoretically less precise, in practice its precision is similar to CFAs.
Analyzing binding extent in 3CPS
Proceedings of the ACM on Programming Languages · 2022 · 4 citations
- Computer Science
- Programming language
To date, the most effective approach to compiling strict, higher-order functional languages (such as OCaml, Scheme, and SML) has been to use whole-program techniques to convert the program to a first-order monomorphic representation that can be optimized using traditional compilation techniques. This approach, popularized by MLton, has limitations, however. We are interested in exploring a different approach to compiling such languages, one that preserves the higher-order and polymorphic character of the program throughout optimization. To enable such an approach, we must have effective analyses that both provide precise information about higher-order programs and that scale to larger units of compilation. This paper describes one such analysis for determining the extent of variable bindings. We classify the extent of variables as either register (only one binding instance can be live at any time), stack (the lifetimes of binding instances obey a LIFO order), or heap (binding lifetimes are arbitrary). These extents naturally connect variables to the machine resources required to represent them. We believe that precise information about binding extents will enable efficient management of environments, which is a key problem in the efficient compilation of higher-order programs. At the core of the paper is the 3CPS intermediate representation, which is a factored CPS-based intermediate representation (IR) that statically marks variables to indicate their binding extent. We formally specify the management of this binding structure by means of a small-step operational semantics and define a static analysis that determines the extents of the variables in a program. We evaluate our analysis using a standard suite of SML benchmark programs. Our implementation gets surprisingly high yield and exhibits scalable performance. While this paper uses a CPS-based IR, the algorithm and results are easily transferable to other λ-calculus IRs, such as ANF.
3CPS: The Design of an Environment-Focussed Intermediate Representation
2021-09-01 · 2 citations
article · Open access
We describe the design of 3CPS, a compiler intermediate representation (IR) we have developed for use in compiling call-by-value functional languages such as SML, OCaml, Scheme, and Lisp. The language is a low-level form designed in tandem with a matching suite of static analyses. It reflects our belief that the core task of an optimising compiler for a functional language is to reason about the environment structure of the program. Our IR is distinguished by the presence of extent annotations, added to all variables (and verified by static analysis). These annotations are defined in terms of the semantics of the IR, but they directly tell the compiler what machine resources are needed to implement the environment structure of each annotated variable.
Replication Package for Article: From Folklore to Fact: Comparing Implementations of Stacks and Continuations
Artifact Digital Object Group · 2020-06-24 · 2 citations
dataset · Senior author
The history of Standard ML
Proceedings of the ACM on Programming Languages · 2020 · 11 citations
Senior author · Corresponding
- Computer Science
- Programming language
The ML family of strict functional languages, which includes F#, OCaml, and Standard ML, evolved from the Meta Language of the LCF theorem proving system developed by Robin Milner and his research group at the University of Edinburgh in the 1970s. This paper focuses on the history of Standard ML, which plays a central role in this family of languages, as it was the first to include the complete set of features that we now associate with the name “ML” (i.e., polymorphic type inference, datatypes with pattern matching, modules, exceptions, and mutable state). Standard ML, and the ML family of languages, have had enormous influence on the world of programming language design and theory. ML is the foremost exemplar of a functional programming language with strict evaluation (call-by-value) and static typing. The use of parametric polymorphism in its type system, together with the automatic inference of such types, has influenced a wide variety of modern languages (where polymorphism is often referred to as generics). It has popularized the idea of datatypes with associated case analysis by pattern matching. The module system of Standard ML extends the notion of type-level parameterization to large-scale programming with the notion of parametric modules, or functors. Standard ML also set a precedent by being a language whose design included a formal definition with an associated metatheory of mathematical proofs (such as soundness of the type system). A formal definition was one of the explicit goals from the beginning of the project. While some previous languages had rigorous definitions, these definitions were not integral to the design process, and the formal part was limited to the language syntax and possibly dynamic semantics or static semantics, but not both. The paper covers the early history of ML, the subsequent efforts to define a standard ML language, and the development of its major features and its formal definition. We also review the impact that the language had on programming-language research.
From folklore to fact: comparing implementations of stacks and continuations
2020 · 15 citations
Senior author · Corresponding
- Computer Science
- Programming language
The efficient implementation of function calls and non-local control transfers is a critical part of modern language implementations and is important in the implementation of everything from recursion, higher-order functions, concurrency and coroutines, to task-based parallelism. In a compiler, these features can be supported by a variety of mechanisms, including call stacks, segmented stacks, and heap-allocated continuation closures.
A New Backend for Standard ML of New Jersey
2020-09-02 · 1 citation
article · Open access · Senior author
This paper describes the design and implementation of a new backend for the Standard ML of New Jersey (SML/NJ) system that is based on the LLVM compiler infrastructure. We first describe the history and design of the current backend, which is based on the MLRisc framework. While MLRisc has many similarities to LLVM, it provides a lower-level, policy-agnostic approach to code generation that enables customization of the code generator for non-standard runtime models (e.g., register pinning and calling conventions). In particular, SML/NJ uses a stackless runtime model based on continuation-passing style with heap-allocated continuation closures. This feature, and others, pose challenges to building a backend using LLVM. We describe these challenges and how we address them in our backend.
Point Movement in a DSL for Higher-Order FEM Visualization
2019-10-01
preprint · Open access
Scientific visualization tools tend to be flexible in some ways (e.g., for exploring isovalues) while restricted in other ways, such as working only on regular grids, or only on unstructured meshes (as used in the finite element method, FEM). Our work seeks to expose the common structure of visualization methods, apart from the specifics of how the fields being visualized are formed. Recognizing that previous approaches to FEM visualization depend on efficiently updating computed positions within a mesh, we took an existing visualization domain-specific language, and added a mesh position type and associated arithmetic operators. These are orthogonal to the visualization method itself, so existing programs for visualizing regular grid data work, with minimal changes, on higher-order FEM data. We reproduce the efficiency gains of an earlier guided search method of mesh position update for computing streamlines, and we demonstrate a novel ability to uniformly sample ridge surfaces of higher-order FEM solutions defined on curved meshes.

Recent grants

SHF: Medium: A DSL for Data Visualization and Analysis in Imaging-Based Science and Scientific Computing
NSF · $1.2M · 2016–2021
EAGER: Exploring the Foundations of High-Level Programming Models for GPUs
NSF · $275k · 2014–2017
SHF: Small: High-Level Programming Models for GPUs
NSF · $390k · 2017–2021

Frequent coauthors

Guy L. Steele
Oracle (United States)
414 shared
Laxmikant V. Kalé
University of Illinois Urbana-Champaign
315 shared
Charles E. Leiserson
106 shared
Edward F. Gehringer
106 shared
Cédric Bastoul
106 shared
Pen-Chung Yew
106 shared
H. Peter Hofstee
105 shared
Patrick H Worley
105 shared

Labs

Manticore · PI
Parallel functional programming for multicore systems

Education

Ph.D.
University of Chicago
M.S.
University of Chicago
B.S.
University of Chicago

Awards & honors

Peter Landin Prize, IFL
Distinguished paper award, PLDI (2020)
Best paper award, EuroVis (2018)
