Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Madeleine Udell

Madeleine Udell

· Assistant Professor of Management Science and Engineering and, by courtesy, of Electrical EngineeringVerified

Stanford University · Management Science and Engineering

Active 2012–2026

h-index26
Citations2.7k
Papers15384 last 5y
Funding$621k
See your match with Madeleine Udell — sign in to PhdFit.Sign in

About

Madeleine Udell is an Assistant Professor of Management Science and Engineering at Stanford University, with a courtesy appointment in Electrical Engineering. She completed her PhD at Stanford in Computational and Mathematical Engineering and held a postdoctoral fellowship at the Center for the Mathematics of Information at Caltech. Her research focuses on accelerating and simplifying large-scale data analysis and optimization, aiming to address challenges across healthcare, finance, marketing, operations, and engineering systems design. Her work in optimization seeks to detect and exploit novel structures, leading to faster, more memory-efficient algorithms, automatic proofs of optimality, improved complexity guarantees, and user-friendly optimization solvers and modeling languages. In machine learning, her research centers on data preprocessing, interpretability, and causality, which are critical for practical applications involving messy data.

Research topics

  • Computer Science
  • Mathematics
  • Mathematical optimization
  • Algorithm
  • Operating system
  • Combinatorics
  • Theoretical computer science
  • Physics
  • Geometry
  • Applied mathematics

Selected publications

  • Randomized Nyström Preconditioned Interior Point-Proximal Method of Multipliers

    SIAM Journal on Scientific Computing · 2026-01-02

    articleSenior author
  • SnareNet: Flexible Repair Layers for Neural Networks with Hard Constraints

    Open MIND · 2026-02-10

    preprintSenior author

    Neural networks are increasingly used as fast surrogate models across various domains, but unconstrained predictions can violate physical, operational, or safety requirements. We propose SnareNet, a feasibility-controlled architecture to learn mappings whose outputs must satisfy input-dependent constraints. SnareNet appends a differentiable repair layer that navigates in the constraint map's range space, steering iterates toward feasibility and producing a repaired output that satisfies constraints to a user-specified tolerance. We stabilize end-to-end training by adaptive relaxation, a new training paradigm that snares the neural network at initialization and shrinks it into the feasible set, enabling early exploration and strict feasibility later in training. On optimization learning and trajectory planning benchmarks, SnareNet consistently attains improved objective quality while satisfying constraints more reliably than prior work, and it is the first to enforce non-convex constraints at medium-to-high precision robustly across instances.

  • SnareNet: Flexible Repair Layers for Neural Networks with Hard Constraints

    ArXiv.org · 2026-02-10

    articleOpen accessSenior author

    Neural networks are increasingly used as fast surrogate models across various domains, but unconstrained predictions can violate physical, operational, or safety requirements. We propose SnareNet, a feasibility-controlled architecture to learn mappings whose outputs must satisfy input-dependent constraints. SnareNet appends a differentiable repair layer that navigates in the constraint map's range space, steering iterates toward feasibility and producing a repaired output that satisfies constraints to a user-specified tolerance. We stabilize end-to-end training by adaptive relaxation, a new training paradigm that snares the neural network at initialization and shrinks it into the feasible set, enabling early exploration and strict feasibility later in training. On optimization learning and trajectory planning benchmarks, SnareNet consistently attains improved objective quality while satisfying constraints more reliably than prior work, and it is the first to enforce non-convex constraints at medium-to-high precision robustly across instances.

  • Scalable approximate optimal diagonal preconditioning

    Computational Optimization and Applications · 2026-02-14

    article
  • Turbocharging Gaussian Process Inference with Approximate Sketch-and-Project

    ArXiv.org · 2025-05-19

    preprintOpen accessSenior author

    Gaussian processes (GPs) play an essential role in biostatistics, scientific machine learning, and Bayesian optimization for their ability to provide probabilistic predictions and model uncertainty. However, GP inference struggles to scale to large datasets (which are common in modern applications), since it requires the solution of a linear system whose size scales quadratically with the number of samples in the dataset. We propose an approximate, distributed, accelerated sketch-and-project algorithm ($\texttt{ADASAP}$) for solving these linear systems, which improves scalability. We use the theory of determinantal point processes to show that the posterior mean induced by sketch-and-project rapidly converges to the true posterior mean. In particular, this yields the first efficient, condition number-free algorithm for estimating the posterior mean along the top spectral basis functions, showing that our approach is principled for GP inference. $\texttt{ADASAP}$ outperforms state-of-the-art solvers based on conjugate gradient and coordinate descent across several benchmark datasets and a large-scale Bayesian optimization task. Moreover, $\texttt{ADASAP}$ scales to a dataset with $> 3 \cdot 10^8$ samples, a feat which has not been accomplished in the literature.

  • Diverse Inference and Verification for Advanced Reasoning

    ArXiv.org · 2025-02-14

    preprintOpen accessSenior author

    Reasoning LLMs such as OpenAI o1, o3 and DeepSeek R1 have made significant progress in mathematics and coding, yet find challenging advanced tasks such as International Mathematical Olympiad (IMO) combinatorics problems, Abstraction and Reasoning Corpus (ARC) puzzles, and Humanity's Last Exam (HLE) questions. We use a diverse inference approach that combines multiple models and methods at test time. We find that verifying mathematics and code problems, and rejection sampling on other problems is simple and effective. We automatically verify correctness of solutions to IMO problems by Lean, and ARC puzzles by code, and find that best-of-N effectively answers HLE questions. Our approach increases answer accuracy on IMO combinatorics problems from 33.3% to 77.8%, accuracy on HLE questions from 8% to 37%, and solves 80% of ARC puzzles that 948 humans could not and 26.5% of ARC puzzles that o3 high compute does not. Test-time simulations, reinforcement learning, and meta-learning with inference feedback improve generalization by adapting agent graph representations and varying prompts, code, and datasets. Our approach is reliable, robust, and scalable, and in the spirit of reproducible research, we will make it publicly available upon publication.

  • New Results on the Polyak Stepsize: Tight Convergence Analysis and Universal Function Classes

    arXiv (Cornell University) · 2025-12-06

    preprintOpen access

    In this paper, we revisit a classical adaptive stepsize strategy for gradient descent: the Polyak stepsize (PolyakGD), originally proposed in Polyak (1969). We study the convergence behavior of PolyakGD from two perspectives: tight worst-case analysis and universality across function classes. As our first main result, we establish the tightness of the known convergence rates of PolyakGD by explicitly constructing worst-case functions. In particular, we show that the $O((1-\frac{1}κ)^K)$ rate for smooth strongly convex functions and the $O(1/K)$ rate for smooth convex functions are both tight. Moreover, we theoretically show that PolyakGD automatically exploits floating-point errors to escape the worst-case behavior. Our second main result provides new convergence guarantees for PolyakGD under both Hölder smoothness and Hölder growth conditions. These findings show that the Polyak stepsize is universal, automatically adapting to various function classes without requiring prior knowledge of problem parameters.

  • SAPPHIRE: Preconditioned Stochastic Variance Reduction for Faster Large-Scale Statistical Learning

    ArXiv.org · 2025-01-27

    preprintOpen accessSenior author

    Regularized empirical risk minimization (rERM) has become important in data-intensive fields such as genomics and advertising, with stochastic gradient methods typically used to solve the largest problems. However, ill-conditioned objectives and non-smooth regularizers undermine the performance of traditional stochastic gradient methods, leading to slow convergence and significant computational costs. To address these challenges, we propose the $\texttt{SAPPHIRE}$ ($\textbf{S}$ketching-based $\textbf{A}$pproximations for $\textbf{P}$roximal $\textbf{P}$reconditioning and $\textbf{H}$essian $\textbf{I}$nexactness with Variance-$\textbf{RE}$educed Gradients) algorithm, which integrates sketch-based preconditioning to tackle ill-conditioning and uses a scaled proximal mapping to minimize the non-smooth regularizer. This stochastic variance-reduced algorithm achieves condition-number-free linear convergence to the optimum, delivering an efficient and scalable solution for ill-conditioned composite large-scale convex machine learning problems. Extensive experiments on lasso and logistic regression demonstrate that $\texttt{SAPPHIRE}$ often converges $20$ times faster than other common choices such as $\texttt{Catalyst}$, $\texttt{SAGA}$, and $\texttt{SVRG}$. This advantage persists even when the objective is non-convex or the preconditioner is infrequently updated, highlighting its robust and practical effectiveness.

  • PDE-SHARP: PDE Solver Hybrids through Analysis and Refinement Passes

    ArXiv.org · 2025-10-31

    preprintOpen accessSenior author

    Current LLM-driven approaches using test-time computing to generate PDE solvers execute a large number of solver samples to identify high-accuracy solvers. These paradigms are especially costly for complex PDEs requiring substantial computational resources for numerical evaluation. We introduce PDE-SHARP, a framework to reduce computational costs by replacing expensive scientific computation by cheaper LLM inference that achieves superior solver accuracy with 60-75% fewer computational evaluations. PDE-SHARP employs three stages: (1) Analysis: mathematical chain-of-thought analysis including PDE classification, solution type detection, and stability analysis; (2) Genesis: solver generation based on mathematical insights from the previous stage; and (3) Synthesis: collaborative selection-hybridization tournaments in which LLM judges iteratively refine implementations through flexible performance feedback. To generate high-quality solvers, PDE-SHARP requires fewer than 13 solver evaluations on average compared to 30+ for baseline methods, improving accuracy uniformly across tested PDEs by $4\times$ on average, and demonstrates robust performance across LLM architectures, from general-purpose to specialized reasoning models.

  • dnamite: A Python Package for Neural Additive Models

    ArXiv.org · 2025-03-06

    preprintOpen accessSenior author

    Additive models offer accurate and interpretable predictions for tabular data, a critical tool for statistical modeling. Recent advances in Neural Additive Models (NAMs) allow these models to handle complex machine learning tasks, including feature selection and survival analysis, on large-scale data. This paper introduces dnamite, a Python package that implements NAMs for these advanced applications. dnamite provides a scikit-learn style interface to train regression, classification, and survival analysis NAMs, with built-in support for feature selection. We describe the methodology underlying dnamite, its design principles, and its implementation. Through an application to the MIMIC III clinical dataset, we demonstrate the utility of dnamite in a real-world setting where feature selection and survival analysis are both important. The package is publicly available via pip and documented at dnamite.readthedocs.io.

Recent grants

Frequent coauthors

  • Joel A. Tropp

    California Institute of Technology

    22 shared
  • Lijun Ding

    Wisconsin Institutes for Discovery

    21 shared
  • Volkan Cevher

    16 shared
  • Stephen Boyd

    15 shared
  • Alp Yurtsever

    15 shared
  • Zachary Frangella

    Stanford University

    14 shared
  • Nathan Kallus

    13 shared
  • Iddo Drori

    13 shared

Education

  • Ph.D., Management Science and Engineering

    Stanford University

    2010
  • M.S., Management Science and Engineering

    Stanford University

    2006
  • B.S., Electrical Engineering and Computer Science

    Massachusetts Institute of Technology

    2002

Awards & honors

  • Kavli Fellowship (2023)
  • Alfred P. Sloan Research Fellowship (2021)
  • NSF CAREER award (2020)
  • ONR Young Investigator Award (2020)
  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Madeleine Udell

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup