
Robert M. Freund
· Theresa Seley Professor in Management ScienceMassachusetts Institute of Technology · Operations Research and Statistics
Active 1977–2024
About
Robert M. Freund is the Theresa Seley Professor in Management Science and a Professor of Operations Research at the MIT Sloan School of Management. His main research interests include convex optimization, computational complexity, convex geometry, large-scale nonlinear optimization, and related mathematical systems. His recent work focuses on first-order methods and their connections to statistical and machine learning. Freund has served as coeditor of the journal Mathematical Programming and as associate editor for several optimization and operations research journals. He has held leadership roles such as Co-Director of the MIT Operations Research Center, the MIT Program in Computation for Design and Optimization, and Chair of the INFORMS Optimization Section. Freund received his BA in mathematics from Princeton University and his MS and PhD in operations research from Stanford University. He has been recognized with awards including the Longuet-Higgins Prize in computer vision and the MIT Seegal Prize for inspiring students to pursue and achieve excellence.
Research topics
- Computer Science
- Artificial Intelligence
- Mathematical optimization
- Mathematics
- Algorithm
- Applied mathematics
- Combinatorics
Selected publications
Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses
Mathematical Programming · 2024-08-22 · 6 citations
articleSIAM Journal on Optimization · 2024-07-11 · 1 citations
articleSenior authorNonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses
arXiv (Cornell University) · 2023-01-04 · 1 citations
preprintOpen accessWe propose a computer-assisted approach to the analysis of the worst-case convergence of nonlinear conjugate gradient methods (NCGMs). Those methods are known for their generally good empirical performances for large-scale optimization, while having relatively incomplete analyses. Using our computer-assisted approach, we establish novel complexity bounds for the Polak-Ribière-Polyak (PRP) and the Fletcher-Reeves (FR) NCGMs for smooth strongly convex minimization. In particular, we construct mathematical proofs that establish the first non-asymptotic convergence bound for FR (which is historically the first developed NCGM), and a much improved non-asymptotic convergence bound for PRP. Additionally, we provide simple adversarial examples on which these methods do not perform better than gradient descent with exact line search, leaving very little room for improvements on the same class of problems.
An Oblivious Ellipsoid Algorithm for Solving a System of (In)Feasible Linear Inequalities
Mathematics of Operations Research · 2023-03-23 · 2 citations
articleThe ellipsoid algorithm is a fundamental algorithm for computing a solution to the system of m linear inequalities in n variables [Formula: see text] when its set of solutions has positive volume. However, when [Formula: see text] is infeasible, the ellipsoid algorithm has no mechanism for proving that (P) is infeasible. This is in contrast to the other two fundamental algorithms for tackling [Formula: see text], namely, the simplex and interior-point methods, each of which can be easily implemented in a way that either produces a solution of [Formula: see text] or proves that [Formula: see text] is infeasible by producing a solution to the alternative system [Formula: see text]. This paper develops an oblivious ellipsoid algorithm (OEA) that either produces a solution of [Formula: see text] or produces a solution of [Formula: see text]. Depending on the dimensions and other condition measures, the computational complexity of the basic OEA may be worse than, the same as, or better than that of the standard ellipsoid algorithm. We also present two modified versions of OEA, whose computational complexity is superior to that of OEA when [Formula: see text]. This is achieved in the first modified version by proving infeasibility without producing a solution of [Formula: see text], and in the second version by using more memory. Funding: J. Lamperski and R. M. Freund were supported by the Air Force Office of Scientific Research [Grant FA9550-19-1-0240].
arXiv (Cornell University) · 2023-12-21
preprintOpen accessSenior authorThere has been a recent surge in development of first-order methods (FOMs) for solving huge-scale linear programming (LP) problems. The attractiveness of FOMs for LP stems in part from the fact that they avoid costly matrix factorization computation. However, the efficiency of FOMs is significantly influenced - both in theory and in practice - by certain instance-specific LP condition measures. Xiong and Freund recently showed that the performance of the restarted primal-dual hybrid gradient method (PDHG) is predominantly determined by two specific condition measures: LP sharpness and Limiting Error Ratio. In this paper we examine the relationship between these two measures, particularly in the case when the optimal solution is unique (which is generic - at least in theory), and we present an upper bound on the Limiting Error Ratio involving the reciprocal of the LP sharpness. This shows that in LP instances where there is a dual nondegenerate optimal solution, the computational complexity of restarted PDHG can be characterized solely in terms of LP sharpness and the distance to optimal solutions, and simplifies the theoretical complexity upper bound of restarted PDHG for these instances.
2023-07-20
book-chapterSenior authorAbstract This chapter outlines the evidence base for the use of paradoxical interventions (PIs) in individual psychotherapy. Often misunderstood, PIs have shown long-term (distal) impacts on clinical outcomes. Definitions of PIs and their constituent elements (double-binds, reframing, circular causality) are presented along with clinical examples. The chapter reports on one meta-analysis comparing PIs with a placebo or control and another comparing PIs to other therapeutic methods. Paradoxes demonstrated a large effect (d = 1.1, k = 17 studies) compared to controls and a medium effect size (d = 0.49, k = 17 studies) compared to other therapeutic methods. The chapter concludes with diversity considerations, training recommendations, and therapeutic practices.
Computational Guarantees for Restarted PDHG for LP based on "Limiting Error Ratios" and LP Sharpness
arXiv (Cornell University) · 2023-12-22
preprintOpen accessSenior authorIn recent years, there has been growing interest in solving linear optimization problems - or more simply "LP" - using first-order methods in order to avoid the costly matrix factorizations of traditional methods for huge-scale LP instances. The restarted primal-dual hybrid gradient method (PDHG) - together with some heuristic techniques - has emerged as a powerful tool for solving huge-scale LPs. However, the theoretical understanding of the restarted PDHG and the validation of various heuristic implementation techniques are still very limited. Existing complexity analyses have relied on the Hoffman constant of the LP KKT system, which is known to be overly conservative, difficult to compute, and fails to offer insight into instance-specific characteristics of the LP problems. These limitations have limited the capability to discern which characteristics of LP instances lead to easy versus difficult instances. With the goal of overcoming these limitations, we introduce and develop two purely geometry-based condition measures for LP instances: "limiting error ratio" and LP sharpness. We provide new computational guarantees for the restarted PDHG based on these two condition measures. For limiting error ratio, we provide a computable upper bound and show its relationship with the data instance's proximity to infeasibility under perturbation. For LP sharpness, we prove its equivalence to the stability of the LP optimal solution set under perturbation of the objective function. Our computational guarantees are validated by constructed instances. Conversely, our computational guarantees validate the practical efficacy of certain heuristic techniques (row preconditioners and step-size tuning) that improve computational performance. Finally, we present computational experiments on LP relaxations from the MIPLIB dataset that demonstrate the promise of various implementation strategies.
Mathematical Programming · 2022 · 15 citations
Senior authorCorresponding- Computer Science
- Algorithm
- Computer Science
Abstract We present and analyze a new generalized Frank–Wolfe method for the composite optimization problem $$(P): {\min }_{x\in {\mathbb {R}}^n} \; f(\mathsf {A} x) + h(x)$$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi>P</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:msub><mml:mo>min</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mi>n</mml:mi></mml:msup></mml:mrow></mml:msub><mml:mspace/><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>h</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math> , where f is a $$\theta $$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>θ</mml:mi></mml:math> -logarithmically-homogeneous self-concordant barrier, $$\mathsf {A}$$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>A</mml:mi></mml:math> is a linear operator and the function h has a bounded domain but is possibly non-smooth. We show that our generalized Frank–Wolfe method requires $$O((\delta _0 + \theta + R_h)\ln (\delta _0) + (\theta + R_h)^2/\varepsilon )$$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mrow><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>δ</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:mi>θ</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>ln</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>δ</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mi>θ</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup><mml:mo>/</mml:mo><mml:mi>ε</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math> iterations to produce an $$\varepsilon $$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>ε</mml:mi></mml:math> -approximate solution, where $$\delta _0$$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:msub><mml:mi>δ</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:math> denotes the initial optimality gap and $$R_h$$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:msub><mml:mi>R</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:math> is the variation of h on its domain. This result establishes certain intrinsic connections between $$\theta $$ <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>θ</mml:mi></mml:math> -logarithmically homogeneous barriers and the Frank–Wolfe method. When specialized to the D -optimal design problem, we essentially recover the complexity obtained by Khachiyan (Math Oper Res 21 (2): 307–320, 1996) using the Frank–Wolfe method with exact line-search. We also study the (Fenchel) dual problem of ( P ), and we show that our new method is equivalent to an adaptive-step-size mirror descent method applied to the dual problem. This enables us to provide iteration complexity bounds for the mirror descent method despite the fact that the dual objective function is non-Lipschitz and has unbounded domain. In addition, we present computational experiments that point to the potential usefulness of our generalized Frank–Wolfe method on Poisson image de-blurring problems with TV regularization, and on simulated PET problem instances.
arXiv (Cornell University) · 2022-08-30 · 1 citations
preprintOpen accessSenior authorThe Frank-Wolfe method has become increasingly useful in statistical and machine learning applications, due to the structure-inducing properties of the iterates, and especially in settings where linear minimization over the feasible set is more computationally efficient than projection. In the setting of Empirical Risk Minimization -- one of the fundamental optimization problems in statistical and machine learning -- the computational effectiveness of Frank-Wolfe methods typically grows linearly in the number of data observations $n$. This is in stark contrast to the case for typical stochastic projection methods. In order to reduce this dependence on $n$, we look to second-order smoothness of typical smooth loss functions (least squares loss and logistic loss, for example) and we propose amending the Frank-Wolfe method with Taylor series-approximated gradients, including variants for both deterministic and stochastic settings. Compared with current state-of-the-art methods in the regime where the optimality tolerance $\varepsilon$ is sufficiently small, our methods are able to simultaneously reduce the dependence on large $n$ while obtaining optimal convergence rates of Frank-Wolfe methods, in both the convex and non-convex settings. We also propose a novel adaptive step-size approach for which we have computational guarantees. Last of all, we present computational experiments which show that our methods exhibit very significant speed-ups over existing methods on real-world datasets for both convex and non-convex binary classification problems.
Mathematical Programming · 2020 · 19 citations
Senior authorCorresponding- Computer Science
- Mathematics
- Mathematical optimization
Frequent coauthors
- 18 shared
Paul Grigas
- 15 shared
Alexandre Belloni
- 14 shared
Han Men
American Institute of Aeronautics and Astronautics
- 13 shared
J. Peraire
Massachusetts Institute of Technology
- 13 shared
Rahul Mazumder
Massachusetts Institute of Technology
- 10 shared
Ngoc Cuong Nguyen
- 10 shared
Haihao Lu
- 8 shared
Fernando Ordóñez
Awards & honors
- MIT’s 2020 Seegal prize
- INFORMS Fellow (2018)
- Longuet-Higgins Prize in computer vision (2007)
- Samuel M. Seegal Faculty Prize
- Resume-aware match score
- Save to shortlist
- AI-drafted outreach
See your match with Robert M. Freund
PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.
- Free to start
- No credit card
- 30-second signup