
Yousef Saad
University of Minnesota · Computer Science and Engineering
Active 1974–2026
About
Yousef Saad is a Professor in the Department of Computer Science & Engineering at the University of Minnesota, where he has been a faculty member since 1990. He holds the title of CSE Distinguished Professor and the William Norris Land Grant Chair in Large-Scale Computing. Saad's research focuses on numerical linear algebra, sparse matrix computations, iterative methods for linear systems and eigenvalue problems, parallel algorithms in numerical linear algebra, and matrix methods for machine learning. His work has contributed significantly to the development of algorithms and methods in these areas, with recent research including parallel algebraic recursive multilevel solvers and multilevel graph-based methods for data exploration. Saad has a distinguished academic background with two Ph.D. degrees from the University of Grenoble, France, and a B.S. in Mathematics from the University of Algiers. Prior to his current position, he served as a senior computer scientist and associate professor at the University of Illinois at Urbana-Champaign and as a senior scientist at the Research Institute for Advanced Computer Science. Saad has received numerous awards, including the 2023 SIAM John von Neumann Prize, and has been recognized as a Fellow of the AAAS and SIAM. His contributions to the field are reflected in his extensive publication record and his leadership in advancing large-scale computing and numerical methods.
Research topics
- Mathematical analysis
- Artificial Intelligence
- Mathematics
- Computer Science
- Applied mathematics
- Geometry
- Classical mechanics
- Theoretical computer science
- Mathematical optimization
- Algorithm
Selected publications
Eigenvector-based acceleration strategies for gradient-type methods
ArXiv.org · 2026-01-16
article · Open access · Senior author
Several strategies are described and analyzed to speed-up gradient-type methods when applied to the minimization of strictly convex quadratics and strictly convex functions. The proposed techniques focus on relaxing the traditional optimal step length associated with gradient methods, including the steepest descent (SD) and the minimal residual (MR) methods. Such a relaxation avoids the well-known negative zigzag effect and allows the iterates to move in the entire space which in turn implies that every so often the search direction approaches some eigenvector of the underlying Hessian matrix. The proposed speedups then rely on taking advantage of the properties of the Lanczos method once a search direction that approaches an eigenvector has been identified in order to accelerate the convergence towards the global minimizer. After analyzing the proposed strategies, we illustrate them on the global minimization of strictly convex functions.
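To make the idea of a relaxed step length concrete, here is a minimal sketch of steepest descent on a strictly convex quadratic. It is not the paper's method: the relaxation factor `theta`, the helper names, and the 2×2 test problem are all assumptions chosen for illustration, and no Lanczos-based acceleration is attempted. Setting `theta=1.0` gives the classical optimal step; any value in (0, 2) still decreases the objective at every step.

```python
def matvec(A, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def relaxed_sd(A, b, x0, theta=1.0, tol=1e-10, max_iter=1000):
    """Minimize 0.5*x'Ax - b'x for SPD A using the step theta*(r'r)/(r'Ar).

    `theta` is a hypothetical relaxation factor; theta = 1 recovers the
    classical steepest descent step length.
    """
    x = list(x0)
    for k in range(max_iter):
        # The residual b - Ax is the negative gradient of the quadratic.
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        rr = dot(r, r)
        if rr ** 0.5 <= tol:
            return x, k
        alpha = theta * rr / dot(r, matvec(A, r))
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
    return x, max_iter


# Assumed test problem: A = diag(1, 10), b = (1, 1), minimizer (1, 0.1).
A = [[1.0, 0.0], [0.0, 10.0]]
b = [1.0, 1.0]
x_opt, its_opt = relaxed_sd(A, b, [0.0, 0.0], theta=1.0)
x_rel, its_rel = relaxed_sd(A, b, [0.0, 0.0], theta=0.9)
```

Both calls converge to the same minimizer; comparing `its_opt` and `its_rel` for different values of `theta` is one simple way to observe how the step length choice affects the iteration path.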
Eigenvector-based acceleration strategies for gradient-type methods
HAL (Le Centre pour la Communication Scientifique Directe) · 2026-01-16
preprint · Open access · Senior author
Acceleration methods for fixed-point iterations
Acta Numerica · 2025-07-01 · 3 citations
article · Open access · 1st author · Corresponding
A pervasive approach in scientific computing is to express the solution to a given problem as the limit of a sequence of vectors or other mathematical objects. In many situations these sequences are generated by slowly converging iterative procedures, and this led practitioners to seek faster alternatives to reach the limit. ‘Acceleration techniques’ comprise a broad array of methods specifically designed with this goal in mind. They started as a means of improving the convergence of general scalar sequences by various forms of ‘extrapolation to the limit’, i.e. by extrapolating the most recent iterates to the limit via linear combinations. Extrapolation methods of this type, the best-known of which is Aitken’s delta-squared process, require only the sequence of vectors as input. However, limiting methods to use only the iterates is too restrictive. Accelerating sequences generated by fixed-point iterations by utilizing both the iterates and the fixed-point mapping itself has proved highly successful across various areas of physics. A notable example of these fixed-point accelerators (FP-accelerators) is a method developed by Donald Anderson in 1965 and now widely known as Anderson acceleration (AA). Furthermore, quasi-Newton and inexact Newton methods can also be placed in this category since they can be invoked to find limits of fixed-point iteration sequences by employing exactly the same ingredients as those of the FP-accelerators. This paper presents an overview of these methods – with an emphasis on those, such as AA, that are geared toward accelerating fixed-point iterations. We will navigate through existing variants of accelerators, their implementations and their applications, to unravel the close connections between them. These connections were often not recognized by the originators of certain methods, who sometimes stumbled on slight variations of already established ideas.
Furthermore, even though new accelerators were invented in different corners of science, the underlying principles behind them are strikingly similar or identical. The plan of this article will approximately follow the historical trajectory of extrapolation and acceleration methods, beginning with a brief description of extrapolation ideas, followed by the special case of linear systems, the application to self-consistent field (SCF) iterations, and a detailed view of Anderson acceleration. The last part of the paper is concerned with more recent developments, including theoretical aspects, and a few thoughts on accelerating machine learning algorithms.
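As a small illustration of the "extrapolation to the limit" idea mentioned in the abstract, the sketch below applies Aitken's delta-squared formula to a scalar fixed-point sequence. It only uses the iterates as input, so it is not Anderson acceleration; the test mapping `cos`, the starting point, and the function names are assumptions chosen for illustration.

```python
import math


def fixed_point_iterates(g, x0, n):
    """Return [x0, g(x0), g(g(x0)), ...] with n + 1 entries."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs


def aitken(xs):
    """Apply Aitken's delta-squared formula to each consecutive triple."""
    out = []
    for x0, x1, x2 in zip(xs, xs[1:], xs[2:]):
        denom = x2 - 2.0 * x1 + x0
        # Fall back to the latest iterate if the second difference vanishes.
        out.append(x2 if denom == 0.0 else x0 - (x1 - x0) ** 2 / denom)
    return out


# Assumed example: x = cos(x), whose fixed point is about 0.7390851332.
xs = fixed_point_iterates(math.cos, 1.0, 10)
acc = aitken(xs)
fixed_point = 0.7390851332151607
plain_error = abs(xs[-1] - fixed_point)
aitken_error = abs(acc[-1] - fixed_point)
```

On this example the extrapolated value is closer to the fixed point than the last plain iterate, even though both are built from the same ten applications of the mapping.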
Designing Preconditioners for SGD: Local Conditioning, Noise Floors, and Basin Stability
ArXiv.org · 2025-11-24
preprint · Open access
Stochastic Gradient Descent (SGD) often slows in the late stage of training due to anisotropic curvature and gradient noise. We analyze preconditioned SGD in the geometry induced by a symmetric positive definite matrix $\mathbf{M}$, deriving bounds in which both the convergence rate and the stochastic noise floor are governed by $\mathbf{M}$-dependent quantities: the rate through an effective condition number in the $\mathbf{M}$-metric, and the floor through the product of that condition number and the preconditioned noise level. For nonconvex objectives, we establish a preconditioner-dependent basin-stability guarantee: when smoothness and basin size are measured in the $\mathbf{M}$-norm, the probability that the iterates remain in a well-behaved local region admits an explicit lower bound. This perspective is particularly relevant in Scientific Machine Learning (SciML), where achieving small training loss under stochastic updates is closely tied to physical fidelity, numerical stability, and constraint satisfaction. The framework applies to both diagonal/adaptive and curvature-aware preconditioners and yields a simple design principle: choose $\mathbf{M}$ to improve local conditioning while attenuating noise. Experiments on a quadratic diagnostic and three SciML benchmarks validate the predicted rate-floor behavior.
Acceleration methods for fixed point iterations
ArXiv.org · 2025-07-15
preprint · Open access · 1st author · Corresponding
A pervasive approach in scientific computing is to express the solution to a given problem as the limit of a sequence of vectors or other mathematical objects. In many situations these sequences are generated by slowly converging iterative procedures and this led practitioners to seek faster alternatives to reach the limit. “Acceleration techniques” comprise a broad array of methods specifically designed with this goal in mind. They started as a means of improving the convergence of general scalar sequences by various forms of “extrapolation to the limit”, i.e., by extrapolating the most recent iterates to the limit via linear combinations. Extrapolation methods of this type, the best known example of which is Aitken's Delta-squared process, require only the sequence of vectors as input. However, limiting methods to only use the iterates is too restrictive. Accelerating sequences generated by fixed-point iterations by utilizing both the iterates and the fixed-point mapping itself has proven highly successful across various areas of physics. A notable example of these Fixed-Point accelerators (FP-Accelerators) is a method developed by D. Anderson in 1965 and now widely known as Anderson Acceleration (AA). Furthermore, Quasi-Newton and Inexact Newton methods can also be placed in this category. This paper presents an overview of these methods – with an emphasis on those, such as AA, that are geared toward accelerating fixed point iterations.
Coffee waste utilization as an eco-friendly disposal for pollutants removal from wastewater
Journal of Applied Research in Science and Humanities (مجلة البحوث التطبيقية في العلوم والإنسانيات) · 2025-07-01
article · Open access
The discharge of wastewater containing synthetic dyes from industries, particularly the textile sector, poses significant environmental and health challenges. Waste coffee grounds were examined in this work as a low-cost, environmentally friendly, and sustainable reducing agent for the removal of dyes from aqueous solutions. Coffee waste was collected and pretreated before use. Two treatment methods were used: simple treatment and ultrasonic treatment. The waste coffee samples were characterized using X-ray diffraction (XRD) and Fourier transform infrared spectroscopy (FTIR). The catalytic activity of the samples was evaluated by the reduction of Methylene blue (MB) and Remazol red (RR) dyes. According to the results, waste coffee showed excellent removal efficiency for both MB and RR dyes. The catalytic activity of waste coffee materials is improved by ultrasonic treatment, which raises active-site dispersion, decreases particle size, and improves surface characteristics. The potential of using spent coffee as a sustainable dye remediation method in wastewater treatment systems is highlighted by this study.
Straggler-Tolerant Stationary Methods for Linear Systems
SIAM Journal on Scientific Computing · 2025-01-17 · 2 citations
article · Senior author
Mixed Precision Orthogonalization-Free Projection Methods for Eigenvalue and Singular Value Problems
ArXiv.org · 2025-05-01
preprint · Open access
Mixed-precision arithmetic offers significant computational advantages for large-scale matrix computation tasks, yet preserving accuracy and stability in eigenvalue problems and the singular value decomposition (SVD) remains challenging. This paper introduces an approach that eliminates orthogonalization requirements in traditional Rayleigh-Ritz projection methods. The proposed method employs non-orthogonal bases computed at reduced precision, resulting in bases computed without inner-products. A primary focus is on maintaining the linear independence of the basis vectors. Through extensive evaluation with both synthetic test cases and real-world applications, we demonstrate that the proposed approach achieves the desired accuracy while fully taking advantage of mixed-precision arithmetic.
Deep learning, transformers and graph neural networks: a linear algebra perspective
Numerical Algorithms · 2025-10-16 · 2 citations
article · Open access · Senior author · Corresponding
In an age where Artificial Intelligence (AI) is being integrated into nearly every domain of science and engineering, it has become essential for experts in Numerical Linear Algebra to explore the foundational elements of deep learning and identify ways to contribute to its development. What’s particularly exciting is that Numerical Linear Algebra (NLA) lies at the heart of Machine Learning and more broadly AI. All AI techniques fundamentally rely on four core components: data, optimization methods, statistical intuition, and linear algebra. The initial phase of any neural network model involves transforming the problem into one that can be tackled using numerical methods, particularly through optimization techniques. Thus, in Large Language Models (LLMs) this first step involves mapping words or subwords into tokens, which are then embedded into Euclidean spaces. From that point, LLMs rely heavily on vectors, matrices, and tensors. The aim of this article is to outline the essential components of deep learning methods from a linear algebra perspective. It will cover deep neural networks, multilayer perceptrons, and the concept of “attention,” which plays a crucial role in large language models as well as other machine learning applications. A significant portion of the discussion will focus on methods that leverage graphs in neural networks, such as Graph Convolutional Networks. The paper will conclude with reflections on the future role of numerical linear algebra in the age of AI.
Recent grants
Numerical Linear Algebra and Approximation Theory Methods for Efficient Data Exploration
NSF · $272k · 2005–2009
Advances in Robust Multilevel Preconditioning Methods for Sparse Linear Systems
NSF · $300k · 2019–2023
NSF · $346k · 2009–2013
Advances in Robust Multilevel Preconditioning Methods for Sparse Linear Systems
NSF · $266k · 2015–2019
Multilevel Graph-Based Methods for Efficient Data Exploration
NSF · $244k · 2020–2023
Frequent coauthors
- James R. Chelikowsky · The University of Texas at Austin · 50 shared
- Yuanzhe Xi · Emory University · 39 shared
- Ruipeng Li · 27 shared
- Shashanka Ubaru · 25 shared
- Efstratios Gallopoulos · University of Patras · 21 shared
- Kesheng Wu · 20 shared
- Pascal Hénon · 19 shared
- Martin H. Schultz · 17 shared
Awards & honors
- SIAM John von Neumann Prize (2023)
- William Norris Land Grant Chair in Large-Scale Computing (20…
- American Association for the Advancement of Science (AAAS) F…
- SIAM Fellows Program (2010)
- CSE Distinguished Professor (2005)