Yousef Saad

Verified

University of Minnesota · Computer Science and Engineering

Active 1974–2026

h-index: 81

Citations: 51.4k

Papers: 510 (40 in the last 5 years)

Funding: $3.2M

About

Yousef Saad is a Professor in the Department of Computer Science & Engineering at the University of Minnesota, where he has been a faculty member since 1990. He holds the title of CSE Distinguished Professor and the William Norris Land Grant Chair in Large-Scale Computing. Saad's research focuses on numerical linear algebra, sparse matrix computations, iterative methods for linear systems and eigenvalue problems, parallel algorithms in numerical linear algebra, and matrix methods for machine learning. His work has contributed significantly to the development of algorithms and methods in these areas, with recent research including parallel algebraic recursive multilevel solvers and multilevel graph-based methods for data exploration. Saad has a distinguished academic background with two Ph.D. degrees from the University of Grenoble, France, and a B.S. in Mathematics from the University of Algiers. Prior to his current position, he served as a senior computer scientist and associate professor at the University of Illinois at Urbana-Champaign and as a senior scientist at the Research Institute for Advanced Computer Science. Saad has received numerous awards, including the 2023 SIAM John von Neumann Prize, and has been recognized as a Fellow of the AAAS and SIAM. His contributions to the field are reflected in his extensive publication record and his leadership in advancing large-scale computing and numerical methods.

Research topics

Mathematical analysis
Artificial Intelligence
Mathematics
Computer Science
Applied mathematics
Geometry
Classical mechanics
Theoretical computer science
Mathematical optimization
Algorithm

Selected publications

Eigenvector-based acceleration strategies for gradient-type methods
ArXiv.org · 2026-01-16
article · Open access · Senior author
Several strategies are described and analyzed to speed up gradient-type methods when applied to the minimization of strictly convex quadratics and strictly convex functions. The proposed techniques focus on relaxing the traditional optimal step length associated with gradient methods, including the steepest descent (SD) and the minimal residual (MR) methods. Such a relaxation avoids the well-known negative zigzag effect and allows the iterates to move in the entire space, which in turn implies that every so often the search direction approaches some eigenvector of the underlying Hessian matrix. The proposed speedups then rely on taking advantage of the properties of the Lanczos method once a search direction that approaches an eigenvector has been identified, in order to accelerate the convergence towards the global minimizer. After analyzing the proposed strategies, we illustrate them on the global minimization of strictly convex functions.
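To make the idea of a relaxed step length concrete, here is a minimal sketch of steepest descent on a small strictly convex quadratic, where the exact (optimal) step is scaled by a fixed factor. This is a generic textbook-style illustration, not the authors' algorithm: the function name `relaxed_sd`, the fixed relaxation factor `theta`, and the 2x2 test problem are all assumptions chosen for illustration, and the Lanczos-based acceleration described in the abstract is not included.

```python
def matvec(A, x):
    """Dense matrix-vector product for small list-of-lists matrices."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def quadratic(A, b, x):
    """f(x) = 0.5 * x^T A x - b^T x, with A symmetric positive definite."""
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x)

def relaxed_sd(A, b, x0, theta=0.8, iters=100):
    """Steepest descent with the exact step length scaled by theta.

    theta = 1 gives classical steepest descent; any fixed theta in (0, 2)
    still decreases f at every step (hypothetical choice for this sketch).
    """
    x = list(x0)
    history = [quadratic(A, b, x)]
    for _ in range(iters):
        g = [ai - bi for ai, bi in zip(matvec(A, x), b)]  # gradient A x - b
        gAg = dot(g, matvec(A, g))
        if gAg == 0.0:  # gradient vanished: already at the minimizer
            break
        alpha = dot(g, g) / gAg  # exact line-search step for a quadratic
        x = [xi - theta * alpha * gi for xi, gi in zip(x, g)]
        history.append(quadratic(A, b, x))
    return x, history

# Illustrative problem (assumed): condition number 10, minimizer (1, 0.1).
A = [[1.0, 0.0], [0.0, 10.0]]
b = [1.0, 1.0]
x, history = relaxed_sd(A, b, [0.0, 0.0])
```

For a quadratic, scaling the exact step by `theta` reduces the objective by a factor proportional to `theta * (2 - theta)` of the exact-step reduction, so the iteration stays monotone while no longer being locked into the alternating directions that cause zigzagging.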
Eigenvector-based acceleration strategies for gradient-type methods
arXiv (Cornell University) · 2026-01-16
preprint · Open access · Senior author
Eigenvector-based acceleration strategies for gradient-type methods
HAL (Le Centre pour la Communication Scientifique Directe) · 2026-01-16
preprint · Open access · Senior author
Acceleration methods for fixed-point iterations
Acta Numerica · 2025-07-01 · 3 citations
article · Open access · 1st author · Corresponding
A pervasive approach in scientific computing is to express the solution to a given problem as the limit of a sequence of vectors or other mathematical objects. In many situations these sequences are generated by slowly converging iterative procedures, and this led practitioners to seek faster alternatives to reach the limit. ‘Acceleration techniques’ comprise a broad array of methods specifically designed with this goal in mind. They started as a means of improving the convergence of general scalar sequences by various forms of ‘extrapolation to the limit’, i.e. by extrapolating the most recent iterates to the limit via linear combinations. Extrapolation methods of this type, the best-known of which is Aitken’s delta-squared process, require only the sequence of vectors as input. However, limiting methods to use only the iterates is too restrictive. Accelerating sequences generated by fixed-point iterations by utilizing both the iterates and the fixed-point mapping itself has proved highly successful across various areas of physics. A notable example of these fixed-point accelerators (FP-accelerators) is a method developed by Donald Anderson in 1965 and now widely known as Anderson acceleration (AA). Furthermore, quasi-Newton and inexact Newton methods can also be placed in this category since they can be invoked to find limits of fixed-point iteration sequences by employing exactly the same ingredients as those of the FP-accelerators. This paper presents an overview of these methods – with an emphasis on those, such as AA, that are geared toward accelerating fixed-point iterations. We will navigate through existing variants of accelerators, their implementations and their applications, to unravel the close connections between them. These connections were often not recognized by the originators of certain methods, who sometimes stumbled on slight variations of already established ideas. 
Furthermore, even though new accelerators were invented in different corners of science, the underlying principles behind them are strikingly similar or identical. The plan of this article will approximately follow the historical trajectory of extrapolation and acceleration methods, beginning with a brief description of extrapolation ideas, followed by the special case of linear systems, the application to self-consistent field (SCF) iterations, and a detailed view of Anderson acceleration. The last part of the paper is concerned with more recent developments, including theoretical aspects, and a few thoughts on accelerating machine learning algorithms.
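As a small illustration of "extrapolation to the limit", the sketch below applies Aitken's delta-squared process, which the abstract names as the best-known method of this type, to a slowly converging scalar fixed-point sequence. The test iteration x = cos(x), the starting point, and the helper names `fixed_point_iterates` and `aitken` are assumptions chosen for illustration; this is not code from the paper and it does not implement Anderson acceleration.

```python
import math

def fixed_point_iterates(g, x0, n):
    """Return [x0, g(x0), g(g(x0)), ...] with n + 1 entries."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

def aitken(xs):
    """Apply Aitken's delta-squared formula to each consecutive triple.

    For a triple (x0, x1, x2) the extrapolated value is
    x0 - (x1 - x0)^2 / (x2 - 2*x1 + x0).
    """
    out = []
    for x0, x1, x2 in zip(xs, xs[1:], xs[2:]):
        denom = x2 - 2.0 * x1 + x0
        if denom == 0.0:  # sequence has stalled; keep the latest iterate
            out.append(x2)
        else:
            out.append(x0 - (x1 - x0) ** 2 / denom)
    return out

# Illustrative problem (assumed): the fixed point of cos, about 0.7390851332.
x_star = 0.7390851332151607
xs = fixed_point_iterates(math.cos, 1.0, 10)
acc = aitken(xs)

plain_error = abs(xs[-1] - x_star)
aitken_error = abs(acc[-1] - x_star)
```

The extrapolation uses only the iterates themselves, which is exactly the restriction the abstract describes as too limiting; fixed-point accelerators such as Anderson acceleration additionally use the mapping `g`.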
Designing Preconditioners for SGD: Local Conditioning, Noise Floors, and Basin Stability
ArXiv.org · 2025-11-24
preprint · Open access
Stochastic Gradient Descent (SGD) often slows in the late stage of training due to anisotropic curvature and gradient noise. We analyze preconditioned SGD in the geometry induced by a symmetric positive definite matrix $\mathbf{M}$, deriving bounds in which both the convergence rate and the stochastic noise floor are governed by $\mathbf{M}$-dependent quantities: the rate through an effective condition number in the $\mathbf{M}$-metric, and the floor through the product of that condition number and the preconditioned noise level. For nonconvex objectives, we establish a preconditioner-dependent basin-stability guarantee: when smoothness and basin size are measured in the $\mathbf{M}$-norm, the probability that the iterates remain in a well-behaved local region admits an explicit lower bound. This perspective is particularly relevant in Scientific Machine Learning (SciML), where achieving small training loss under stochastic updates is closely tied to physical fidelity, numerical stability, and constraint satisfaction. The framework applies to both diagonal/adaptive and curvature-aware preconditioners and yields a simple design principle: choose $\mathbf{M}$ to improve local conditioning while attenuating noise. Experiments on a quadratic diagnostic and three SciML benchmarks validate the predicted rate-floor behavior.
Acceleration methods for fixed point iterations
ArXiv.org · 2025-07-15
preprint · Open access · 1st author · Corresponding
A pervasive approach in scientific computing is to express the solution to a given problem as the limit of a sequence of vectors or other mathematical objects. In many situations these sequences are generated by slowly converging iterative procedures, and this led practitioners to seek faster alternatives to reach the limit. "Acceleration techniques" comprise a broad array of methods specifically designed with this goal in mind. They started as a means of improving the convergence of general scalar sequences by various forms of "extrapolation to the limit", i.e., by extrapolating the most recent iterates to the limit via linear combinations. Extrapolation methods of this type, the best-known example of which is Aitken's delta-squared process, require only the sequence of vectors as input. However, limiting methods to use only the iterates is too restrictive. Accelerating sequences generated by fixed-point iterations by utilizing both the iterates and the fixed-point mapping itself has proven highly successful across various areas of physics. A notable example of these fixed-point accelerators (FP-accelerators) is a method developed by D. Anderson in 1965 and now widely known as Anderson acceleration (AA). Furthermore, quasi-Newton and inexact Newton methods can also be placed in this category. This paper presents an overview of these methods, with an emphasis on those, such as AA, that are geared toward accelerating fixed-point iterations.
Coffee waste utilization as an eco-friendly disposal for pollutants removal from wastewater
Journal of Applied Research in Science and Humanities (مجلة البحوث التطبيقية في العلوم والإنسانيات) · 2025-07-01
article · Open access
The discharge of wastewater containing synthetic dyes from industries, particularly the textile sector, poses significant environmental and health challenges. Waste coffee grounds were examined in this work as a low-cost, environmentally friendly, and sustainable reducing agent for the removal of dyes from aqueous solutions. Coffee waste was collected and pretreated before use. Two treatment methods were used: simple treatment and ultrasonic treatment. The waste coffee samples were characterized using X-ray diffraction (XRD) and Fourier transform infrared spectroscopy (FTIR). The samples' catalytic activities were evaluated by the reduction of Methylene blue (MB) and Remazol red (RR) dyes. According to the results, waste coffee showed excellent removal efficiency for both MB and RR dyes. The catalytic activity of waste coffee materials is improved by ultrasonic treatment, which raises active site dispersion, decreases particle size, and improves surface characteristics. The potential of using spent coffee as a sustainable dye remediation method in wastewater treatment systems is highlighted by this study.
Straggler-Tolerant Stationary Methods for Linear Systems
SIAM Journal on Scientific Computing · 2025-01-17 · 2 citations
article · Senior author
Mixed Precision Orthogonalization-Free Projection Methods for Eigenvalue and Singular Value Problems
ArXiv.org · 2025-05-01
preprint · Open access
Mixed-precision arithmetic offers significant computational advantages for large-scale matrix computation tasks, yet preserving accuracy and stability in eigenvalue problems and the singular value decomposition (SVD) remains challenging. This paper introduces an approach that eliminates orthogonalization requirements in traditional Rayleigh-Ritz projection methods. The proposed method employs non-orthogonal bases computed at reduced precision, resulting in bases computed without inner-products. A primary focus is on maintaining the linear independence of the basis vectors. Through extensive evaluation with both synthetic test cases and real-world applications, we demonstrate that the proposed approach achieves the desired accuracy while fully taking advantage of mixed-precision arithmetic.
Deep learning, transformers and graph neural networks: a linear algebra perspective
Numerical Algorithms · 2025-10-16 · 2 citations
article · Open access · Senior author · Corresponding
In an age where Artificial Intelligence (AI) is being integrated into nearly every domain of science and engineering, it has become essential for experts in Numerical Linear Algebra to explore the foundational elements of deep learning and identify ways to contribute to its development. What’s particularly exciting is that Numerical Linear Algebra (NLA) lies at the heart of Machine Learning and more broadly AI. All AI techniques fundamentally rely on four core components: data, optimization methods, statistical intuition, and linear algebra. The initial phase of any neural network model involves transforming the problem into one that can be tackled using numerical methods, particularly through optimization techniques. Thus, in Large Language Models (LLMs) this first step involves mapping words or subwords into tokens, which are then embedded into Euclidean spaces. From that point, LLMs rely heavily on vectors, matrices, and tensors. The aim of this article is to outline the essential components of deep learning methods from a linear algebra perspective. It will cover deep neural networks, multilayer perceptrons, and the concept of “attention,” which plays a crucial role in large language models as well as other machine learning applications. A significant portion of the discussion will focus on methods that leverage graphs in neural networks, such as Graph Convolutional Networks. The paper will conclude with reflections on the future role of numerical linear algebra in the age of AI.

Recent grants

Numerical Linear Algebra and Approximation Theory Methods for Efficient Data Exploration
NSF · $272k · 2005–2009
Advances in Robust Multilevel Preconditioning Methods for Sparse Linear Systems
NSF · $300k · 2019–2023
CDI Type I: Collaborative research: Materials Informatics: Computational tools for discovery and design
NSF · $346k · 2009–2013
Advances in Robust Multilevel Preconditioning Methods for Sparse Linear Systems
NSF · $266k · 2015–2019
Multilevel Graph-Based Methods for Efficient Data Exploration
NSF · $244k · 2020–2023

Frequent coauthors

James R. Chelikowsky
The University of Texas at Austin
50 shared
Yuanzhe Xi
Emory University
39 shared
Ruipeng Li
27 shared
Shashanka Ubaru
25 shared
Efstratios Gallopoulos
University of Patras
21 shared
Kesheng Wu
20 shared
Pascal Hénon
19 shared
Martin H. Schultz
17 shared

Labs

Department of Computer Science & Engineering · PI

Awards & honors

SIAM John von Neumann Prize (2023)
William Norris Land Grant Chair in Large-Scale Computing (20…
American Association for the Advancement of Science (AAAS) F…
SIAM Fellows Program (2010)
CSE Distinguished Professor (2005)
