Zhaosong Lu

Verified

University of Minnesota · Industrial and Systems Engineering

Active 2004–2026

h-index: 31
Citations: 3.6k
Papers: 131 (34 in the last 5 years)
Funding: $200k (1 active)
About

Zhaosong Lu is a professor whose research interests include theory and algorithms for continuous optimization, with applications in data analytics, machine learning, statistics, and image processing. He has published extensively in major journals such as SIAM Journal on Optimization, Mathematical Programming, and Mathematics of Operations Research. His research has been funded by the NSF. Dr. Lu holds a Ph.D. in Operations Research from the School of Industrial and Systems Engineering at Georgia Institute of Technology.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Mathematics
  • Pure mathematics
  • Mathematical optimization
  • Applied mathematics
  • Algorithm
  • Combinatorics

Selected publications

  • Variance-Reduced First-Order Methods for Deterministically Constrained Stochastic Nonconvex Optimization with Strong Convergence Guarantees

    SIAM Journal on Optimization · 2026-01-02

    article · 1st author · Corresponding

    In this paper, we study a class of deterministically constrained stochastic nonconvex optimization problems. Existing methods typically aim to find an ϵ-expectedly feasible stochastic stationary point, where the expected violations of both constraints and first-order stationarity are within a prescribed tolerance ϵ. However, in many practical applications, it is crucial that the constraints be nearly satisfied with certainty, making such an ϵ-stochastic stationary point potentially undesirable due to the risk of substantial constraint violations. To address this issue, we propose single-loop variance-reduced stochastic first-order methods, where the stochastic gradient of the stochastic component is computed using either a truncated recursive momentum scheme or a truncated Polyak momentum scheme for variance reduction, while the gradient of the deterministic component is computed exactly. Under the error bound condition with a parameter θ ≥ 1 and other suitable assumptions, we establish that these methods respectively achieve sample complexity and first-order oracle complexity of (formula presented) for finding an ϵ-surely feasible stochastic stationary point ((formula presented) with logarithmic factors hidden), where the constraint violation is within ϵ with certainty, and the expected violation of first-order stationarity is within ϵ. For θ = 1, these complexities reduce to (formula presented), respectively, which match, up to a logarithmic factor, the best-known complexities achieved by existing methods for finding an ϵ-stochastic stationary point of unconstrained smooth stochastic nonconvex optimization problems.

  • Solving Bilevel Optimization via Sequential Minimax Optimization

    Mathematics of Operations Research · 2026-01-06

    article · 1st author · Corresponding

    In this paper, we propose a sequential minimax optimization (SMO) method for solving a class of constrained bilevel optimization problems in which the lower level part is a possibly nonsmooth convex optimization problem, whereas the upper level part is a possibly nonconvex optimization problem. Specifically, SMO applies a first order method to solve a sequence of minimax subproblems, which are obtained by employing a hybrid of modified augmented Lagrangian and penalty schemes on the bilevel optimization problems. Under suitable assumptions, we establish an operation complexity of $O(\varepsilon^{-7}\log\varepsilon^{-1})$ and $O(\varepsilon^{-6}\log\varepsilon^{-1})$, measured in terms of fundamental operations, for SMO in finding an $\varepsilon$-Karush–Kuhn–Tucker solution of the bilevel optimization problems with merely convex and strongly convex lower level objective functions, respectively. The latter result improves the previous best known operation complexity by a factor of $\varepsilon^{-1}$. Preliminary numerical results demonstrate significantly superior computational performance compared with the recently developed first order penalty method. Funding: This work was partially supported by the Air Force Office of Scientific Research Award [FA9550-24-1-0343], the Office of Naval Research Award [N00014-24-1-2702], and the National Science Foundation Awards [2211491 and 2435911]. It was primarily conducted during Sanyou Mei's PhD studies at the University of Minnesota.

  • A first-order method for nonconvex-strongly-concave constrained minimax optimization

    Optimization Methods & Software · 2026-01-02

    article · 1st author

    In this paper we study a nonconvex-strongly-concave constrained minimax problem. Specifically, we propose a first-order augmented Lagrangian method for solving it, whose subproblems are nonconvex-strongly-concave unconstrained minimax problems and suitably solved by a first-order method developed in this paper that leverages the strong concavity structure. Under suitable assumptions, the proposed method achieves an operation complexity of $O(\varepsilon^{-3.5}\log\varepsilon^{-1})$, measured in terms of its fundamental operations, for finding an ε-KKT solution of the constrained minimax problem, which improves the previous best-known operation complexity by a factor of $\varepsilon^{-0.5}$.

  • Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise

    arXiv (Cornell University) · 2025-06-12

    preprint · Open access

    In this paper, we propose practical normalized stochastic first-order methods with Polyak momentum, multi-extrapolated momentum, and recursive momentum for solving unconstrained optimization problems. These methods employ dynamically updated algorithmic parameters and do not require explicit knowledge of problem-dependent quantities such as the Lipschitz constant or noise bound. We establish first-order oracle complexity results for finding approximate stochastic stationary points under heavy-tailed noise and weakly average smoothness conditions -- both of which are weaker than the commonly used bounded variance and mean-squared smoothness assumptions. Our complexity bounds either improve upon or match the best-known results in the literature. Numerical experiments are presented to demonstrate the practical effectiveness of the proposed methods.

  • Solving bilevel optimization via sequential minimax optimization

    ArXiv.org · 2025-11-10

    preprint · Open access · 1st author · Corresponding

    In this paper we propose a sequential minimax optimization (SMO) method for solving a class of constrained bilevel optimization problems in which the lower-level part is a possibly nonsmooth convex optimization problem, while the upper-level part is a possibly nonconvex optimization problem. Specifically, SMO applies a first-order method to solve a sequence of minimax subproblems, which are obtained by employing a hybrid of modified augmented Lagrangian and penalty schemes on the bilevel optimization problems. Under suitable assumptions, we establish an operation complexity of $O(\varepsilon^{-7}\log\varepsilon^{-1})$ and $O(\varepsilon^{-6}\log\varepsilon^{-1})$, measured in terms of fundamental operations, for SMO in finding an $\varepsilon$-KKT solution of the bilevel optimization problems with merely convex and strongly convex lower-level objective functions, respectively. The latter result improves the previous best-known operation complexity by a factor of $\varepsilon^{-1}$. Preliminary numerical results demonstrate significantly superior computational performance compared to the recently developed first-order penalty method.

  • A first-order method for constrained nonconvex-nonconcave minimax problems under a local Kurdyka-Łojasiewicz condition

    ArXiv.org · 2025-10-01

    preprint · Open access · 1st author · Corresponding

    We study a class of constrained nonconvex-nonconcave minimax problems in which the inner maximization involves potentially complex constraints. Under the assumption that the inner problem of a novel lifted minimax problem satisfies a local Kurdyka-Łojasiewicz (KL) condition, we show that the maximal function of the original problem enjoys a local Hölder smoothness property. We also propose a sequential convex programming (SCP) method for solving constrained optimization problems and establish its convergence rate under a local KL condition. Leveraging these results, we develop an inexact proximal gradient method for the original minimax problem, where the inexact gradient of the maximal function is computed via the SCP method applied to a locally KL-structured subproblem. Finally, we establish complexity guarantees for the proposed method in computing an approximate stationary point of the original minimax problem.

  • Nested Stochastic Algorithm for Generalized Sinkhorn distance-Regularized Distributionally Robust Optimization

    ArXiv.org · 2025-03-29

    preprint · Open access · Senior author

    Distributionally robust optimization (DRO) is a powerful technique to train robust models against data distribution shift. This paper aims to solve regularized nonconvex DRO problems, where the uncertainty set is modeled by a so-called generalized Sinkhorn distance and the loss function is nonconvex and possibly unbounded. Such a distance allows one to model uncertainty of distributions with different probability supports and divergence functions. For this class of regularized DRO problems, we derive a novel dual formulation taking the form of nested stochastic optimization, where the dual variable depends on the data sample. To solve the dual problem, we provide theoretical evidence to design a nested stochastic gradient descent (SGD) algorithm, which leverages stochastic approximation to estimate the nested stochastic gradients. We study the convergence rate of nested SGD and establish polynomial iteration and sample complexities that are independent of the data size and parameter dimension, indicating its potential for solving large-scale DRO problems. We conduct numerical experiments to demonstrate the efficiency and robustness of the proposed algorithm.

  • Newton-CG Methods for Nonconvex Unconstrained Optimization with Hölder Continuous Hessian

    Mathematics of Operations Research · 2025-05-19

    article · Open access · Senior author

    In this paper, we consider a nonconvex unconstrained optimization problem minimizing a twice differentiable objective function with Hölder continuous Hessian. Specifically, we first propose a Newton-conjugate gradient (Newton-CG) method for finding an approximate first- and second-order stationary point of this problem, assuming the associated Hölder parameters are explicitly known. Then, we develop a parameter-free Newton-CG method without requiring any prior knowledge of these parameters. To the best of our knowledge, this method is the first parameter-free second-order method achieving the best-known iteration and operation complexity for finding an approximate first- and second-order stationary point of this problem. Finally, we present preliminary numerical results to demonstrate the superior practical performance of our parameter-free Newton-CG method over a well-known regularized Newton method. Funding: C. He was partially financially supported by the Wallenberg AI, Autonomous Systems and Software Program funded by the Knut and Alice Wallenberg Foundation. H. Huang was partially financially supported by the National Science Foundation [Award IIS-2347592]. Z. Lu was partially financially supported by the National Science Foundation [Award IIS-2211491], the Office of Naval Research [Award N00014-24-1-2702], and the Air Force Office of Scientific Research [Award FA9550-24-1-0343].

  • A first-order method for nonconvex-nonconcave minimax problems under a local Kurdyka-Lojasiewicz condition

    ArXiv.org · 2025-07-02

    preprint · Open access · 1st author · Corresponding

    We study a class of nonconvex-nonconcave minimax problems in which the inner maximization problem satisfies a local Kurdyka-Lojasiewicz (KL) condition that may vary with the outer minimization variable. In contrast to the global KL or Polyak-Lojasiewicz (PL) conditions commonly assumed in the literature -- which are significantly stronger and often too restrictive in practice -- this local KL condition accommodates a broader range of practical scenarios. However, it also introduces new analytical challenges. In particular, as an optimization algorithm progresses toward a stationary point of the problem, the region over which the KL condition holds may shrink, resulting in a more intricate and potentially ill-conditioned landscape. To address this challenge, we show that the associated maximal function is locally generalized Hölder smooth. Leveraging this key property, we develop an inexact proximal gradient method for solving the minimax problem, where the inexact gradient of the maximal function is computed by applying a proximal gradient method to a KL-structured subproblem. Under mild assumptions, we establish complexity guarantees for computing an approximate stationary point of the minimax problem.

  • A first-order method for nonconvex-strongly-concave constrained minimax optimization

    arXiv (Cornell University) · 2025-12-28

    preprint · Open access · 1st author · Corresponding

    In this paper we study a nonconvex-strongly-concave constrained minimax problem. Specifically, we propose a first-order augmented Lagrangian method for solving it, whose subproblems are nonconvex-strongly-concave unconstrained minimax problems and suitably solved by a first-order method developed in this paper that leverages the strong concavity structure. Under suitable assumptions, the proposed method achieves an operation complexity of $O(\varepsilon^{-3.5}\log\varepsilon^{-1})$, measured in terms of its fundamental operations, for finding an $\varepsilon$-KKT solution of the constrained minimax problem, which improves the previous best-known operation complexity by a factor of $\varepsilon^{-0.5}$.

Frequent coauthors

  • Yuru Zou

    26 shared
  • Jian Lü

    25 shared
  • Huaxuan Hu

    Shenzhen University

    25 shared
  • Lin Li

    Zhengzhou University of Light Industry

    25 shared
  • Xiaoxia Liu

    Beibu Gulf University

    25 shared
  • Renato D. C. Monteiro

    15 shared
  • Ting Kei Pong

    14 shared
  • Jieping Ye

    9 shared