Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Shiqian Ma

Professor of Computational Applied Mathematics and Operations Research · Verified

Rice University · Computing and Mathematical Sciences

Active 2006–2026

h-index: 38
Citations: 5.4k
Papers: 202 (83 in the last 5 years)
Funding: $1.4M (2 active)
About

Shiqian Ma is a Professor of Computational Applied Mathematics and Operations Research at Rice University. His research areas include optimization and machine learning. He holds a Ph.D. from the Department of Industrial Engineering and Operations Research at Columbia University, obtained in 2011. Ma also earned a Master's degree from the Institute of Computational Mathematics and Scientific/Engineering Computing at the Chinese Academy of Sciences in 2006, and a Bachelor's degree from the School of Mathematical Sciences at Peking University in 2003. He is a member of the Ken Kennedy Institute and is involved in teaching operations research, optimization, and machine learning.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Mathematics
  • Mathematical optimization
  • Algorithm
  • Applied mathematics
  • Pure mathematics
  • Mathematical analysis
  • Combinatorics

Selected publications

  • Demystifying Manifold Constraints in LLM Pre-training

    ArXiv.org · 2026-05-06

    article · Open access · Senior author

    The empirical success of large language model (LLM) pre-training relies heavily on heuristic stabilization techniques, such as explicit normalization layers and weight decay. While recent constrained optimization approaches that explicitly restrict weights may improve numerical stability and performance, the mechanism and motivation for adding constraints remain elusive. This paper systematically demystifies the role of explicit manifold constraints in LLM pre-training. By introducing the Msign-Aligned Constrained Riemannian Optimizer (MACRO), a provably convergent, single-loop optimization framework, our study disentangles weight regularization heuristics from interacting mechanisms like RMS normalization and decoupled weight decay. Theoretical analyses and comprehensive empirical evaluations reveal that manifold constraints independently bound forward activation scales and enforce stable rotational equilibrium, thereby subsuming the roles of these heuristic mechanisms. Evaluations on large-scale LLM architectures demonstrate that MACRO achieves highly competitive performance while rigorously preserving the theoretical guarantees of exact Riemannian optimization.

  • Fast Sparse Nonnegative Matrix Factorization with Manifold Acceleration

    2026-04-21

    article · Senior author

    In this paper, we propose a fast sparse Nonnegative Matrix Factorization algorithm incorporating manifold identification techniques. Within an alternating update framework, it adaptively leverages the algorithm’s inherent manifold identification information to accelerate subproblem solutions, thereby enhancing computational efficiency. Numerical experiments demonstrate that our algorithm shows superior performance compared to existing methods, achieving better solutions with faster convergence rates, particularly under high sparsity requirements. We provide a global convergence guarantee for the algorithm. Regarding the locally linear convergence observed experimentally, under a set of assumptions, we develop a proof strategy for general cases. Furthermore, we furnish a complete proof for the vector case.

  • AutoBalance: An Automatic Balancing Framework for Training Physics-Informed Neural Networks

    ArXiv.org · 2025-10-08

    preprint · Open access · Senior author

    Physics-Informed Neural Networks (PINNs) provide a powerful and general framework for solving Partial Differential Equations (PDEs) by embedding physical laws into loss functions. However, training PINNs is notoriously difficult due to the need to balance multiple loss terms, such as PDE residuals and boundary conditions, which often have conflicting objectives and vastly different curvatures. Existing methods address this issue by manipulating gradients before optimization (a "pre-combine" strategy). We argue that this approach is fundamentally limited, as forcing a single optimizer to process gradients from spectrally heterogeneous loss landscapes disrupts its internal preconditioning. In this work, we introduce AutoBalance, a novel "post-combine" training paradigm. AutoBalance assigns an independent adaptive optimizer to each loss component and aggregates the resulting preconditioned updates afterwards. Extensive experiments on challenging PDE benchmarks show that AutoBalance consistently outperforms existing frameworks, achieving significant reductions in solution error, as measured by both the MSE and $L^{\infty}$ norms. Moreover, AutoBalance is orthogonal to and complementary with other popular PINN methodologies, amplifying their effectiveness on demanding benchmarks.

  • First-Order Federated Bilevel Learning

    Proceedings of the AAAI Conference on Artificial Intelligence · 2025-04-11

    article · Open access

    Federated bilevel optimization (FBO) has garnered significant attention lately, driven by its promising applications in meta-learning and hyperparameter optimization. Existing algorithms generally aim to approximate the gradient of the upper-level objective function (hypergradient) in the federated setting. However, because of the nonlinearity of the hypergradient and client drift, they often involve complicated computations. These computations, like multiple optimization sub-loops and second-order derivative evaluations, end up with significant memory consumption and high computational costs. In this paper, we propose a computationally and memory-efficient FBO algorithm named MemFBO. MemFBO features a fully single-loop structure with all involved variables updated simultaneously, and uses only first-order gradient information for all local updates. We show that MemFBO exhibits a linear convergence speedup with milder assumptions in both partial and full client participation scenarios. We further implement MemFBO in a novel FBO application for federated data cleaning. Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm.

  • Efficient OPF calculations for power system reliability assessment based on state similarity

    Applied Energy · 2025-11-24

    article · Open access

  • AdaBB: Adaptive Barzilai-Borwein Method for Convex Optimization

    Mathematics of Operations Research · 2025-03-31 · 2 citations

    article

    In this paper, we propose AdaBB, an adaptive gradient method based on the Barzilai-Borwein stepsize. The algorithm is line-search-free and parameter-free, and it essentially provides a convergent variant of the Barzilai-Borwein method for general convex optimization problems. We analyze the ergodic convergence of the objective function value and the convergence of the iterates for solving general convex optimization problems. Compared with existing works along this line of research, our algorithm gives the best lower bounds on the stepsize and the average of the stepsizes. Furthermore, we present extensions of the proposed algorithm for solving locally strongly convex and composite convex optimization problems where the objective function is the sum of a smooth function and a nonsmooth function. In the case of local strong convexity, we achieve linear convergence. Our numerical results also demonstrate very promising potential of the proposed algorithms on some representative examples. Funding: S. Ma is supported by the National Science Foundation [Grants DMS-2243650, CCF-2308597, CCF-2311275, and ECCS-2326591] and a startup fund from Rice University. J. Yang is supported by the National Natural Science Foundation of China [Grants 12431011 and 12371301] and the Natural Science Foundation for Distinguished Young Scholars of Gansu Province [Grant 22JR5RA223].

  • On the Convergence of Constrained Gradient Method

    ArXiv.org · 2025-11-21

    preprint · Open access

    The constrained gradient method (CGM) has recently been proposed to solve convex optimization and monotone variational inequality (VI) problems with general functional constraints. While existing literature has established convergence results for CGM, the assumptions employed therein are quite restrictive; in some cases, certain assumptions are mutually inconsistent, leading to gaps in the underlying analysis. This paper aims to derive rigorous and improved convergence guarantees for CGM under weaker and more reasonable assumptions, specifically in the context of strongly convex optimization and strongly monotone VI problems. Preliminary numerical experiments are provided to verify the validity of CGM and demonstrate its efficacy in addressing such problems.

  • Mirror Flow Matching with Heavy-Tailed Priors for Generative Modeling on Convex Domains

    ArXiv.org · 2025-10-10

    preprint · Open access · Senior author

    We study generative modeling on convex domains using flow matching and mirror maps, and identify two fundamental challenges. First, standard log-barrier mirror maps induce heavy-tailed dual distributions, leading to ill-posed dynamics. Second, coupling with Gaussian priors performs poorly when matching heavy-tailed targets. To address these issues, we propose Mirror Flow Matching based on a regularized mirror map that controls dual tail behavior and guarantees finite moments, together with coupling to a Student-$t$ prior that aligns with heavy-tailed targets and stabilizes training. We provide theoretical guarantees, including spatial Lipschitzness and temporal regularity of the velocity field, Wasserstein convergence rates for flow matching with Student-$t$ priors, and primal-space guarantees for constrained generation, under $\varepsilon$-accurate learned velocity fields. Empirically, our method outperforms baselines in synthetic convex-domain simulations and achieves competitive sample quality on real-world constrained generative tasks.

  • Relaxed Proximal Point Algorithm: Tight Complexity Bounds and Acceleration Without Momentum

    INFORMS Journal on Optimization · 2025-12-09

    article · Open access

    In this paper, we focus on the relaxed proximal point algorithm (RPPA) for solving convex (possibly nonsmooth) optimization problems. We conduct a comprehensive study on three types of relaxation schedules: (i) constant schedule with relaxation parameter [Formula: see text], (ii) a dynamic schedule put forward by Teboulle and Vaisbourd, and (iii) the silver step-size schedule proposed by Altschuler and Parrilo. The latter two schedules were initially investigated for the gradient descent (GD) method and are extended to the RPPA in this paper. For type (i), we establish tight nonergodic [Formula: see text] convergence rate results measured by function value residual and subgradient norm, where N denotes the iteration counter. For type (ii), we establish a convergence rate that is tight and approximately [Formula: see text] times better than the constant schedule of type (i). For type (iii), aside from the original silver step-size schedule proposed previously, we propose two new modified silver step-size schedules, and for all the three silver step-size schedules, [Formula: see text] accelerated convergence rate results with respect to three different performance metrics are established. Furthermore, our research affirms a previous conjecture by Luner and Grimmer on the GD method with the original silver step-size schedule. Funding: B. Wang, J. Yang, and D. Zhou were supported by the National Natural Science Foundation of China [Grants 12431011 and 12371301] and the Key Laboratory of Numerical Simulation of Large Scale Complex Systems of the Ministry of Education of China. S. Ma was supported in part by the National Science Foundation [Grants CCF-2311275 and ECCS-2326591].

Frequent coauthors

  • Shuzhong Zhang · 34 shared
  • Donald Goldfarb · 22 shared
  • Shixiang Chen · Chang'an University · 15 shared
  • Lingzhou Xue · 15 shared
  • Tianyi Lin · Columbia University · 14 shared
  • Jiaxiang Li · University of South China · 11 shared
  • Bo Jiang · 10 shared
  • Krishnakumar Balasubramanian · University of California, Davis · 9 shared
