Rezaul Chowdhury

Research Assistant Professor · Verified

Stony Brook University · Computer Science

Active 1998–2025

h-index: 17
Citations: 1.4k
Papers: 99 (24 in the last 5 years)
Funding: $709k

About

Rezaul Chowdhury is an Associate Professor in the Department of Computer Science at Stony Brook University. He received his Ph.D. from the Department of Computer Sciences at UT Austin, working with Professor Vijaya Ramachandran, with a dissertation titled "Cache-efficient Algorithms and Data Structures: Theory and Experimental Evaluation." Prior to joining Stony Brook in 2011, he worked in Boston with Professor Sandor Vajda's Structural Bioinformatics Group at Boston University and with Professor Charles Leiserson's SuperTech Research Group at MIT. He was also a postdoctoral fellow at the Center for Computational Visualization (CVC) at the Institute for Computational Engineering & Sciences (ICES) at the University of Texas at Austin, working with Professor Chandrajit Bajaj.

Chowdhury leads the Theoretical and Experimental Algorithmics (TEA) Group, focusing on both algorithm design and engineering, and holds a joint appointment with the Institute for Advanced Computational Sciences (IACS). His research interests include algorithms and data structures for efficient serial and parallel computations, cache- and I/O-efficient computing, computational biology and bioinformatics, and experimental algorithmics. His current projects involve developing a stencil computation compiler called "Pochoir" to improve cache efficiency in multicore processors, as well as work on resource-oblivious algorithms, energetics computation, and protein-protein docking.

Chowdhury has received the NSF CAREER Award and a best paper award at IPDPS 2010 for his work on multicore-oblivious algorithms. He is also interested in programming contests, having won an ACM ICPC Regional Contest as a student, and has contributed contest problems to a training manual.

Research topics

  • Computer Science
  • Parallel computing
  • Algorithm
  • Artificial Intelligence
  • Mathematics
  • Theoretical computer science
  • Computational science
  • Programming language
  • Geometry
  • Database

Selected publications

  • Applying Fast Fourier Transforms to Accelerate Spatially and Temporally Inhomogeneous Stencil Computations

    2025-07-16

    article · Open access

    Stencil computations are essential for simulating the evolution of physical systems across multi-dimensional grids over multiple timesteps. State-of-the-art techniques in this field fall into three major groups: tiled looping algorithms, divide-and-conquer trapezoidal algorithms, and Krylov subspace methods.

  • Vantage Point Selection Algorithms for Bottleneck Capacity Estimation

    ArXiv.org · 2025-01-01

    preprint · Open access

    Motivated by the problem of estimating bottleneck capacities on the Internet, we formulate and study the problem of vantage point selection. We are given a graph $G=(V, E)$ whose edges $E$ have unknown capacity values that are to be discovered. Probes from a vantage point, i.e., a vertex $v \in V$, along shortest paths from $v$ to all other vertices, reveal bottleneck edge capacities along each path. Our goal is to select $k$ vantage points from $V$ that reveal the maximum number of bottleneck edge capacities. We consider both a non-adaptive setting where all $k$ vantage points are selected before any bottleneck capacity is revealed, and an adaptive setting where each vantage point selection instantly reveals bottleneck capacities along all shortest paths starting from that point. In the non-adaptive setting, by considering a relaxed model where edge capacities are drawn from a random permutation (which still leaves the problem of maximizing the expected number of revealed edges NP-hard), we are able to give a $(1-1/e)$-approximation algorithm. In the adaptive setting we work with the least permissive model where edge capacities are arbitrarily fixed but unknown. We compare with the best solution for the particular input instance (i.e., by enumerating all choices of $k$-tuples), and provide both lower bounds on instance optimal approximation algorithms and upper bounds for trees and planar graphs.

  • Speeding up Stencil Computation using Gaussian Approximations

    Society for Industrial and Applied Mathematics eBooks · 2025-01-01 · 1 citation

    book-chapter

    Stencils are widely used in scientific and industrial computing for the simulation of physical systems. Given a multidimensional spatial grid containing initial data, these stencil patterns are applied uniformly to all cells of the grid over multiple timesteps to obtain the final data.

  • Fast American Option Pricing using Nonlinear Stencils

    2024-02-20 · 2 citations

    article

    We study the binomial, trinomial, and Black-Scholes-Merton models of option pricing. We present fast parallel discrete-time finite-difference algorithms for American call option pricing under the binomial and trinomial models and American put option pricing under the Black-Scholes-Merton model. For $T$-step finite differences, each algorithm runs in $O((T \log^2 T)/p + T)$ time under a greedy scheduler on $p$ processing cores, which is a significant improvement over the $\Theta(T^2/p) + \Omega(T \log T)$ time taken by the corresponding state-of-the-art parallel algorithm. Even when run on a single core, the $O(T \log^2 T)$ time taken by our algorithms is asymptotically much smaller than the $\Theta(T^2)$ running time of the fastest known serial algorithms. Implementations of our algorithms significantly outperform the fastest implementations of existing algorithms in practice, e.g., when run for $T \approx 1000$ steps on a 48-core machine, our algorithm for the binomial model runs at least $15\times$ faster than the fastest existing parallel program for the same model with the speedup factor gradually reaching beyond $500\times$ for $T \approx 0.5 \times 10^6$. It saves more than 80% energy when $T \approx 4000$, and more than 99% energy for $T > 60,000$.

  • Cache-Oblivious Parallel Convex Hull in the Binary Forking Model

    arXiv (Cornell University) · 2023-05-17

    preprint · Open access

    We present two cache-oblivious sorting-based convex hull algorithms in the Binary Forking Model. The first is an algorithm for a presorted set of points which achieves $O(n)$ work, $O(\log n)$ span, and $O(n/B)$ serial cache complexity, where $B$ is the cache line size. These are all optimal worst-case bounds for cache-oblivious algorithms in the Binary Forking Model. The second adapts Cole and Ramachandran's cache-oblivious sorting algorithm, matching its properties including achieving $O(n \log n)$ work, $O(\log n \log \log n)$ span, and $O(n/B \log_M n)$ serial cache complexity. Here $M$ is the size of the private cache.

  • A Fast Algorithm for Aperiodic Linear Stencil Computation using Fast Fourier Transforms

    ACM Transactions on Parallel Computing · 2023-07-24 · 2 citations

    article

    Stencil computations are widely used to simulate the change of state of physical systems across a multidimensional grid over multiple timesteps. The state-of-the-art techniques in this area fall into three groups: cache-aware tiled looping algorithms, cache-oblivious divide-and-conquer trapezoidal algorithms, and Krylov subspace methods. In this article, we present two efficient parallel algorithms for performing linear stencil computations. Current direct solvers in this domain are computationally inefficient, and Krylov methods require manual labor and mathematical training. We solve these problems for linear stencils by using discrete Fourier transforms preconditioning on a Krylov method to achieve a direct solver that is both fast and general. Indeed, while all currently available algorithms for solving general linear stencils perform $\Theta(NT)$ work, where $N$ is the size of the spatial grid and $T$ is the number of timesteps, our algorithms perform $o(NT)$ work. To the best of our knowledge, we give the first algorithms that use fast Fourier transforms to compute final grid data by evolving the initial data for many timesteps at once. Our algorithms handle both periodic and aperiodic boundary conditions and achieve polynomially better performance bounds (i.e., computational complexity and parallel runtime) than all other existing solutions. Initial experimental results show that implementations of our algorithms that evolve grids of roughly $10^7$ cells for around $10^5$ timesteps run orders of magnitude faster than state-of-the-art implementations for periodic stencil problems, and 1.3× to 8.5× faster for aperiodic stencil problems. Code Repository: https://github.com/TEAlab/FFTStencils

  • Fair subgraph selection for contagion containment (Brief Announcement)

    Procedia Computer Science · 2023-01-01

    article · Open access

    We present a new class of problems where the goal is to select a “fair” subgraph $H$ of a given graph $G = (V, E)$, such that $H$ decomposes into many small components. A subgraph $H \subseteq G$ is $(P, d)$-fair if every vertex $v \in P$ has the same degree $d$ in $H$, where $P \subseteq V$ and $d > 0$ are input parameters. These problems arise when the goal is to allow individuals to equally participate in activities in such a way that the connected components within an interaction graph, which models potential interactions among people, are of the smallest possible size, so that the spread of the contagion, and the difficulty of contact tracing in case of infection, is minimized. Within a preference graph that models the set of preferred choices for each individual when selecting among available options of where to conduct any particular type of activity (e.g., which gym to attend), we seek to compute the fair subgraph of assignments of individuals to these options, so that the number of people in each connected component (“interaction community”) of the resulting subgraph is minimized, and everyone is given the same number of options for every activity. We show that the fair subgraph selection problem is NP-hard, even for very restricted versions. We then formulate the problem as an integer program, and give a polynomial time computable lower bound on the optimal solution.

  • Fast American Option Pricing using Nonlinear Stencils

    arXiv (Cornell University) · 2023-03-04

    preprint · Open access

    We study the binomial, trinomial, and Black-Scholes-Merton models of option pricing. We present fast parallel discrete-time finite-difference algorithms for American call option pricing under the binomial and trinomial models and American put option pricing under the Black-Scholes-Merton model. For $T$-step finite differences, each algorithm runs in $O((T \log^2 T)/p + T)$ time under a greedy scheduler on $p$ processing cores, which is a significant improvement over the $\Theta(T^2/p) + \Omega(T \log T)$ time taken by the corresponding state-of-the-art parallel algorithm. Even when run on a single core, the $O(T \log^2 T)$ time taken by our algorithms is asymptotically much smaller than the $\Theta(T^2)$ running time of the fastest known serial algorithms. Implementations of our algorithms significantly outperform the fastest implementations of existing algorithms in practice, e.g., when run for $T \approx 1000$ steps on a 48-core machine, our algorithm for the binomial model runs at least $15\times$ faster than the fastest existing parallel program for the same model with the speed-up factor gradually reaching beyond $500\times$ for $T \approx 0.5 \times 10^6$. It saves more than 80% energy when $T \approx 4000$, and more than 99% energy for $T > 60,000$. Our option pricing algorithms can be viewed as solving a class of nonlinear 1D stencil (i.e., finite-difference) computation problems efficiently using the Fast Fourier Transform (FFT). To our knowledge, ours are the first algorithms to handle such stencils in $o(T^2)$ time. These contributions are of independent interest as stencil computations have a wide range of applications beyond quantitative finance.

  • Brief Announcement

    2022-07-10 · 4 citations

    article

    Stencil computations are widely used to simulate the change of state of physical systems. The current best algorithm for performing aperiodic linear stencil computations on a $d$ ($\geq 1$)-dimensional grid of size $N$ for $T$ timesteps does $\Theta(TN^{1-1/d} + N \log N)$ work. We introduce novel techniques based on random walks and Gaussian approximations for an asymptotic improvement of this work bound for a class of linear stencils. We also improve the span (i.e., parallel running time on an unbounded number of processors) asymptotically from the current state of the art.
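
Several of the abstracts above describe using Fourier transforms to evolve a linear stencil over many timesteps at once. The sketch below illustrates only the simplest case of that idea and is not the authors' algorithm (their code is in the repository linked above). It assumes a 1D grid with periodic boundaries, where one timestep is a circular convolution, so in Fourier space each coefficient is simply multiplied by a fixed factor per step and `steps` timesteps become one exponentiation. The function names (`evolve_naive`, `evolve_fourier`), the `weights` format, and the sample stencil are illustrative assumptions, and a naive quadratic-time DFT stands in for a real FFT to keep the example dependency-free.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; a real FFT would take O(n log n)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for j, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * j * k / n)
        out.append(total / n if inverse else total)
    return out


def evolve_naive(grid, weights, steps):
    """Apply the stencil one timestep at a time: Theta(n * steps) cell updates.

    weights maps a neighbor offset to its coefficient (hypothetical format):
    new[i] = sum(w * old[(i + offset) % n]).
    """
    n = len(grid)
    for _ in range(steps):
        grid = [
            sum(w * grid[(i + off) % n] for off, w in weights.items())
            for i in range(n)
        ]
    return grid


def evolve_fourier(grid, weights, steps):
    """Jump `steps` timesteps at once using the convolution theorem."""
    n = len(grid)
    # Kernel c with new[i] = sum_j c[(i - j) % n] * old[j], i.e. a circular convolution.
    kernel = [0.0] * n
    for off, w in weights.items():
        kernel[(-off) % n] += w
    multipliers = dft(kernel)   # per-frequency factor applied by one timestep
    spectrum = dft(grid)
    evolved = [s * (m ** steps) for s, m in zip(spectrum, multipliers)]
    return [z.real for z in dft(evolved, inverse=True)]


if __name__ == "__main__":
    heat = {-1: 0.25, 0: 0.5, 1: 0.25}   # illustrative diffusion-like stencil
    start = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    slow = evolve_naive(start, heat, 20)
    fast = evolve_fourier(start, heat, 20)
    print(max(abs(a - b) for a, b in zip(slow, fast)))  # tiny rounding difference
```

With an FFT in place of `dft`, the Fourier route costs a few transforms regardless of the number of timesteps, which is the intuition behind the periodic case. Aperiodic boundaries and nonlinear stencils, the harder settings the abstracts address, are not covered by this sketch.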

Awards & honors

  • NSF CAREER Award
  • Best paper award at IPDPS 2010