Aurojit Panda

Associate Professor in Computer Science · Verified

New York University · Computer Science

Active 2008–2026

h-index: 40
Citations: 5.3k
Papers: 137 (51 in the last 5 years)
Funding: $810k (1 active)

About

Aurojit Panda is an associate professor in Computer Science at New York University (NYU). He received a Sc.B. with honors in Math and Computer Science from Brown University, then spent several years working on the Midori kernel at Microsoft before starting his doctoral studies. He earned his PhD from the University of California, Berkeley, where he was advised by Scott Shenker and conducted research in the NetSys Lab. Before joining NYU, he worked as a software developer at Nefeli Networks, a startup specializing in network function orchestration.

Professor Panda's research focuses on systems and networking problems, with a particular interest in improving system reliability: identifying bugs before deployment and enhancing fault tolerance. He works on a broad range of topics within this domain and is open to collaborating with NYU undergraduate, master's, and PhD students who have relevant coursework or experience in systems or networking. His teaching includes Distributed Systems and Undergraduate Operating Systems.

Research topics

  • Computer Science
  • Operating system
  • Computer Security
  • Distributed computing
  • Software engineering
  • Embedded system
  • Programming language
  • World Wide Web

Selected publications

  • Probabilistic Fair Ordering of Events

    Open MIND · 2026-02-09

    preprint

    A growing class of applications depends on fair ordering, where events that occur earlier should be processed before later ones. Providing such guarantees is difficult in practice because clock synchronization is inherently imperfect: events generated at different clients within a short time window may carry timestamps that cannot be reliably ordered. Rather than attempting to eliminate synchronization error, we embrace it and establish a probabilistically fair sequencing process. Tommy is a sequencer that uses a statistical model of per-clock synchronization error to compare noisy timestamps probabilistically. Although this enables ordering of two events, the probabilistic comparator is intransitive, making global ordering non-trivial. We address this challenge by mapping the sequencing problem to a classical ranking problem from social choice theory, which offers principled mechanisms for reasoning with intransitive comparisons. Using this formulation, Tommy produces a partial order of events, achieving significantly better fairness than a Spanner TrueTime-based baseline approach.

  • Revisiting Speculative Leaderless Protocols for Low-Latency BFT Replication

    arXiv (Cornell University) · 2026-01-06

    preprint · Open access

    As Byzantine Fault Tolerant (BFT) protocols begin to be used in permissioned blockchains for user-facing applications such as payments, it is crucial that they provide low latency. In pursuit of low latency, some recently proposed BFT consensus protocols employ a leaderless optimistic fast path, in which clients broadcast their requests directly to replicas without first serializing requests at a leader, resulting in an end-to-end commit latency of 2 message delays ($2\Delta$) during fault-free, synchronous periods. However, such a fast path only works if there is no contention: concurrent contending requests can cause replicas to diverge if they receive conflicting requests in different orders, triggering costly recovery procedures. In this work, we present Aspen, a leaderless BFT protocol that achieves a near-optimal latency of $2\Delta + \varepsilon$, where $\varepsilon$ indicates a short waiting delay. Aspen removes the no-contention condition by utilizing a best-effort sequencing layer based on loosely synchronized clocks and network delay estimates. Aspen requires $n = 3f + 2p + 1$ replicas to cope with up to $f$ Byzantine nodes. The $2p$ extra nodes allow Aspen's fast path to proceed even if up to $p$ replicas diverge due to unpredictable network delays. When its optimistic conditions do not hold, Aspen falls back to a PBFT-style protocol, guaranteeing safety and liveness under partial synchrony. In experiments with wide-area distributed replicas, Aspen commits requests in less than 75 ms, a $1.2\times$ to $3.3\times$ improvement compared to previous protocols, while supporting 19,000 requests per second.

  • Probabilistic Fair Ordering of Events

    ArXiv.org · 2026-02-09

    article · Open access

  • CLM: Removing the GPU Memory Barrier for 3D Gaussian Splatting

    2026-03-10 · 1 citation

    article · Open access · Senior author

    3D Gaussian Splatting (3DGS) is an increasingly popular novel view synthesis approach due to its fast rendering time and high-quality output. However, scaling 3DGS to large (or intricate) scenes is challenging due to its substantial memory requirement, which exceeds the memory capacity of most GPUs. In this paper, we describe CLM, a system that allows 3DGS to render large scenes using a single consumer-grade GPU, e.g., RTX4090. It does so by offloading Gaussians to CPU memory and loading them into GPU memory only when necessary. To improve performance and reduce communication overheads, CLM uses a novel offloading strategy based on insights into 3DGS's memory access patterns. This strategy enables efficient pipelining, which overlaps GPU-to-CPU communication, GPU computation, and CPU computation. Furthermore, CLM exploits these access patterns to reduce communication volume. Our evaluation shows that the resulting implementation can render a large scene that requires 102 million Gaussians on a single RTX4090 and achieve state-of-the-art reconstruction quality. The code is open-sourced at: https://github.com/nyu-systems/CLM-GS

  • Front Matter, Table of Contents, Preface, Conference Organization

    Leibniz-Zentrum für Informatik (Schloss Dagstuhl) · 2026-01-01

    article · Open access · Senior author

  • OASIcs, Volume 139, NINeS 2026, Complete Volume

    Leibniz-Zentrum für Informatik (Schloss Dagstuhl) · 2026-01-01

    other · Open access · Senior author

  • Elastic Scaling of Real-Time Communication Services

    IEEE Transactions on Network and Service Management · 2026-01-01

    article

    Real-time Communications (RTC) services, including multiparty conferencing, live streaming, and cloud gaming, rely on a large-scale media plane infrastructure that provides real-time audio/video processing to clients. Unfortunately, off-the-shelf RTC services are not elastically scalable. As a result, operators must provision media servers to meet peak demand, resulting in resource under-utilization and high cost. Given that microservice orchestrators like Kubernetes now allow web services to scale transparently and economically, this paper looks at applying the same approach to large-scale RTC services. We find that this is challenging for two reasons: (a) the default network dataplane underlying Kubernetes does not meet the compelling traffic management, performance, and real-time requirements of RTC; and (b) current autoscaling policies are ill-suited to RTC. We address these challenges by designing an RTC-specific service mesh that pushes media traffic processing into the OS kernel and designing new RTC-specific Kubernetes autoscaling policies. Our evaluation on a functional VoIP test-bed shows that this combination makes it possible to deploy elastically scalable RTC services with 100× lower jitter and 700× lower RTT than the current state of the art.

  • It Takes Two to Entangle

    2026-03-10

    article · Open access · Senior author

    Distributed machine learning training and inference are common today because today's large models require more memory and compute than can be provided by a single GPU. Distributed models are generally produced by programmers who take a sequential model specification and apply several distribution strategies to distribute state and computation across GPUs. Unfortunately, bugs can be introduced in the process, and a distributed model implementation's outputs might differ from the sequential model's outputs. In this paper, we describe an approach to statically identify such bugs by checking model refinement, that is, can the sequential model's outputs be reconstructed from the distributed model's outputs? Our approach, implemented in Entangle, uses iterative rewriting to prove model refinement. Our approach can scale to today's large models and deployments: we evaluate it using GPT and Llama-3. Further, it provides actionable outputs that aid in bug localization.

  • On Scaling Up 3D Gaussian Splatting Training

    Lecture Notes in Computer Science · 2025-01-01 · 13 citations

    book-chapter

Frequent coauthors

  • Scott Shenker

    University of California, Berkeley

    158 shared
  • Sylvia Ratnasamy

    Google (United States)

    39 shared
  • James McCauley

    Mount Holyoke College

    26 shared
  • Mooly Sagiv

    23 shared
  • Arvind Krishnamurthy

    Stanford University

    21 shared
  • Colin Scott

    Microsoft Research (India)

    16 shared
  • Justine Sherry

    Carnegie Mellon University

    15 shared
  • Yotam Harchol

    15 shared

Education

  • Ph.D.

    UC Berkeley

  • Sc.B. with honors, Math and Computer Science

    Brown University

Awards & honors

  • HotOS 2023 Best Paper Award
  • HotNets 2023 Best Student Paper Award
  • HotNets 2022 Best Paper Award
  • SIGCOMM 2019 Best Student Paper Award