Nikos Hardavellas

Professor of Computer Science · Verified

Northwestern University · Chemical Engineering

Active 1993–2026

h-index: 23
Citations: 3.3k
Papers: 129 (26 in the last 5 years)
Funding: $688k

About

Nikos Hardavellas is a professor of Computer Science and Computer Engineering at Northwestern University, where he directs the Parallel Architecture Group at Northwestern (PARAG@N). His research focuses on computer architecture, specifically at the intersection of computer architecture with the computer systems stack, including programming languages, compilers, and operating systems. His work also encompasses memory systems, nanophotonics, energy-efficient computing, and quantum computing systems. Hardavellas serves on the Executive Committee of the Northwestern Institute for Quantum Information Research and Engineering (INQUIRE) and the Scientific Advisory Committee of the National Quantum Algorithms Center (NQAC). He has received numerous awards and recognitions, including an NSF CAREER award, being named a Future CRA Leader, and receiving best paper awards at various conferences. Prior to his academic career, he contributed to the design of several generations of Alpha microprocessors and high-end multiprocessor servers at Digital Equipment Corp., Compaq, and Hewlett-Packard.

Research topics

  • Computer Science
  • Parallel computing
  • Programming language
  • Artificial Intelligence
  • Operating system
  • Embedded system
  • Database
  • Distributed computing
  • Algorithm
  • Computer engineering
  • Computer graphics (images)

Selected publications

  • Extrapolating Pauli Checks for Expectation Value Estimation on Noisy Quantum Devices

    IEEE Transactions on Quantum Engineering · 2026-01-01

    article · Open access

    Pauli Check Sandwiching (PCS) is an error detection scheme that protects quantum circuits by inserting pairs of parity checks and discarding runs that signal errors. However, each additional check introduces noise and exponentially increases sampling costs. To address these limitations, we propose Pauli Check Extrapolation (PCE), an error mitigation technique that obtains measured expectation values from circuits with different numbers of checks and, analogous to ZNE, extrapolates to the “maximum check” limit - the theoretical number of checks required for unit fidelity. We test linear and exponential ansatzes, deriving the exponential form from the Markovian error model. Benchmarking PCE against ZNE on random Clifford circuits with simulated depolarizing noise shows PCE outperforming ZNE for larger circuits. On real IBM hardware, PCE achieves an accuracy of up to 99.2% (56.2% improvement over baseline), compared to ZNE's 82% accuracy (29.1% improvement over baseline), for 4-qubit circuits. To demonstrate a practical use case, we then apply PCE towards mitigating errors in classical shadow measurements. Our results show that PCE can achieve fidelities greater than the state-of-the-art Robust Shadow estimation, while significantly reducing the number of required samples by eliminating the need for a calibration procedure. We validate these findings on both fully connected topologies and simulated IBM hardware backends.

  • Practical Machine Learning Autotuning for Large-Scale Collective Communication

    IEEE Transactions on Parallel and Distributed Systems · 2026-02-06

    article · Senior author

  • QuantEM: The quantum error management compiler

    ArXiv.org · 2025-09-19

    preprint · Open access

    As quantum computing advances toward fault-tolerant architectures, quantum error detection (QED) has emerged as a practical and scalable intermediate strategy in the transition from error mitigation to full error correction. By identifying and discarding faulty runs rather than correcting them, QED enables improved reliability with significantly lower overhead. Applying QED to arbitrary quantum circuits remains challenging, however, because of the need for manual insertion of detection subcircuits, ancilla allocation, and hardware-specific mapping and scheduling. We present QuantEM, a modular and extensible compiler designed to automate the integration of QED codes into arbitrary quantum programs. Our compiler consists of three key modules: (1) program analysis and transformation module to examine quantum programs in a QED-aware context and introduce checks and ancilla qubits, (2) error detection code integration module to map augmented circuits onto specific hardware backends, and (3) postprocessing and resource management for measurement results postprocessing and resource-efficient estimation techniques. The compiler accepts a high-level quantum circuit, a chosen error detection code, and a target hardware topology and then produces an optimized and executable circuit. It can also automatically select an appropriate detection code for the user based on circuit structure and resource estimates. QuantEM currently supports Pauli check sandwiching and Iceberg codes and is designed to support future QED schemes and hardware targets. By automating the complex QED compilation flow, this work reduces developer burden, enables fast code exploration, and ensures consistent and correct application of detection logic across architectures.

  • Modular Compilation for Quantum Chiplet Architectures

    ArXiv.org · 2025-01-14 · 1 citation

    preprint · Open access · Senior author

    As quantum computing technology matures, industry is adopting modular quantum architectures to keep quantum scaling on the projected path and meet performance targets. However, the complexity of chiplet-based quantum devices, coupled with their growing size, presents an imminent scalability challenge for quantum compilation. Contemporary compilation methods are not well-suited to chiplet architectures - in particular, existing qubit allocation methods are often unable to contend with inter-chiplet links, which don't necessarily support a universal basis gate set. Furthermore, existing methods of logical-to-physical qubit placement, swap insertion (routing), unitary synthesis, and/or optimization, are typically not designed for qubit links of significantly varying latency or fidelity. In this work, we propose SEQC, a hierarchical parallelized compilation pipeline optimized for chiplet-based quantum systems, including several novel methods for qubit placement, qubit routing, and circuit optimization. SEQC attains a $9.3\%$ average increase in circuit fidelity (up to $49.99\%$). Additionally, owing to its ability to parallelize compilation, SEQC achieves $3.27\times$ faster compilation on average (up to $6.74\times$) over a chiplet-unaware Qiskit baseline.

  • Pauli Check Extrapolation for Quantum Error Mitigation

    2024-09-15 · 1 citation

    article

    Pauli Check Sandwiching (PCS) is an error mitigation scheme that uses pairs of parity checks to detect errors in the payload circuit. While increasing the number of check pairs improves error detection, it also introduces additional noise to the circuit and exponentially increases the required sampling size. To address these limitations, we propose a novel error mitigation scheme, Pauli Check Extrapolation (PCE), which integrates PCS with an extrapolation technique similar to Zero-Noise Extrapolation (ZNE). However, instead of extrapolating to the ‘zero-noise’ limit, as is done in ZNE, PCE extrapolates to the ‘maximum check’ limit: the number of check pairs theoretically required to achieve unit fidelity. In this study, we focus on applying a linear model for extrapolation and also derive a more general exponential ansatz based on the Markovian error model. We demonstrate the effectiveness of PCE by using it to mitigate errors in the shadow estimation protocol, particularly for states prepared by the variational quantum eigensolver (VQE). Our results show that this method can achieve higher fidelities than the state-of-the-art Robust Shadow (RS) estimation scheme, while significantly reducing the number of required samples by eliminating the need for a calibration procedure. We validate these findings on both fully-connected topologies and simulated IBM hardware backends.

  • Dynamic Resource Allocation with Quantum Error Detection

    arXiv (Cornell University) · 2024-08-10

    preprint · Open access

    Quantum processing units (QPUs) are highly heterogeneous in terms of physical qubit performance. To add even more complexity, drift in quantum noise landscapes has been well-documented. This makes resource allocation a challenging problem whenever a quantum program must be mapped to hardware. As a solution, we propose a novel resource allocation framework that applies Pauli checks. Pauli checks have demonstrated their efficacy at error mitigation in prior work, and in this paper, we highlight their potential to infer the noise characteristics of a quantum system. Circuits with embedded Pauli checks can be executed on different regions of qubits, and the syndrome data created by error-detecting Pauli checks can be leveraged to guide quantum program outcomes toward regions that produce higher-fidelity final distributions. Using noisy simulation and a real QPU testbed, we show that dynamic quantum resource allocation with Pauli checks can outperform state-of-art mapping techniques, such as those that are noise-aware. Further, when applied toward the Quantum Approximate Optimization Algorithm, techniques guided by Pauli checks demonstrate the ability to increase circuit fidelity 11% on average, and up to 33%.

  • Extrapolating Pauli Checks for Expectation Value Estimation on Noisy Quantum Devices

    arXiv (Cornell University) · 2024-06-20

    preprint · Open access

  • Pauli Check Sandwiching for Quantum Characterization and Error Mitigation during Runtime

    2024-09-15

    article

    This work presents a novel quantum system characterization and error mitigation framework that applies Pauli check sandwiching (PCS). We motivate our work with prior art in software optimizations for quantum programs like noise-adaptive mapping and multi-programming, and we introduce the concept of PCS while emphasizing design considerations for its practical use. We show that by carefully embedding Pauli checks within a target application (i.e. a quantum circuit), we can learn quantum system noise profiles. Further, PCS combined with multi-programming unlocks non-trivial fidelity improvements.

  • Generalized Collective Algorithms for the Exascale Era

    2023-10-31 · 4 citations

    article · Senior author

    Exascale supercomputers have renewed the exigence of improving distributed communication, specifically MPI collectives. Previous works accelerated collectives for specific scenarios by changing the radix of the collective algorithms. However, these approaches fail to explore the interplay between modern hardware features, such as multi-port networks, and software features, such as message size. In this paper, we present a novel approach that uses system-agnostic, generalized (i.e., variable-radix) algorithms to capture relevant features and provide broad speedups for upcoming exascale-class supercomputers. We identify hardware commonalities found on announced exascale systems and three omnipresent communication kernels (binomial tree, ring, and recursive doubling) that can be generalized to better leverage these features, creating 10 total implementations. For each kernel, we develop analytical models to intuit algorithm performance with varying radix values. Experiments on the world’s first exascale supercomputer (Frontier at ORNL) and a pre-exascale system (Polaris at ANL) show that our generalized algorithms outperform the baseline open-source and proprietary vendor MPI implementations by a significant margin, up to over 4.5x. We empirically determine optimal algorithms and parameter values, identifying where the analytical models are accurate and where hardware features directly determine performance. Most notably, we show how a single, system-agnostic implementation of a generalized algorithm can optimize for multiple hardware/software features across multiple systems.

  • Evaluating Functional Memory-Managed Parallel Languages for HPC using the NAS Parallel Benchmarks

    2023-05-01 · 1 citation

    article

    Functional, memory-managed parallel languages (FMPLs) are a recent innovative approach to shared-memory parallel programming. Despite their rising prevalence in other areas, FMPLs have yet to gain traction in HPC. In this work, we explore the utility of FMPLs for HPC by re-implementing the NAS Parallel Benchmarks in an FMPL. For this study, we ported the benchmarks into the Parallel ML language. We discuss the advantages and disadvantages of using Parallel ML for HPC applications based on our development experience. We compare the performance of our Parallel ML implementation to the existing C/OpenMP version. The FMPL implementations are $1.02\times$ to $5.76\times$ slower compared to OpenMP. Our positive development experience combined with some competitive performance results suggest that FMPLs have the potential to become a viable choice for HPC applications. We conclude by describing our future work to automatically manage distributed memory within an FMPL, creating a compelling new programming model for HPC.

Frequent coauthors

  • Babak Falsafi

    28 shared
  • Ippokratis Pandis

    Amazon (United States)

    22 shared
  • Chris Wilkerson

    Intel (United States)

    19 shared
  • Tor M. Aamodt

    University of British Columbia

    19 shared
  • Anastasia Ailamaki

    17 shared
  • Lieven Eeckhout

    Ghent University

    17 shared
  • Jared C. Smolens

    Oracle (United States)

    17 shared
  • Juanita Hoe

    University of West London

    17 shared

Labs

  • Parallel Architecture Group at Northwestern (PARAG@N) · PI

Education

  • PhD, CS

    Carnegie Mellon University

    2009
  • MSc, CS

    Carnegie Mellon University

    2006
  • MSc, CS

    University of Rochester

    1997
  • BSc, CS

    University of Crete

    1995

Awards & honors

  • Future CRA Leader by the Computing Research Association (202…
  • NSF CAREER award (2015)
  • Best paper awards, nominations and test-of-time awards at HP…
  • IEEE Micro Top Picks Award (2010)
  • IEEE Micro Top Picks Honorable Mention (2023)