Jonathan Appavoo

Associate Professor · Verified

Boston University · Computer Science

Active 1999–2026

h-index: 20
Citations: 1.5k
Papers669 last 5y
Funding: $595k
About

Jonathan Appavoo is an Associate Professor in the Computer Science department at Boston University. His work focuses on advancing computer systems with an emphasis on high performance, energy efficiency, and innovation in open-source software. He is involved in research projects such as SESA and PSML and has received multiple grants from the Red Hat Collaboratory (RHCOLAB) to support his work on stream processing, Linux evolution, computational caching, and open-source education. Appavoo's research contributions include developing frameworks and architectures for cloud computing, operating system innovation, and scalable elastic systems. He has collaborated extensively with colleagues and students on topics ranging from serverless computing to neuromorphic approaches for general-purpose computation. His work aims to enable responsible and impactful technology development that supports diversity and human ingenuity in the digital future.

Research topics

  • Computer Science
  • Embedded system
  • Operating system
  • Parallel computing
  • Engineering
  • Programming language

Selected publications

  • Taming and Controlling Performance and Energy Trade-Offs Automatically in Network Applications

    Cloud Computing and Data Science · 2026-03-11

    article · Open access · Senior author

    In this paper, we demonstrate that a server running a single latency-sensitive application can be treated as a black box to reduce energy consumption while meeting a Service-Level Agreement (SLA) target. We find that it is possible to identify “sweet spot” settings for packet batching and processing rate control. These settings represent optimal trade-offs between the software stack and hardware. Specifically, they account for both the arrival rate and the composition of requests being served. By testing a few combinations of these settings on the live system, a proof-of-concept controller can dynamically find settings that reduce energy consumption while meeting a desired tail latency for the request rate. Our work demonstrates three key findings. First, without software changes, energy savings of up to 60% are achievable across diverse hardware systems by controlling batching and processing rates. Second, specialized research Operating Systems (OSes) can leverage this to achieve a further 40% energy savings over general-purpose OSes. Finally, we show that a controller that is agnostic to the application, system, and hardware can find energy-efficient settings for different request rates while meeting performance objectives.

  • Towards Performance and Energy Aware Kubernetes Scheduler

    ACM SIGEnergy Energy Informatics Review · 2025-07-01 · 4 citations

    article · Senior author

    As cloud services become increasingly latency-sensitive and data center energy usage rises, there is an urgent need to address both operational and embodied carbon cost. However, data centers often overprovision resources, resulting in resource under-utilization. These inefficiencies not only waste energy but also accelerate hardware refresh cycles, exacerbating embodied emissions. In this work, we present PAX, a performance and energy aware Kubernetes scheduler that leverages machine learning techniques. Specifically, we present preliminary results from using Bayesian optimization to optimize microservices across a heterogeneous cluster. PAX improves application performance compared to modern schedulers and enables carbon-conscious scheduling by dynamically placing workloads on old and new servers based on performance sensitivity. The results illustrate an opportunity to reduce operational carbon while extending server lifetimes to mitigate embodied emissions. Our approach highlights the potential of ML-enhanced scheduling as a mechanism for improving both resource efficiency and sustainability in modern cloud infrastructures.

  • Can OS Specialization give new life to old carbon in the cloud?

    2024-09-16 · 1 citation

    article · Open access · Senior author

    Is there "fat" (overheads) in cloud computing infrastructure software that can be trimmed? Would doing so help ameliorate the need for frequent hardware refreshes and extend the life of existing hardware? In this paper, we demonstrate that, indeed, there is "fat" that can be trimmed by using specialized OS-based software stacks. Doing so can allow decade-old computers to be used for critical cloud infrastructure services, potentially yielding 3x improvements in efficiency compared to standard software stacks on newer hardware. The implications of these results raise the possibility of exploiting OS optimizations to reduce server hardware obsolescence. Further, it suggests the importance of addressing the key portability challenges of specialized OS stacks.

  • Unikernel Linux (UKL)

    2023 · 18 citations

    • Computer Science
    • Operating system

    This paper presents Unikernel Linux (UKL), a path toward integrating unikernel optimization techniques in Linux, a general purpose operating system. UKL adds a configuration option to Linux allowing for a single, optimized process to link with the kernel directly, and run at supervisor privilege. This UKL process does not require application source code modification, only a re-link with our, slightly modified, Linux kernel and glibc. Unmodified applications show modest performance gains out of the box, and developers can further optimize applications for more significant gains (e.g. 26% throughput improvement for Redis). UKL retains support for co-running multiple user level processes capable of communicating with the UKL process using standard IPC. UKL preserves Linux's battle-tested codebase, community, and ecosystem of tools, applications, and hardware support. UKL runs both on bare-metal and virtual servers and supports multi-core execution. The changes to the Linux kernel are modest (1250 LOC).

  • Unikernel Linux (UKL)

    arXiv (Cornell University) · 2022-06-01

    preprint · Open access

    This paper presents Unikernel Linux (UKL), a path toward integrating unikernel optimization techniques in Linux, a general purpose operating system. UKL adds a configuration option to Linux allowing for a single, optimized process to link with the kernel directly, and run at supervisor privilege. This UKL process does not require application source code modification, only a re-link with our, slightly modified, Linux kernel and glibc. Unmodified applications show modest performance gains out of the box, and developers can further optimize applications for more significant gains (e.g. 26% throughput improvement for Redis). UKL retains support for co-running multiple user level processes capable of communicating with the UKL process using standard IPC. UKL preserves Linux's battle-tested codebase, community, and ecosystem of tools, applications, and hardware support. UKL runs both on bare-metal and virtual servers and supports multi-core execution. The changes to the Linux kernel are modest (1250 LOC).

  • Slowing Down for Performance and Energy: An OS-Centric Study in Network Driven Workloads

    arXiv (Cornell University) · 2021-12-13

    preprint · Open access · Senior author

    This paper studies three fundamental aspects of an OS that impact the performance and energy efficiency of network processing: 1) batching, 2) processor energy settings, and 3) the logic and instructions of the OS networking paths. A network device's interrupt delay feature is used to induce batching and processor frequency is manipulated to control the speed of instruction execution. A baremetal library OS is used to explore OS path specialization. This study shows how careful use of batching and interrupt delay results in 2X energy and performance improvements across different workloads. Surprisingly, we find polling can be made energy efficient and can result in gains up to 11X over baseline Linux. We developed a methodology and a set of tools to collect system data in order to understand how energy is impacted at a fine-grained granularity. This paper identifies a number of other novel findings that have implications in OS design for networked applications and suggests a path forward to consider energy as a focal point of systems research.

  • The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework

    2020-05-01 · 6 citations

    preprint · Open access

    Computers continue to diversify with respect to system designs, emerging memory technologies, and application memory demands. Unfortunately, continually adapting the conventional virtual memory framework to each possible system configuration is challenging, and often results in performance loss or requires non-trivial workarounds. To address these challenges, we propose a new virtual memory framework, the Virtual Block Interface (VBI). We design VBI based on the key idea that delegating memory management duties to hardware can reduce the overheads and software complexity associated with virtual memory. VBI introduces a set of variable-sized virtual blocks (VBs) to applications. Each VB is a contiguous region of the globally-visible VBI address space, and an application can allocate each semantically meaningful unit of information (e.g., a data structure) in a separate VB. VBI decouples access protection from memory allocation and address translation. While the OS controls which programs have access to which VBs, dedicated hardware in the memory controller manages the physical memory allocation and address translation of the VBs. This approach enables several architectural optimizations to (1) efficiently and flexibly cater to different and increasingly diverse system configurations, and (2) eliminate key inefficiencies of conventional virtual memory. We demonstrate the benefits of VBI with two important use cases: (1) reducing the overheads of address translation (for both native execution and virtual machine environments), as VBI reduces the number of translation requests and associated memory accesses; and (2) two heterogeneous main memory architectures, where VBI increases the effectiveness of managing fast memory regions. For both cases, VBI significantly improves performance over conventional virtual memory.

  • SEUSS

    2020 · 144 citations

    Senior author · Corresponding
    • Computer Science
    • Operating system

    This paper presents a system-level method for achieving the rapid deployment and high-density caching of serverless functions in a FaaS environment. For reduced start times, functions are deployed from unikernel snapshots, bypassing expensive initialization steps. To reduce the memory footprint of snapshots we apply page-level sharing across the entire software stack that is required to run a function. We demonstrate the effects of our techniques by replacing Linux on the compute node of a FaaS platform architecture. With our prototype OS, the deployment time of a function drops from 100s of milliseconds to under 10 ms. Platform throughput improves by 51x on workload composed entirely of new functions. We are able to cache over 50,000 function instances in memory as opposed to 3,000 using standard OS techniques. In combination, these improvements give the FaaS platform a new ability to handle large-scale bursts of requests.

  • The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework

    arXiv (Cornell University) · 2020-05-19

    preprint · Open access

    Computers continue to diversify with respect to system designs, emerging memory technologies, and application memory demands. Unfortunately, continually adapting the conventional virtual memory framework to each possible system configuration is challenging, and often results in performance loss or requires non-trivial workarounds. To address these challenges, we propose a new virtual memory framework, the Virtual Block Interface (VBI). We design VBI based on the key idea that delegating memory management duties to hardware can reduce the overheads and software complexity associated with virtual memory. VBI introduces a set of variable-sized virtual blocks (VBs) to applications. Each VB is a contiguous region of the globally-visible VBI address space, and an application can allocate each semantically meaningful unit of information (e.g., a data structure) in a separate VB. VBI decouples access protection from memory allocation and address translation. While the OS controls which programs have access to which VBs, dedicated hardware in the memory controller manages the physical memory allocation and address translation of the VBs. This approach enables several architectural optimizations to (1) efficiently and flexibly cater to different and increasingly diverse system configurations, and (2) eliminate key inefficiencies of conventional virtual memory. We demonstrate the benefits of VBI with two important use cases: (1) reducing the overheads of address translation (for both native execution and virtual machine environments), as VBI reduces the number of translation requests and associated memory accesses; and (2) two heterogeneous main memory architectures, where VBI increases the effectiveness of managing fast memory regions. For both cases, VBI significantly improves performance over conventional virtual memory.

  • SEUSS: Rapid serverless deployment using environment snapshots

    OpenBU (Boston University) · 2019-10-03 · 2 citations

    preprint · Open access · Senior author

    Modern FaaS systems perform well in the case of repeat executions when function working sets stay small. However, these platforms are less effective when applied to more complex, large-scale and dynamic workloads. In this paper, we introduce SEUSS (serverless execution via unikernel snapshot stacks), a new system-level approach for rapidly deploying serverless functions. Through our approach, we demonstrate orders of magnitude improvements in function start times and cacheability, which improves common re-execution paths while also unlocking previously-unsupported large-scale bursty workloads.

Frequent coauthors

  • Orran Krieger

    34 shared
  • Robert W. Wisniewski

    16 shared
  • Amos Waterland

    16 shared
  • Dilma Da Silva

    14 shared
  • Dan Schatzberg

    Meta (United States)

    11 shared
  • James Cadden

    11 shared
  • Bryan S. Rosenburg

    11 shared
  • Marc Auslander

    IBM (United States)

    8 shared

Labs

  • Jonathan Appavoo's Lab · PI

    Research in computer systems, including serverless computing, operating systems, and open-source software.

Education

  • Ph.D.

    University of Toronto

    2006