
Jonathan Appavoo
Associate Professor · Boston University · Computer Science
Active 1999–2026
About
Jonathan Appavoo is an Associate Professor in the Computer Science department at Boston University. His work focuses on advancing computer systems with an emphasis on high performance, energy efficiency, and innovation in open-source software. He is involved in research projects such as SESA and PSML and has received multiple grants from the Red Hat Collaboratory (RHCOLAB) to support his work on stream processing, Linux evolution, computational caching, and open-source education. Appavoo's research contributions include developing frameworks and architectures for cloud computing, operating system innovation, and scalable elastic systems. He has collaborated extensively with colleagues and students on topics ranging from serverless computing to neuromorphic approaches for general-purpose computation. His work aims to enable responsible and impactful technology development that supports diversity and human ingenuity in the digital future.
Research topics
- Computer Science
- Embedded system
- Operating system
- Parallel computing
- Engineering
- Programming language
Selected publications
Taming and Controlling Performance and Energy Trade-Offs Automatically in Network Applications
Cloud Computing and Data Science · 2026-03-11
article · Open access · Senior author
In this paper, we demonstrate that a server running a single latency-sensitive application can be treated as a black box to reduce energy consumption while meeting a Service-Level Agreement (SLA) target. We find that it is possible to identify “sweet spot” settings for packet batching and processing rate control. These settings represent optimal trade-offs between the software stack and hardware. Specifically, they account for both the arrival rate and the composition of requests being served. By testing a few combinations of these settings on the live system, a proof-of-concept controller can dynamically find settings that reduce energy consumption while meeting a desired tail latency for the request rate. Our work demonstrates three key findings. First, without software changes, energy savings of up to 60% are achievable across diverse hardware systems by controlling batching and processing rates. Second, specialized research Operating Systems (OSes) can leverage this to achieve a further 40% energy savings over general-purpose OSes. Finally, we show that a controller that is agnostic to the application, system, and hardware can find energy-efficient settings for different request rates while meeting performance objectives.
Towards Performance and Energy Aware Kubernetes Scheduler
ACM SIGEnergy Energy Informatics Review · 2025-07-01 · 4 citations
article · Senior author
As cloud services become increasingly latency-sensitive and data center energy usage rises, there is an urgent need to address both operational and embodied carbon cost. However, data centers often overprovision resources, resulting in resource under-utilization. These inefficiencies not only waste energy but also accelerate hardware refresh cycles, exacerbating embodied emissions. In this work, we present PAX, a performance and energy aware Kubernetes scheduler that leverages machine learning techniques. Specifically, we present preliminary results from using Bayesian optimization to optimize microservices across a heterogeneous cluster. PAX improves application performance compared to modern schedulers and enables carbon-conscious scheduling by dynamically placing workloads on old and new servers based on performance sensitivity. The results illustrate an opportunity to reduce operational carbon while extending server lifetimes to mitigate embodied emissions. Our approach highlights the potential of ML-enhanced scheduling as a mechanism for improving both resource efficiency and sustainability in modern cloud infrastructures.
Can OS Specialization give new life to old carbon in the cloud?
2024-09-16 · 1 citation
article · Open access · Senior author
Is there "fat" (overheads) in cloud computing infrastructure software that can be trimmed? Would doing so help ameliorate the need for frequent hardware refreshes and extend the life of existing hardware? In this paper, we demonstrate that, indeed, there is "fat" that can be trimmed by using specialized OS-based software stacks. Doing so can allow decade-old computers to be used for critical cloud infrastructure services, potentially yielding 3x improvements in efficiency compared to standard software stacks on newer hardware. The implications of these results raise the possibility of exploiting OS optimizations to reduce server hardware obsolescence. Further, it suggests the importance of addressing the key portability challenges of specialized OS stacks.
2023 · 18 citations
- Computer Science
- Operating system
This paper presents Unikernel Linux (UKL), a path toward integrating unikernel optimization techniques in Linux, a general purpose operating system. UKL adds a configuration option to Linux allowing for a single, optimized process to link with the kernel directly, and run at supervisor privilege. This UKL process does not require application source code modification, only a re-link with our, slightly modified, Linux kernel and glibc. Unmodified applications show modest performance gains out of the box, and developers can further optimize applications for more significant gains (e.g. 26% throughput improvement for Redis). UKL retains support for co-running multiple user level processes capable of communicating with the UKL process using standard IPC. UKL preserves Linux's battle-tested codebase, community, and ecosystem of tools, applications, and hardware support. UKL runs both on bare-metal and virtual servers and supports multi-core execution. The changes to the Linux kernel are modest (1250 LOC).
arXiv (Cornell University) · 2022-06-01
preprint · Open access
This paper presents Unikernel Linux (UKL), a path toward integrating unikernel optimization techniques in Linux, a general purpose operating system. UKL adds a configuration option to Linux allowing for a single, optimized process to link with the kernel directly, and run at supervisor privilege. This UKL process does not require application source code modification, only a re-link with our, slightly modified, Linux kernel and glibc. Unmodified applications show modest performance gains out of the box, and developers can further optimize applications for more significant gains (e.g. 26% throughput improvement for Redis). UKL retains support for co-running multiple user level processes capable of communicating with the UKL process using standard IPC. UKL preserves Linux's battle-tested codebase, community, and ecosystem of tools, applications, and hardware support. UKL runs both on bare-metal and virtual servers and supports multi-core execution. The changes to the Linux kernel are modest (1250 LOC).
Slowing Down for Performance and Energy: An OS-Centric Study in Network Driven Workloads
arXiv (Cornell University) · 2021-12-13
preprint · Open access · Senior author
This paper studies three fundamental aspects of an OS that impact the performance and energy efficiency of network processing: 1) batching, 2) processor energy settings, and 3) the logic and instructions of the OS networking paths. A network device's interrupt delay feature is used to induce batching and processor frequency is manipulated to control the speed of instruction execution. A baremetal library OS is used to explore OS path specialization. This study shows how careful use of batching and interrupt delay results in 2X energy and performance improvements across different workloads. Surprisingly, we find polling can be made energy efficient and can result in gains up to 11X over baseline Linux. We developed a methodology and a set of tools to collect system data in order to understand how energy is impacted at a fine granularity. This paper identifies a number of other novel findings that have implications in OS design for networked applications and suggests a path forward to consider energy as a focal point of systems research.
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
2020-05-01 · 6 citations
preprint · Open access
Computers continue to diversify with respect to system designs, emerging memory technologies, and application memory demands. Unfortunately, continually adapting the conventional virtual memory framework to each possible system configuration is challenging, and often results in performance loss or requires non-trivial workarounds. To address these challenges, we propose a new virtual memory framework, the Virtual Block Interface (VBI). We design VBI based on the key idea that delegating memory management duties to hardware can reduce the overheads and software complexity associated with virtual memory. VBI introduces a set of variable-sized virtual blocks (VBs) to applications. Each VB is a contiguous region of the globally-visible VBI address space, and an application can allocate each semantically meaningful unit of information (e.g., a data structure) in a separate VB. VBI decouples access protection from memory allocation and address translation. While the OS controls which programs have access to which VBs, dedicated hardware in the memory controller manages the physical memory allocation and address translation of the VBs. This approach enables several architectural optimizations to (1) efficiently and flexibly cater to different and increasingly diverse system configurations, and (2) eliminate key inefficiencies of conventional virtual memory. We demonstrate the benefits of VBI with two important use cases: (1) reducing the overheads of address translation (for both native execution and virtual machine environments), as VBI reduces the number of translation requests and associated memory accesses; and (2) two heterogeneous main memory architectures, where VBI increases the effectiveness of managing fast memory regions. For both cases, VBI significantly improves performance over conventional virtual memory.
2020 · 144 citations
Senior author · Corresponding
- Computer Science
- Operating system
This paper presents a system-level method for achieving the rapid deployment and high-density caching of serverless functions in a FaaS environment. For reduced start times, functions are deployed from unikernel snapshots, bypassing expensive initialization steps. To reduce the memory footprint of snapshots, we apply page-level sharing across the entire software stack that is required to run a function. We demonstrate the effects of our techniques by replacing Linux on the compute node of a FaaS platform architecture. With our prototype OS, the deployment time of a function drops from hundreds of milliseconds to under 10 ms. Platform throughput improves by 51x on a workload composed entirely of new functions. We are able to cache over 50,000 function instances in memory as opposed to 3,000 using standard OS techniques. In combination, these improvements give the FaaS platform a new ability to handle large-scale bursts of requests.
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
arXiv (Cornell University) · 2020-05-19
preprint · Open access
SEUSS: Rapid serverless deployment using environment snapshots
OpenBU (Boston University) · 2019-10-03 · 2 citations
preprint · Open access · Senior author
Modern FaaS systems perform well in the case of repeat executions when function working sets stay small. However, these platforms are less effective when applied to more complex, large-scale and dynamic workloads. In this paper, we introduce SEUSS (serverless execution via unikernel snapshot stacks), a new system-level approach for rapidly deploying serverless functions. Through our approach, we demonstrate orders of magnitude improvements in function start times and cacheability, which improves common re-execution paths while also unlocking previously-unsupported large-scale bursty workloads.
Recent grants
CAREER: Programmable Smart Machines
NSF · $595k · 2013–2018
Frequent coauthors
- Orran Krieger · 34 shared
- Robert W. Wisniewski · 16 shared
- Amos Waterland · 16 shared
- Dilma Da Silva · 14 shared
- Dan Schatzberg · Meta (United States) · 11 shared
- James Cadden · 11 shared
- Bryan S. Rosenburg · 11 shared
- Marc Auslander · IBM (United States) · 8 shared
Labs
Research in computer systems, including serverless computing, operating systems, and open-source software.
Education
- Ph.D. · University of Toronto · 2006