Huaicheng Li

Assistant Professor · Verified

Virginia Tech · Computer Science

Active 2016–2026

h-index: 15
Citations: 869
Papers: 36 (27 in the last 5 years)

About

Huaicheng Li is an Assistant Professor in the Department of Computer Science at Virginia Tech. He received his Ph.D. (2020) and M.S. (2018) in computer science from the University of Chicago, and his B.S. in computer science and technology from Wuhan University, China, in 2013. His research interests include operating systems, storage systems, memory systems, and systems architecture. He is based at Gilbert Place in Blacksburg, VA, and can be reached by email at huaicheng@cs.vt.edu or by phone at (540) 231-4482.

Research topics

  • Computer Science
  • Operating system
  • Embedded system
  • Artificial Intelligence
  • Computer hardware
  • Telecommunications
  • Distributed computing

Selected publications

  • The impact of the three rights separation of rural homestead reform on farmers’ economic welfare: a county-level macro analysis

    Frontiers in Sustainable Food Systems · 2026-04-10

    article · Open access · Senior author

    The three rights separation of rural homestead reform (TRSRH) is a key component of China’s ongoing innovation in rural land institutions. Its central aim is to optimize the bundle of rights associated with rural homesteads through the “separation of three rights,” thereby improving the efficiency of land resource allocation and enhancing farmers’ welfare. Using panel data for 2,545 counties from 2000 to 2022, this study employs a staggered difference-in-differences model to systematically evaluate the impact of the reform pilots on farmers’ economic well-being. The results show that the reform significantly increases rural residents’ per capita disposable income, suggesting its positive role in expanding property-based income, facilitating factor mobility, and strengthening institutional guarantees. Further analysis indicates that the reform’s effect is more pronounced in regions with higher levels of economic development, more active population mobility, or stronger locational advantages, and that low-income and low-welfare groups benefit to a greater extent. Mechanism tests reveal that the expansion of non-farm employment opportunities and improvements in infrastructure and public service provision are important channels through which the reform enhances farmers’ welfare. These findings provide useful policy implications for refining the rural homestead system and advancing rural revitalization.

  • Performance Predictability in Heterogeneous Memory

    2026-03-10

    article · Open access · Senior author

    Heterogeneous memory combining DRAM and CXL exhibits variable performance, yet existing metrics correlate weakly with actual slowdown. We present CAMP, a principled framework for predicting CXL-induced slowdown. Our key insight is that a DRAM run (plus a CXL run for bandwidth-bound workloads) exposes the causal microarchitectural pressure points where CXL latency translates into additional processor stall cycles. CAMP captures these signals using 12 performance counters to analytically decompose slowdown into three orthogonal components: demand reads, cache/prefetching, and stores. CAMP also introduces a closed-form model for software-based weighted interleaving that predicts performance across DRAM–CXL ratios. Across 265 workloads on NUMA and three CXL devices, CAMP achieves 91–97% prediction accuracy within 10% absolute error. We demonstrate that these models enable practical system policies, including “Best-shot” interleaving and colocated workload placement, improving performance by up to 21% and 23% over existing tiering and colocation approaches.

  • Carbon trading system and low-carbon economic efficiency in China: considering the roles of international capital flows and regional coopetition

    Environment Development and Sustainability · 2026-03-15

    article · Senior author · Corresponding author
  • PACT: A Criticality-First Design for Tiered Memory

    2026-03-10

    article · Open access · Senior author

    Tiered memory systems typically place pages based on access frequency (hotness), yet frequency alone fails to capture the true performance impact. We present PACT, an online, page-granular tiered memory design that elevates performance criticality to a first-class design principle. At its core is Per-page Access Criticality (PAC), a fine-grained metric that quantifies each page's contribution to application performance rather than merely counting accesses. PACT profiles PAC online using a lightweight analytical model that uniquely decomposes per-tier memory-level parallelism via hardware queue occupancy counters, enabling direct CPU stall attribution to individual pages. To handle highly skewed PAC distributions, PACT employs PAC-centric migration policies: eager demotion and adaptive promotion, to dynamically place performance-critical pages in DRAM. Across 13 workloads, PACT achieves up to 61% performance improvement over the best of 7 state-of-the-art tiering designs with up to 50× fewer migrations.

  • Systematic CXL Memory Characterization and Performance Analysis at Scale

    2025-03-27 · 26 citations

    article · Open access · Senior author

    Compute Express Link (CXL) has emerged as a pivotal interconnect for memory expansion. Despite its potential, the performance implications of CXL across devices, latency regimes, processors, and workloads remain underexplored. We present Melody, a framework for systematic characterization and analysis of CXL memory performance. Melody builds on an extensive evaluation spanning 265 workloads, 4 real CXL devices, 7 latency levels, and 5 CPU platforms. Melody yields many insights: workload sensitivity to sub-μs CXL latencies (140-410ns), the first disclosure of CXL tail latencies, CPU tolerance to CXL latencies, a novel approach (SPA) for pinpointing CXL bottlenecks, and CPU prefetcher inefficiencies under CXL.

  • SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs

    2025-02-28 · 4 citations

    article · Open access · Senior author

    Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data centers. Among the critical shared GPU resources, there has been very limited analysis on the dynamic allocation of compute units and VRAM bandwidth, mainly for two reasons: (1) The native GPU resource management solutions are either hardware-specific, or unable to dynamically allocate resources to different tenants, or both; (2) NVIDIA doesn't expose interfaces for VRAM bandwidth allocation, and the software stack and VRAM channel architectures are black-box, both of which limit the software-level resource management. These drive prior work to design either conservative sharing policies detrimental to throughput, or static resource partitioning only applicable to a few GPU models.

  • New media environment, green technological innovation and corporate productivity: Evidence from listed companies in China

    Energy Economics · 2024-02-07 · 55 citations

    article · Senior author
  • SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs

    arXiv (Cornell University) · 2024-07-19 · 2 citations

    preprint · Open access

    Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data centers. Among the critical shared GPU resources, there has been very limited analysis on the dynamic allocation of compute units and VRAM bandwidth, mainly for two reasons: (1) The native GPU resource management solutions are either hardware-specific, or unable to dynamically allocate resources to different tenants, or both; (2) NVIDIA doesn't expose interfaces for VRAM bandwidth allocation, and the software stack and VRAM channel architectures are black-box, both of which limit the software-level resource management. These drive prior work to design either conservative sharing policies detrimental to throughput, or static resource partitioning only applicable to a few GPU models. To bridge this gap, this paper proposes SGDRC, a fully software-defined dynamic VRAM bandwidth and compute unit management solution for concurrent DNN inference services. SGDRC aims at guaranteeing service quality, maximizing the overall throughput, and providing general applicability to NVIDIA GPUs. SGDRC first reveals a general VRAM channel hash mapping architecture of NVIDIA GPUs through comprehensive reverse engineering and eliminates VRAM channel conflicts using software-level cache coloring. SGDRC applies bimodal tensors and tidal SM masking to dynamically allocate VRAM bandwidth and compute units, and guides the allocation of resources based on offline profiling. We evaluate 11 mainstream DNNs with real-world workloads on two NVIDIA GPUs. The results show that compared with the state-of-the-art GPU sharing solutions, SGDRC achieves the highest SLO attainment rates (99.0% on average), and improves overall throughput by up to 1.47x and BE job throughput by up to 2.36x.

  • Tuning Fast Memory Size based on Modeling of Page Migration for Tiered Memory

    arXiv (Cornell University) · 2024-10-01

    preprint · Open access

    Tiered memory, built upon a combination of fast memory and slow memory, provides a cost-effective solution to meet ever-increasing requirements from emerging applications for large memory capacity. Reducing the size of fast memory is valuable to improve memory utilization in production and reduce production costs because fast memory tends to be expensive. However, deciding the fast memory size is challenging because there is a complex interplay between application characterization and the overhead of page migration used to mitigate the impact of limited fast memory capacity. In this paper, we introduce a system, Tuna, to decide fast memory size based on modeling of page migration. Tuna uses micro-benchmarking to model the impact of page migration on application performance using three metrics. Tuna decides the fast memory size based on offline modeling results and limited information on workload telemetry. Evaluating with common big-memory applications and using 5% as the performance loss target, we show that Tuna in combination with a page management system (TPP) saves fast memory by 8.5% on average (up to 16%). This is in contrast to the 5% saving in fast memory reported by Microsoft Pond for the same workloads (BFS and SSSP) and the same performance loss target.

  • Oxygen vacancies nanoarchitectonics in BiVO4/WO3 heterostructured photoanode for effective berberine wastewater purification and electricity generation

    Journal of the Taiwan Institute of Chemical Engineers · 2024-04-30 · 10 citations

    article

Frequent coauthors

  • Haryadi S. Gunawi

    University of Chicago

    16 shared
  • Mingzhe Hao

    China Three Gorges University

    13 shared
  • Xing Lin

    NetApp (United States)

    8 shared
  • Mark D. Hill

    Microsoft (United States)

    7 shared
  • Daniel S. Berger

    7 shared
  • Andrew Baptist

    University of Utah

    6 shared
  • Ricardo Bianchini

    6 shared
  • Stanko Novaković

    5 shared

Education

  • PhD, Computer Science

    University of Chicago

    2020