Yanzhi Wang

Verified

Northeastern University · Electrical and Computer Engineering

Active 2003–2026

h-index: 56

Citations: 13.9k

Papers: 734 (357 last 5y)

Funding: $1.0M

Faculty page

About

Yanzhi Wang is a Professor of Electrical and Computer Engineering at Northeastern University College of Engineering. His research focuses on real-time and energy-efficient deep learning and artificial intelligence systems, model compression of deep neural networks (DNNs), neuromorphic computing, and non-von Neumann computing paradigms. He has contributed to the development of methods for compressing and accelerating neural network models, particularly for mobile and edge devices, with notable patents in this area. Wang has been recognized with numerous awards, including the 2024 Constantinos Mavroidis Translational Research Award, and has received significant funding for projects aimed at strengthening AI for critical edge applications and creating inclusive urban communities through technology. His work is highly cited, placing him among the top 2% of most-cited scientists worldwide, and he actively collaborates on research projects supported by agencies such as the National Science Foundation and the Army Research Office.

Research topics

Computer Science
Artificial Intelligence
Machine Learning
Computer engineering
Computer Security
Algorithm
Human–computer interaction
Electronic engineering
Mathematics
Computer architecture
Arithmetic
Computer network
Telecommunications
Programming language
Parallel computing

Selected publications

Dissipation-induced phase transitions in a frustrated four-spin plaquette spin-boson model
Frontiers of Physics · 2026-01-01
article · Open access · Senior author
We investigate a frustrated four-spin plaquette spin-boson model with competing nearest-neighbor and diagonal Ising couplings, where each spin is coupled to an independent bosonic bath. Combining a path-integral strong-coupling analysis with variational matrix product state simulations, we obtain the ground-state phase diagram. In the strong-dissipation limit we map the model onto a classical plaquette and derive analytic phase boundaries between ferromagnetic, Néel, and stripe phases. At intermediate dissipation we find a delocalized phase and two localized ordered phases with Néel and stripe character. We show that the localized phases are separated by a first-order line, while each is connected to the delocalized regime via a continuous (second-order) localization transition, and that these three boundaries meet at a quantum triple point. Analysis of spin correlations and reduced density matrices further reveals that entanglement concentrates on nearest-neighbor (diagonal) bonds in the Néel (stripe) phase, whereas in the delocalized regime intra-plaquette two-spin entanglement is strongly suppressed in favor of enhanced spin-bath correlations.
Publisher OA PDF DOI
Picroside II as a Potential Anti-Inflammatory Agent
Pharmaceutics · 2026-04-17
article · Open access
Inflammation, as a basic pathological process, is critically implicated in the pathogenesis and progression of numerous diseases. Picrorhizae rhizoma is a traditional Chinese medicine with a prominent anti-inflammatory effect, and picroside II, a representative iridoid compound, is its major bioactive constituent. Over recent decades, picroside II has garnered extensive research interest owing to its remarkable pharmacological efficacy. Accumulating evidence has validated that picroside II exerts significant anti-inflammatory effects in the prevention and treatment of various systemic diseases. This review comprehensively summarizes and updates the latest research advances of picroside II, systematically elaborating its anti-inflammatory molecular mechanisms, pharmacokinetic profiles, and safety evaluation characteristics. The integrated data and analyses in this review aim to provide solid theoretical support, reliable evidence, and novel insights for the in-depth mechanism research, rational medicinal development, and future clinical translation and application of picroside II.
Publisher DOI
HDCompression: Hybrid-Diffusion Image Compression for Ultra-low Bitrates
Lecture notes in computer science · 2026-01-01
preprint · Open access
Publisher OA PDF DOI
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
2025-06-10 · 3 citations
article
Monocular Depth Estimation (MDE) has emerged as a pivotal task in computer vision, supporting numerous real-world applications. However, deploying accurate depth estimation models on resource-limited edge devices, especially Application-Specific Integrated Circuits (ASICs), is challenging due to the high computational and memory demands. Recent advancements in foundational depth estimation deliver impressive results but further amplify the difficulty of deployment on ASICs. To address this, we propose Quart-Depth which adopts post-training quantization to quantize MDE models with hardware accelerations for ASICs. Our approach involves quantizing both weights and activations to 4-bit precision, reducing the model size and computation cost. To mitigate the performance degradation, we introduce activation polishing and compensation algorithm applied before and after activation quantization, as well as a weight reconstruction method for minimizing errors in weight quantization. Furthermore, we design a flexible and programmable hardware accelerator by supporting kernel fusion and customized instruction programmability, enhancing throughput and efficiency. Experimental results demonstrate that our framework achieves competitive accuracy while enabling fast inference and higher energy efficiency on ASICs, bridging the gap between high-performance depth estimation and practical edge-device applicability. Code: https://github.com/shawnricecake/quart-depth
Publisher DOI
A unified DNN weight compression framework using reweighted optimization methods
Intelligent Systems with Applications · 2025-07-15
article · Open access · Senior author
To address the large model sizes and intensive computation requirements of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories: static regularization-based pruning and dynamic regularization-based pruning. However, the static method often leads to either complex operations or reduced accuracy, while the dynamic method requires extensive time to adjust parameters to maintain accuracy while achieving effective pruning. In this paper, we propose a unified robustness-aware framework for DNN weight pruning that dynamically updates regularization terms bounded by the designated constraint. This framework can generate both non-structured sparsity and different kinds of structured sparsity, and it incorporates adversarial training to enhance the robustness of the sparse model. We further extend our approach into an integrated framework capable of handling multiple DNN compression tasks. Experimental results show that our proposed method increases the compression rate (up to 630x for LeNet-5, 45x for AlexNet, 7.2x for MobileNet, 3.2x for ResNet-50) while also reducing training time and simplifying hyperparameter tuning to a single penalty parameter. Additionally, our method improves model robustness by 5.07% for ResNet-18 and 3.34% for VGG-16 under a 16x pruning rate, outperforming the state-of-the-art ADMM-based hard constraint method.
Publisher DOI
Open-Source Multimodal Moxin Models with Moxin-VLM and Moxin-VLA
Open MIND · 2025-12-22
preprint · Senior author
Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities. Leading this evolution are proprietary LLMs like GPT-4 and GPT-o1, which have captured widespread attention in the AI community due to their remarkable performance and versatility. Simultaneously, open-source LLMs, such as LLaMA and Mistral, have made great contributions to the ever-increasing popularity of LLMs owing to the ease of customizing and deploying the models across diverse applications. Moxin 7B is introduced as a fully open-source LLM developed in accordance with the Model Openness Framework, which moves beyond the simple sharing of model weights to embrace complete transparency in training, datasets, and implementation detail, thus fostering a more inclusive and collaborative research environment that can sustain a healthy open-source ecosystem. To further equip Moxin with various capabilities in different tasks, we develop three variants based on Moxin, including Moxin-VLM, Moxin-VLA, and Moxin-Chinese, which target the vision-language, vision-language-action, and Chinese capabilities, respectively. Experiments show that our models achieve superior performance in various evaluations. We adopt an open-source framework and open data for the training. We release our models, along with the available data and code to derive these models.
DOI
An Evolutionary Explanation for Depression in US Chinese International Students
Communications in Humanities Research · 2025-11-11
article · Open access
In the current era of accelerating globalization, international educational exchanges are becoming increasingly frequent, and the scale of the international student body continues to expand. The mental health issues of international students have also drawn increasing attention. Take the United States as an example. In recent years, the mental health problems of teenagers have become increasingly serious. Mental health problems among Chinese international students (CIS) in the United States have gained increasing attention. Developing depression is often influenced by family financial support. This study hypothesizes that when support is perceived as a performance expectation, willingness to reciprocate (WTR) decreases, increasing depression risk; when perceived as a happiness expectation, WTR increases, reducing depression risk. Using the Patient Health Questionnaire-9 (PHQ-9) and a self-developed expectations scale, we found that CIS interpretations of familial support vary by relational model: communal sharing fosters security and lowers depression risk, while market pricing and authority ranking link support to performance, increasing pressure and depression. University services and culturally sensitive interventions can reframe support as unconditional care, reducing anxiety and strengthening WTR.
Publisher DOI
Taming Diffusion Transformer for Efficient Mobile Video Generation in Seconds
ArXiv.org · 2025-07-17
preprint · Open access
Diffusion Transformers (DiT) have shown strong performance in video generation tasks, but their high computational cost makes them impractical for resource-constrained devices like smartphones, and practical on-device generation is even more challenging. In this work, we propose a series of novel optimizations to significantly accelerate video generation and enable practical deployment on mobile platforms. First, we employ a highly compressed variational autoencoder (VAE) to reduce the dimensionality of the input data without sacrificing visual quality. Second, we introduce a KD-guided, sensitivity-aware tri-level pruning strategy to shrink the model size to suit mobile platforms while preserving critical performance characteristics. Third, we develop an adversarial step distillation technique tailored for DiT, which allows us to reduce the number of inference steps to four. Combined, these optimizations enable our model to achieve approximately 15 frames per second (FPS) generation speed on an iPhone 16 Pro Max, demonstrating the feasibility of efficient, high-quality video generation on mobile devices.
Publisher OA PDF DOI
FairSMOE: Mitigating Multi-Attribute Fairness Problem with Sparse Mixture-of-Experts
2025-09-01 · 1 citation
article · Senior author
Real‐world datasets usually contain multiple attributes, making it essential to ensure fairness across all of them simultaneously. However, different attributes may vary in difficulty, and no existing approaches have effectively addressed this issue. Consequently, an attribute‐adaptive strategy is needed to achieve fairness for all attributes. Multi‐task Learning (MTL) leverages shared information to optimize multiple tasks concurrently, while Sparsely‐Gated Mixture‐of‐Experts (SMoE) can dynamically allocate computational resources to the tasks that need them most. In this work, we formulate the multi‐attribute fairness issue as an MTL problem and employ SMoE to achieve desirable performance across all attributes simultaneously. We first analyze the feasibility of this formulation and show its potential. However, vanilla SMoE can suffer from an over-utilization problem that causes sub-optimal performance. We therefore propose an SMoE framework for multi-attribute fair image classification, which further improves multi-attribute fairness by redesigning the MoE layer and routing policy with fairness in mind. Extensive experiments demonstrate its effectiveness. Taking DeiT-Small as the backbone, we achieve 77.25% and 86.01% accuracy on the ISIC2019 and CelebA datasets, respectively, with Multi-attribute Predictive Quality Disparity (PQD) scores of 0.801 and 0.787, outperforming the current state-of-the-art methods Muffin, InfoFair, and MultiFair.
Publisher DOI
Open-Source Multimodal Moxin Models with Moxin-VLM and Moxin-VLA
ArXiv.org · 2025-12-22
article · Open access · Senior author
Publisher OA PDF

Recent grants

CNS Core: Small: Collaborative: Content-Based Viewport Prediction Framework for Live Virtual Reality Streaming
NSF · $171k · 2019–2024
SPX: Collaborative Research: FASTLEAP: FPGA based compact Deep Learning Platform
NSF · $350k · 2019–2024
FET: SHF: Small: Collaborative: Advanced Circuits, Architectures and Design Automation Technologies for Energy-efficient Single Flux Quantum Logic
NSF · $200k · 2020–2024
IRES Track I: U.S.-Japan International Research Experience for Students on Superconducting Electronics
NSF · $299k · 2019–2025

Frequent coauthors

Massoud Pedram
University of Southern California
126 shared
Xue Lin
Northeastern University
99 shared
Wei Niu
University of Georgia
81 shared
Geng Yuan
76 shared
Caiwen Ding
74 shared
Bin Ren
William & Mary
64 shared
Xiaolong Ma
Ocean University of China
57 shared
Zhengang Li
Universidad del Noreste
53 shared

Awards & honors

2024 Constantinos Mavroidis Translational Research Award
2022 College of Engineering Faculty Fellow
IEEE Technical Committee on Secure and Dependable Measuremen…
Army Research Office Young Investigator Award
Top Paper Award, IEEE Cloud Computing Conference (CLOUD)
