Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Yang Liu · Professor

University of Illinois Urbana-Champaign · Bioengineering

Active 1981–2024

h-index: 104
Citations: 58.3k
Papers: 3.5k (1864 last 5y)
Funding: $1.7M (1 active)

About

We are pioneering the future of precision medicine through cutting-edge multiscale optical microscopy, automation and robotics, artificial intelligence, and large-scale bioimage informatics. Our imaging techniques span seven orders of magnitude—from the nanoscale to the mesoscale—enabling transformative advancements in precision medicine. Our lab's fusion of cross-scale imaging and AI-driven systems biology is setting the stage for unprecedented scientific discoveries and transformative personalized medicine.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Machine Learning
  • Computer Security
  • Software engineering
  • Geography
  • Programming language
  • Data science
  • Data Mining
  • World Wide Web
  • Operating system
  • Computer network
  • Database
  • Engineering

Selected publications

  • ReCDroid+: Automated End-to-End Crash Reproduction from Bug Reports for Android Apps

    ACM Transactions on Software Engineering and Methodology · 2022 · 33 citations

    • Computer Science
    • Computer Security

    The large demand for mobile devices creates significant concerns about the quality of mobile applications (apps). Developers heavily rely on bug reports in issue tracking systems to reproduce failures (e.g., crashes). However, the process of crash reproduction is often manually done by developers, making the resolution of bugs inefficient, especially given that bug reports are often written in natural language. To improve the productivity of developers in resolving bug reports, in this paper, we introduce a novel approach, called ReCDroid+, that can automatically reproduce crashes from bug reports for Android apps. ReCDroid+ uses a combination of natural language processing (NLP), deep learning, and dynamic GUI exploration to synthesize event sequences with the goal of reproducing the reported crash. We have evaluated ReCDroid+ on 66 original bug reports from 37 Android apps. The results show that ReCDroid+ successfully reproduced 42 crashes (63.6% success rate) directly from the textual description of the manually reproduced bug reports. A user study involving 12 participants demonstrates that ReCDroid+ can improve the productivity of developers when resolving crash bug reports.

  • Pre-trained models: Past, present and future

    AI Open · 2021 · 924 citations

    • Computer Science
    • Artificial Intelligence

    Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI). Owing to sophisticated pre-training objectives and huge model parameters, large-scale PTMs can effectively capture knowledge from massive labeled and unlabeled data. By storing knowledge into huge parameters and fine-tuning on specific tasks, the rich knowledge implicitly encoded in huge parameters can benefit a variety of downstream tasks, which has been extensively demonstrated via experimental verification and empirical analysis. It is now the consensus of the AI community to adopt PTMs as backbone for downstream tasks rather than learning models from scratch. In this paper, we take a deep look into the history of pre-training, especially its special relation with transfer learning and self-supervised learning, to reveal the crucial position of PTMs in the AI development spectrum. Further, we comprehensively review the latest breakthroughs of PTMs. These breakthroughs are driven by the surge of computational power and the increasing availability of data, towards four important directions: designing effective architectures, utilizing rich contexts, improving computational efficiency, and conducting interpretation and theoretical analysis. Finally, we discuss a series of open problems and research directions of PTMs, and hope our view can inspire and advance the future study of PTMs.

  • Learning to Detect Malicious Clients for Robust Federated Learning

    arXiv (Cornell University) · 2020 · 186 citations

    • Computer Science
    • Computer Security

    Federated learning systems are vulnerable to attacks from malicious clients. As the central server in the system cannot govern the behaviors of the clients, a rogue client may initiate an attack by sending malicious model updates to the server, so as to degrade the learning performance or enforce targeted model poisoning attacks (a.k.a. backdoor attacks). Therefore, timely detecting these malicious model updates and the underlying attackers becomes critically important. In this work, we propose a new framework for robust federated learning where the central server learns to detect and remove the malicious model updates using a powerful detection model, leading to targeted defense. We evaluate our solution in both image classification and sentiment analysis tasks with a variety of machine learning models. Experimental results show that our solution ensures robust federated learning that is resilient to both the Byzantine attacks and the targeted model poisoning attacks.

  • Boba: Authoring and Visualizing Multiverse Analyses

    IEEE Transactions on Visualization and Computer Graphics · 2020 · 79 citations

    1st author · Corresponding
    • Computer Science
    • Data science

    Multiverse analysis is an approach to data analysis in which all "reasonable" analytic decisions are evaluated in parallel and interpreted collectively, in order to foster robustness and transparency. However, specifying a multiverse is demanding because analysts must manage myriad variants from a cross-product of analytic decisions, and the results require nuanced interpretation. We contribute Boba: an integrated domain-specific language (DSL) and visual analysis system for authoring and reviewing multiverse analyses. With the Boba DSL, analysts write the shared portion of analysis code only once, alongside local variations defining alternative decisions, from which the compiler generates a multiplex of scripts representing all possible analysis paths. The Boba Visualizer provides linked views of model results and the multiverse decision space to enable rapid, systematic assessment of consequential decisions and robustness, including sampling uncertainty and model fit. We demonstrate Boba's utility through two data analysis case studies, and reflect on challenges and design opportunities for multiverse analysis software.

  • BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning

    USENIX Annual Technical Conference (ATC) · 2020 · 275 citations

    Senior author · Corresponding
    • Computer Science
    • Computer Security

    Cross-silo federated learning (FL) enables organizations (e.g., financial or medical) to collaboratively train a machine learning model by aggregating local gradient updates from each client without sharing privacy-sensitive data. To ensure no update is revealed during aggregation, industrial FL frameworks allow clients to mask local gradient updates using additively homomorphic encryption (HE). However, this results in significant cost in computation and communication. In our characterization, HE operations dominate the training time, while inflating the data transfer amount by two orders of magnitude. In this paper, we present BatchCrypt, a system solution for cross-silo FL that substantially reduces the encryption and communication overhead caused by HE. Instead of encrypting individual gradients with full precision, we encode a batch of quantized gradients into a long integer and encrypt it in one go. To allow gradient-wise aggregation to be performed on ciphertexts of the encoded batches, we develop new quantization and encoding schemes along with a novel gradient clipping technique. We implemented BatchCrypt as a plug-in module in FATE, an industrial cross-silo FL framework. Evaluations with EC2 clients in geo-distributed datacenters show that BatchCrypt achieves 23×-93× training speedup while reducing the communication overhead by 66×-101×. The accuracy loss due to quantization errors is less than 1%.

  • Machine Learning Testing: Survey, Landscapes and Horizons

    IEEE Transactions on Software Engineering · 2020 · 813 citations

    Senior author · Corresponding
    • Computer Science
    • Machine Learning

    This paper provides a comprehensive survey of techniques for testing machine learning systems, i.e., Machine Learning Testing (ML testing) research. It covers 144 papers on testing properties (e.g., correctness, robustness, and fairness), testing components (e.g., the data, learning program, and framework), testing workflow (e.g., test generation and test evaluation), and application scenarios (e.g., autonomous driving, machine translation). The paper also analyses trends concerning datasets, research trends, and research focus, concluding with research challenges and promising research directions in ML testing.

  • FedML: A Research Library and Benchmark for Federated Machine Learning

    arXiv (Cornell University) · 2020 · 358 citations

    • Computer Science
    • Artificial Intelligence

    Federated learning (FL) is a rapidly growing research field in machine learning. However, existing FL libraries cannot adequately support diverse algorithmic development; inconsistent dataset and model usage make fair algorithm comparison challenging. In this work, we introduce FedML, an open research library and benchmark to facilitate FL algorithm development and fair performance comparison. FedML supports three computing paradigms: on-device training for edge devices, distributed computing, and single-machine simulation. FedML also promotes diverse algorithmic research with flexible and generic API design and comprehensive reference baseline implementations (optimizer, models, and datasets). We hope FedML could provide an efficient and reproducible means for developing and evaluating FL algorithms that would benefit the FL research community. We maintain the source code, documents, and user community at https://fedml.ai.

  • A context-augmented deep learning approach for worker trajectory prediction on unstructured and dynamic construction sites

    Advanced Engineering Informatics · 2020 · 58 citations

    • Computer Science
    • Artificial Intelligence

Frequent coauthors

  • Xiaofei Xie

    Singapore Management University

    165 shared
  • Jun Sun

    Singapore Management University

    149 shared
  • Jin Song Dong

    121 shared
  • Maosong Sun

    106 shared
  • Lei Ma

    University of Alberta

    103 shared
  • Felix Juefei-Xu

    New York University

    81 shared
  • Qing Guo

    Agency for Science, Technology and Research

    77 shared
  • Dilek Hakkani‐Tür

    72 shared

Education

  • Ph.D., School of Computing

    National University of Singapore

    2020
  • Bachelor

    National University of Singapore

    2005
