Rong Ge

Cue Family Associate Professor of Computer Science · Verified

Duke University · Computer Science

Active 2004–2025

h-index: 41
Citations: 7.4k
Papers: 220 (52 in the last 5 years)
Funding: $2.0M

About

Rong Ge is the Cue Family Associate Professor in the Computer Science Department at Duke University. He earned his Ph.D. from the Computer Science Department of Princeton University under the supervision of Sanjeev Arora. Following his doctoral studies, he was a postdoctoral researcher at Microsoft Research in New England. His research broadly spans theoretical computer science and machine learning, with a focus on understanding and formalizing hidden structures in data and designing efficient algorithms to uncover them. He studies problems arising in the analysis of text, images, and other data types, employing techniques such as non-convex optimization and tensor decompositions. His work aims to provide provable algorithms for machine learning problems, contributing to the theoretical foundations of modern machine learning methods including deep learning.

Research topics

  • Computer Science
  • Parallel computing
  • Artificial Intelligence
  • Computer Security
  • Embedded system
  • Operating system
  • Computer architecture

Selected publications

  • GALE: Leveraging Heterogeneous Systems for Efficient Unstructured Mesh Data Analysis

    IEEE Transactions on Visualization and Computer Graphics · 2025-12-05

    article

    Unstructured meshes present challenges in scientific data analysis due to irregular distribution and complex connectivity. Computing and storing connectivity information is a major bottleneck for visualization algorithms, affecting both time and memory performance. Recent task-parallel data structures address this by precomputing connectivity information at runtime while the analysis algorithm executes, effectively hiding computation costs and improving performance. However, existing approaches are CPU-bound, forcing the data structure and analysis algorithm to compete for the same computational resources, limiting potential speedups. To overcome this limitation, we introduce a novel task-parallel approach optimized for heterogeneous CPU-GPU systems. Specifically, we offload the computation of mesh connectivity information to GPU threads, enabling CPU threads to focus on executing the visualization algorithm. Following this paradigm, we propose GPU-Aided Localized data structurE (GALE), the first open-source CUDA-based data structure designed for heterogeneous task parallelism. Experiments on two 20-core CPUs and an NVIDIA V100 GPU show that GALE achieves up to 2.7× speedup over state-of-the-art localized data structures while maintaining memory efficiency.

  • Is In-Context Learning Feasible for HPC Performance Autotuning?

    2025-06-03

    article

    We examine whether in-context learning with Large Language Models (LLMs) can effectively address the challenges of High-Performance Computing (HPC) autotuning. LLMs have demonstrated remarkable natural language processing and artificial intelligence (AI) capabilities, sparking interest in their application across various domains, including HPC. Performance autotuning – the process of automatically optimizing system configurations to maximize efficiency through empirical evaluation – offers significant promise for enhancing application performance on larger systems and emerging architectures. However, this process remains computationally expensive due to the combinatorial explosion of configuration parameters and the complex, nonlinear relationships between configurations and performance outcomes. We pose a critical question: Can LLMs, without task-specific fine-tuning, accurately infer performance-configuration patterns by combining in-context examples with latent knowledge? To explore this, we leverage empirical performance data from real-world HPC systems, designing structured prompts and queries to evaluate LLMs’ capabilities. Our experiments reveal inherent limitations in applying in-context learning to performance autotuning, particularly for tasks requiring precise mathematical reasoning and analysis of complex multivariate dependencies. We provide empirical evidence of these shortcomings and discuss potential research directions to overcome these challenges.

  • Design and Practice of the "Ideological and Political Education in Courses" Teaching System for Maritime English

    International Journal of Education and Humanities · 2025-01-23 · 1 citation

    article · Open access · Senior author

    Against the backdrop of the continuous advancement of globalization and educational modernization, the reform of the "Ideological and Political Education in Navigation English" teaching system has become an urgent task. This paper analyzes in depth the current state of "Ideological and Political Education in Navigation English" teaching and points out existing problems such as the lack of ideological and political teaching content, outdated and uniform teaching methods, and teachers' weak theoretical foundation in ideological and political education. In response to these problems, the design of the teaching system adheres to the principles of fostering virtue through education, integrating knowledge and moral education, being practice-oriented, and continuously improving, and it comprehensively covers key links such as needs analysis and goal setting, teaching content planning, selection and implementation of teaching methods, construction of the evaluation system, development of teaching staff, and provision of teaching resources. In practical tests, this teaching system significantly improved students' professional knowledge and language skills, achieved remarkable results in cultivating ideological and political quality and professional competence, continuously supplied high-quality talent to the field of navigation, and effectively promoted innovation and change in the Navigation English course.

  • AskHPC: A ChatBot for High Performance Computing User Support

    2025-11-07 · 1 citation

    article · Open access · Senior author

    High-Performance Computing (HPC) systems in the exascale era are increasingly heterogeneous, requiring users to navigate diverse tools, configurations, and best practices. However, essential information is often scattered across fragmented, multimodal documentation, making it difficult and time-consuming to locate. To address this, we present AskHPC, an intelligent question-answering ChatBot that delivers accurate, timely, and accessible information through a unified conversational interface. Built on a curated knowledge base integrating user guides, scheduler manuals, and programming documentation, AskHPC leverages Large Language Models (LLMs) within a Retrieval-Augmented Generation (RAG) framework. It employs two key techniques to improve HPC query responses: a modality-aware document parsing pipeline that preserves multimodal structure, and a dual-context strategy combining retrieved content (e.g., complete code blocks) with LLM-generated semantics. Evaluation, including a real-world user study, shows AskHPC outperforms direct LLM queries and vanilla RAG systems, enhancing user support and accelerating HPC software development.

  • HELM: Characterizing Unified Memory Accesses to Improve GPU Performance under Memory Oversubscription

    2025-11-12 · 2 citations

    article · Open access · Senior author

    Unified Memory (UM) technologies simplify memory management across CPU and GPU domains in GPU-accelerated heterogeneous architectures through transparent data migration. However, the default migration mechanism can severely degrade performance when applications oversubscribe GPU memory. Existing approaches to mitigating this performance degradation often fail to generalize, as they target specific application types, require specialized hardware, or integrate opaque classification methods.

  • Mapping Annual Forest Disturbance from 1986 to 2021 at 30-M Resolution in China Using the Modified Cold Algorithm

    SSRN Electronic Journal · 2025-01-01 · 1 citation

    preprint · Open access

  • Method for Recognition of Communication Interference Signals under Small-Sample Conditions

    Applied Sciences · 2024-07-04 · 1 citation

    article · Open access · 1st author

    To address the difficulty in obtaining a large number of labeled jamming signals in complex electromagnetic environments, this paper proposes a small-sample communication jamming signal recognition method based on WDCGAN-SA (Wasserstein Deep Convolution Generative Adversarial Network–Self Attention) and C-ResNet (Convolution Block Attention Module–Residual Network). Firstly, leveraging the DCGAN architecture, we integrate the Wasserstein distance measurement and gradient penalty mechanism to design the jamming signal generation model WDCGAN for data augmentation. Secondly, we introduce a self-attention mechanism to make the generation model focus on global correlation features in time–frequency maps while optimizing training strategies to enhance the quality of generated samples. Finally, real samples are mixed with generated samples and fed into the classification network, incorporating cross-channel and spatial information in the classification network to improve jamming signal recognition rates. The simulation results demonstrate that under small-sample conditions with a Jamming-to-Noise Ratio (JNR) ranging from −10 dB to 10 dB, the proposed algorithm significantly outperforms GAN, WGAN and DCGAN comparative algorithms in recognizing six types of communication jamming signals.

  • For Better or For Worse? Learning Minimum Variance Features With Label Augmentation

    arXiv (Cornell University) · 2024-02-10

    preprint · Open access · Senior author

    Data augmentation has been pivotal in successfully training deep learning models on classification tasks over the past decade. An important subclass of data augmentation techniques - which includes both label smoothing and Mixup - involves modifying not only the input data but also the input label during model training. In this work, we analyze the role played by the label augmentation aspect of such methods. We first prove that linear models on binary classification data trained with label augmentation learn only the minimum variance features in the data, while standard training (which includes weight decay) can learn higher variance features. We then use our techniques to show that even for nonlinear models and general data distributions, the label smoothing and Mixup losses are lower bounded by a function of the model output variance. Lastly, we demonstrate empirically that this aspect of label smoothing and Mixup can be a positive and a negative. On the one hand, we show that the strong performance of label smoothing and Mixup on image classification benchmarks is correlated with learning low variance hidden representations. On the other hand, we show that Mixup and label smoothing can be more susceptible to low variance spurious correlations in the training data.

  • Vendor-neutral and Production-grade Job Power Management in High Performance Computing

    2024-11-17 · 3 citations

    article · Senior author

    Power management and energy efficiency are critical research areas for exascale computing and beyond, necessitating reliable telemetry and control for distributed systems. Despite this need, existing approaches present several limitations precluding their adoption in production. These limitations include, but are not limited to, lack of portability due to vendor-specific and closed-source solutions, lack of support for non-MPI applications, and lack of user-level customization. We present a job-level power management framework based on Flux. We introduce flux-power-monitor and demonstrate its effectiveness on the Lassen (IBM Power AC922) and Tioga (HPE Cray EX235A) systems with a low average overhead of 0.4%. We also present flux-power-manager, where we discuss a proportional sharing policy and introduce a hierarchical FFT-based dynamic power management algorithm (FPP). We demonstrate that FPP reduces energy by 1% compared to proportional sharing, and by 20% compared to the default IBM static power capping policy.

  • ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods

    arXiv (Cornell University) · 2024-06-23

    preprint · Open access

    The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of the data used in their pretraining. Detecting such content is challenging due to the scale of the data and limited exposure of each instance during training. We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) to detect LLMs' pretraining data by leveraging their conditional language modeling capabilities. ReCaLL examines the relative change in conditional log-likelihoods when prefixing target data points with non-member context. Our empirical findings show that conditioning member data on non-member prefixes induces a larger decrease in log-likelihood compared to non-member data. We conduct comprehensive experiments and show that ReCaLL achieves state-of-the-art performance on the WikiMIA dataset, even with random and synthetic prefixes, and can be further improved using an ensemble approach. Moreover, we conduct an in-depth analysis of LLMs' behavior with different membership contexts, providing insights into how LLMs leverage membership information for effective inference at both the sequence and token level.

Labs

  • Rong Ge Lab · PI

    The Rong Ge Lab focuses on theoretical computer science and machine learning, particularly in analyzing text, images, and other forms of data using techniques such as non-convex optimization and tensor decompositions.

Awards & honors

  • Collaborative Research: Transferable, Hierarchical, Expressiv…
  • NSF CAREER: Optimization Landscape for Non-convex Functions…