Jie Gao

Associate Professor · Ph.D., 2012, Columbia University · Verified

Stony Brook University · Mechanical Engineering

Active 1999–2025

h-index: 39
Citations: 5.5k
Papers: 245 (63 in the last 5 years)
Funding: $2.4M

About

Exploring enhanced light-matter interactions with optical, thermal, and quantum nanomaterials, structures, and devices. Our multidisciplinary research brings together researchers across the fields of optical engineering, mechanical engineering, electrical engineering, applied physics, and materials science.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Mathematics
  • Combinatorics
  • Metallurgy
  • Materials science
  • Mathematical optimization
  • Discrete mathematics
  • Geometry
  • Theoretical computer science
  • Algorithm
  • Geology
  • Geotechnical engineering
  • Composite material

Selected publications

  • Correlation-aware Online Change Point Detection

    2025-11-07

    article · Open access

    Change point detection aims to identify abrupt shifts occurring at multiple points within a data sequence. This task becomes particularly challenging in the online setting, where different types of change can occur, including shifts in both the marginal and joint distributions of the data. In this paper, we address these challenges by tracking the Riemannian geometry of correlation matrices, allowing Riemannian metrics to compute the geodesic distance as an accurate measure of correlation dynamics.

  • Patterns and drivers of organic carbon sequestration efficiency within soil aggregates under straw return across the croplands of China

    European Journal of Agronomy · 2025-11-12 · 2 citations

    article
  • Understanding Endogenous Data Drift in Adaptive Models with Recourse-Seeking Users

    Proceedings of the AAAI/ACM Conference on AI Ethics and Society · 2025-10-15

    preprint · Open access

    Deep learning models are widely used in decision-making and recommendation systems, where they typically rely on the assumption of a static data distribution between training and deployment. However, real-world deployment environments often violate this assumption. Users who receive negative outcomes may adapt their features to meet model criteria, i.e., recourse action. These adaptive behaviors create shifts in the data distribution and when models are retrained on this shifted data, a feedback loop emerges: user behavior influences the model, and the updated model in turn reshapes future user behavior. Despite its importance, this bidirectional interaction between users and models has received limited attention. In this work, we develop a general framework to model user strategic behaviors and their interactions with decision-making systems under resource constraints and competitive dynamics. Both the theoretical and empirical analyses show that user recourse behavior tends to push logistic and MLP models toward increasingly higher decision standards, resulting in higher recourse costs and less reliable recourse actions over time. To mitigate these challenges, we propose two methods—Fair-top-k and Dynamic Continual Learning (DCL)—which significantly reduce recourse cost and improve model robustness. Our findings draw connections to economic theories, highlighting how algorithmic decision-making can unintentionally reinforce a higher standard and generate endogenous barriers to entry.

  • Maximizing Truth Learning in a Social Network is NP-hard

    arXiv (Cornell University) · 2025-02-18

    preprint · Open access · Senior author

    Sequential learning models situations where agents predict a ground truth in sequence, by using their private, noisy measurements, and the predictions of agents who came earlier in the sequence. We study sequential learning in a social network, where agents only see the actions of the previous agents in their own neighborhood. The fraction of agents who predict the ground truth correctly depends heavily on both the network topology and the ordering in which the predictions are made. A natural question is to find an ordering, with a given network, to maximize the (expected) number of agents who predict the ground truth correctly. In this paper, we show that it is in fact NP-hard to answer this question for a general network, with both the Bayesian learning model and a simple majority rule model. Finally, we show that even approximating the answer is hard.

  • Enhancing Heterogeneous Information Networks Through Student Interactions for Knowledge Tracing

    SSRN Electronic Journal · 2025-01-01

    preprint · Open access · Senior author
  • Randomized Dimensionality Reduction for Euclidean Maximization and Diversity Measures

    ArXiv.org · 2025-05-30

    preprint · Open access · 1st author · Corresponding

    Randomized dimensionality reduction is a widely-used algorithmic technique for speeding up large-scale Euclidean optimization problems. In this paper, we study dimension reduction for a variety of maximization problems, including max-matching, max-spanning tree, max TSP, as well as various measures for dataset diversity. For these problems, we show that the effect of dimension reduction is intimately tied to the doubling dimension $\lambda_X$ of the underlying dataset $X$, a quantity measuring intrinsic dimensionality of point sets. Specifically, we prove that a target dimension of $O(\lambda_X)$ suffices to approximately preserve the value of any near-optimal solution, which we also show is necessary for some of these problems. This is in contrast to classical dimension reduction results, whose dependence increases with the dataset size $|X|$. We also provide empirical results validating the quality of solutions found in the projected space, as well as speedups due to dimensionality reduction.

  • The Discrepancy of Shortest Paths

    arXiv (Cornell University) · 2024-01-01

    preprint · Open access

    The hereditary discrepancy of a set system is a certain quantitative measure of the pseudorandom properties of the system. Roughly, hereditary discrepancy measures how well one can $2$-color the elements of the system so that each set contains approximately the same number of elements of each color. Hereditary discrepancy has well-studied applications, e.g., in communication complexity and derandomization. More recently, the hereditary discrepancy of set systems of shortest paths has found applications in differential privacy [Chen et al. SODA 23]. The contribution of this paper is to improve the upper and lower bounds on the hereditary discrepancy of set systems of unique shortest paths in graphs. In particular, we show that any system of unique shortest paths in an undirected weighted graph has hereditary discrepancy $\widetilde{O}(n^{1/4})$, and we construct lower bound examples demonstrating that this bound is tight up to hidden $\text{polylog } n$ factors. Our lower bounds apply even in the planar and bipartite settings, and they improve on a previous lower bound of $\Omega(n^{1/6})$ obtained by applying the trace bound of Chazelle and Lvov [SoCG'00] to a classical point-line system of Erdős. As applications, we improve the lower bound on the additive error for differentially-private all pairs shortest distances from $\Omega(n^{1/6})$ [Chen et al. SODA 23] to $\Omega(n^{1/4})$, and we improve the lower bound on additive error for the differentially-private all sets range queries problem to $\Omega(n^{1/4})$, which is tight up to hidden $\text{polylog } n$ factors [Deng et al. WADS 23].

  • Community detection in the human connectome: Method types, differences and their impact on inference

    Human Brain Mapping · 2024-03-29 · 6 citations

    article · Open access

    Community structure is a fundamental topological characteristic of optimally organized brain networks. Currently, there is no clear standard or systematic approach for selecting the most appropriate community detection method. Furthermore, the impact of method choice on the accuracy and robustness of estimated communities (and network modularity), as well as method-dependent relationships between network communities and cognitive and other individual measures, are not well understood. This study analyzed large datasets of real brain networks (estimated from resting-state fMRI from n = 5251 pre/early adolescents in the adolescent brain cognitive development [ABCD] study), and n = 5338 synthetic networks with heterogeneous, data-inspired topologies, with the goal to investigate and compare three classes of community detection methods: (i) modularity maximization-based (Newman and Louvain), (ii) probabilistic (Bayesian inference within the framework of stochastic block modeling (SBM)), and (iii) geometric (based on graph Ricci flow). Extensive comparisons between methods and their individual accuracy (relative to the ground truth in synthetic networks), and reliability (when applied to multiple fMRI runs from the same brains) suggest that the underlying brain network topology plays a critical role in the accuracy, reliability and agreement of community detection methods. Consistent method (dis)similarities, and their correlations with topological properties, were estimated across fMRI runs. Based on synthetic graphs, most methods performed similarly and had comparable high accuracy only in some topological regimes, specifically those corresponding to developed connectomes with at least quasi-optimal community organization. In contrast, in densely and/or weakly connected networks with difficult to detect communities, the methods yielded highly dissimilar results, with Bayesian inference within SBM having significantly higher accuracy compared to all others. Associations between method-specific modularity and demographic, anthropometric, physiological and cognitive parameters showed mostly method invariance but some method dependence as well. Although method sensitivity to different levels of community structure may in part explain method-dependent associations between modularity estimates and parameters of interest, method dependence also highlights potential issues of reliability and reproducibility. These findings suggest that a probabilistic approach, such as Bayesian inference in the framework of SBM, may provide consistently reliable estimates of community structure across network topologies. In addition, to maximize robustness of biological inferences, identified network communities and their cognitive, behavioral and other correlates should be confirmed with multiple reliable detection methods.

  • Anemia Preintervention: A Predictive Analytics–Based Clinical Decision Support System

    NEJM Catalyst · 2024-03-14

    article
  • Enabling Asymptotic Truth Learning in a Social Network

    arXiv (Cornell University) · 2024-10-06

    preprint · Open access · Senior author

    Consider a network of agents that all want to guess the correct value of some ground truth state. In a sequential order, each agent makes its decision using a single private signal which has a constant probability of error, as well as observations of actions from its network neighbors earlier in the order. We are interested in enabling network-wide asymptotic truth learning, meaning that in a network of $n$ agents, almost all agents make a correct prediction with probability approaching one as $n$ goes to infinity. In this paper we study both random orderings and carefully crafted decision orders with respect to the graph topology as well as sufficient or necessary conditions for a graph to support such a good ordering. We first show that on a sparse graph of average constant degree with a random ordering asymptotic truth learning does not happen. We then show a rather modest sufficient condition to enable asymptotic truth learning. With the help of this condition we characterize graphs generated from the Erdős-Rényi model and preferential attachment model. In an Erdős-Rényi graph, unless the graph is super sparse (with $O(n)$ edges) or super dense (nearly a complete graph), there exists a decision ordering that supports asymptotic truth learning. Similarly, any preferential attachment network with a constant number of edges per node can achieve asymptotic truth learning under a carefully designed ordering but not under either a random ordering or the arrival order. We also evaluated a variant of the decision ordering on different network topologies and demonstrated clear effectiveness in improving truth learning over random orderings.

