
Haewon Jeong
Assistant Professor · University of California, Santa Barbara · Electrical and Computer Engineering
Active 2002–2026
About
Haewon Jeong is an Assistant Professor at the University of California, Santa Barbara (UCSB) in the Electrical and Computer Engineering (ECE) department and is also affiliated with the Computer Science (CS) department. He holds a Ph.D. from Carnegie Mellon University and a Bachelor of Engineering degree from KAIST. Professor Jeong is the principal investigator of the Jeong Lab, which focuses on advancing the fundamental understanding of machine learning algorithms and applying data-driven approaches to solve challenging scientific problems. The lab's research centers around three main themes: reliable computing for machine learning, responsible AI, and machine learning for science. These themes encompass areas such as algorithmic fairness, generative models, information theory, differential privacy, distributed and fault-tolerant computing, quantum computing, and AI applications in scientific domains like astrophysics. Professor Jeong actively contributes to the academic community through teaching courses including Deep Generative Models, Machine Learning, Ethics for Machine Learning, and Probability & Statistics. He also engages in public discourse on ethical AI, exemplified by his public talk at the Pacific Views Lecture Series titled "Ethical AI: Serving Humanity or Falling Short?". His lab fosters collaboration and education, exemplified by initiatives such as the REAL AI Bootcamp for high school students and partnerships with programs like UCSB's School of Scientific Thought.
Research topics
- Biology
- Biochemistry
- Computer Science
- Chemistry
- Microbiology
- Chromatography
- Distributed computing
- Operating system
- Computer architecture
- Botany
- Computational science
- Cell biology
- Medicine
Selected publications
arXiv (Cornell University) · 2026-01-20
article · Open access · Senior author
Fairness and privacy are two vital pillars of trustworthy machine learning. Despite extensive research on these individual topics, their relationship has received significantly less attention. In this paper, we utilize an information-theoretic measure Chernoff Information to characterize the fundamental trade-off between fairness, privacy, and accuracy, as induced by the input data distribution. We first propose Chernoff Difference, a notion of data fairness, along with its noisy variant, Noisy Chernoff Difference, which allows us to analyze both fairness and privacy simultaneously. Through simple Gaussian examples, we show that Noisy Chernoff Difference exhibits three qualitatively distinct behaviors depending on the underlying data distribution. To extend this analysis beyond synthetic settings, we develop the Chernoff Information Neural Estimator (CINE), the first neural network-based estimator of Chernoff Information for unknown distributions. We apply CINE to analyze the Noisy Chernoff Difference on real-world datasets. Together, this work fills a critical gap in the literature by providing a principled, data-dependent characterization of the fairness-privacy interaction.
LSTM-Based Network Intrusion Detection System and Solving Data Imbalance Problem Through GAN
2025-02-18 · 5 citations
article · 1st author · Corresponding
With the increasing sophistication and variety of cyberattacks in network environments, traditional rule-based intrusion detection systems have proven insufficient to address advanced threats such as Advanced Persistent Threats (APTs). This study presents an LSTM-based network intrusion detection model that incorporates GAN-based oversampling to address the class imbalance issue commonly found in network traffic data sets. By generating synthetic attack samples, the proposed model aims to enhance the anomaly detection performance. Comparative experiments with alternative approaches, including SMOTE and One-Class SVM, demonstrate the strengths and weaknesses of GAN-based oversampling for intrusion detection.
Gone with the Bits: Revealing Racial Bias in Low-Rate Neural Compression for Facial Images
2025-06-22
article · Senior author
Neural compression methods are gaining popularity due to their superior rate-distortion performance over traditional methods, even at extremely low bitrates below 0.1 bpp. As deep learning models, they are prone to bias during the training, potentially leading to unfair outcomes for individuals in different groups. In this paper, we present a scalable framework for evaluating bias in 9 neural image compression models. We first demonstrate that traditional distortion metrics are ineffective in capturing bias in these models. Next, we highlight that racial bias is present in all neural compression models and can be captured by examining facial phenotype degradation in image reconstructions. Finally, we show that utilizing a racially balanced training set can reduce bias but is not a sufficient bias mitigation strategy, since the bias can be attributed to both compression model bias and classification model bias. We believe that this work is a first step towards evaluating and eliminating bias in neural image compression models.
Differentially Private Distributed Mean Estimation with Constrained User Correlations
2025-06-22 · 1 citation
article · Senior author
In differentially private distributed mean estimation (DP-DME), a central server computes the mean of vectors distributed across $n$ users while preserving differential privacy (DP). DP-DME has been studied under various DP models, with distributed DP with secure aggregation and local DP (LDP) being the main models that do not rely on a trusted third party. Distributed DP-based schemes leverage correlated noise among users to achieve higher accuracy than LDP-based schemes, where users operate independently. However, the accuracy of distributed DP comes at the cost of higher communication overhead for generating correlated noise and complex multiround protocols to handle dropouts. In this work, we analyze the communication-accuracy trade-off in distributed DP-DME under arbitrary communication constraints, and propose a method to generate correlated noise strategically within these constraints to enable single-round dropout handling. Our results show that the communication costs of existing distributed DP-DME approaches can be substantially reduced with minimal impact on accuracy.
Gone With the Bits: Revealing Racial Bias in Low-Rate Neural Compression for Facial Images
2025-06-23
article · Open access · Senior author
Neural compression methods are gaining popularity due to their superior rate-distortion performance over traditional methods, even at extremely low bitrates below 0.1 bpp. As deep learning architectures, these models are prone to bias during the training process, potentially leading to unfair outcomes for individuals in different groups. In this paper, we present a general, structured, scalable framework for evaluating bias in neural image compression models. Using this framework, we investigate racial bias in neural compression algorithms by analyzing nine popular models and their variants. Through this investigation, we first demonstrate that traditional distortion metrics are ineffective in capturing bias in neural compression models. Next, we highlight that racial bias is present in all neural compression models and can be captured by examining facial phenotype degradation in image reconstructions. We then examine the relationship between bias and realism in the decoded images and demonstrate a trade-off across models. Finally, we show that utilizing a racially balanced training set can reduce bias but is not a sufficient bias mitigation strategy. We additionally show the bias can be attributed to compression model bias and classification model bias. We believe that this work is a first step towards evaluating and eliminating bias in neural image compression models.
Alzheimer's Research & Therapy · 2025-08-12 · 1 citation
article · Open access
BACKGROUND: Alzheimer's disease (AD) is characterized by cognitive decline, amyloid-beta (Aβ) accumulation, and tau hyperphosphorylation. Effective therapies remain limited; therefore, recent studies have explored microRNAs as potential therapeutic targets. METHODS: miR-4536-3p inhibition was investigated using in vitro (SH-SY5Y cells) and in vivo (5xFAD mouse) AD models. Apoptosis, neuronal markers, and signaling pathways were assessed through functional assays. Cognitive effects were evaluated via the Morris water maze. RESULTS: miR-4536-3p inhibition increased the expression of Drebrin1 (DBN1), a key regulator of synaptic plasticity, and reduced Aβ deposition, tau phosphorylation, and apoptosis. The treatment improved neuronal marker levels and significantly enhanced the spatial learning and memory of 5xFAD mice. Mechanistically, miR-4536-3p inhibition activated the PI3K/Akt/GSK3β signaling pathway, suppressing apoptosis and mitigating AD pathology. CONCLUSION: miR-4536-3p inhibition offers a promising therapeutic strategy for AD by restoring DBN1 expression, reducing neurodegeneration, and improving cognitive outcomes through PI3K/Akt pathway modulation.
Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation
2025-04-09 · 2 citations
article · Senior author
Differentially private distributed mean estimation (DP-DME) is a fundamental building block in privacy-preserving federated learning, where a central server estimates the mean of d-dimensional vectors held by n users while ensuring $(\epsilon, \delta)$-DP. Local differential privacy (LDP) and distributed DP with secure aggregation (SA) are the most common notions of DP used in DP-DME settings with an untrusted server. LDP provides strong resilience to dropouts, colluding users, and adversarial attacks, but suffers from poor utility. In contrast, SA-based DP-DME achieves an $O(n)$ utility gain over LDP in DME, but requires increased communication and computation overheads and complex multi-round protocols to handle dropouts and attacks. In this work, we present a generalized framework for DP-DME, that captures LDP and SA-based mechanisms as extreme cases. Our framework provides a foundation for developing and analyzing a variety of DP-DME protocols that leverage correlated privacy mechanisms across users. To this end, we propose CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism, that spans the gap between DME with LDP and distributed DP. We prove that CorDP-DME offers a favorable balance between utility and resilience to dropout and collusion. We provide an information-theoretic analysis of CorDP-DME, and derive theoretical guarantees for utility under any given privacy parameters and dropout/colluding user thresholds. Our results demonstrate that (anti) correlated Gaussian DP mechanisms can significantly improve utility in mean estimation tasks compared to LDP - even in adversarial settings - while maintaining better resilience to dropouts and attacks compared to distributed DP.
When Machine Learning Gets Personal: Evaluating Prediction and Explanation
ArXiv.org · 2025-02-05
preprint · Open access
In high-stakes domains like healthcare, users often expect that sharing personal information with machine learning systems will yield tangible benefits, such as more accurate diagnoses and clearer explanations of contributing factors. However, the validity of this assumption remains largely unexplored. We propose a unified framework to quantify how personalizing a model influences both prediction and explanation. We show that its impacts on prediction and explanation can diverge: a model may become more or less explainable even when prediction is unchanged. For practical settings, we study a standard hypothesis test for detecting personalization effects on demographic groups. We derive a finite-sample lower bound on its probability of error as a function of group sizes, number of personal attributes, and desired benefit from personalization. This provides actionable insights, such as which dataset characteristics are necessary to test an effect, or the maximum effect that can be tested given a dataset. We apply our framework to real-world tabular datasets using feature-attribution methods, uncovering scenarios where effects are fundamentally untestable due to the dataset statistics. Our results highlight the need for joint evaluation of prediction and explanation in personalized models and the importance of designing models and datasets with sufficient information for such evaluation.
Research Square · 2025-06-03
preprint · Open access · 1st author · Corresponding
Frequent coauthors
- 21 shared
Pulkit Grover
Carnegie Mellon University
- 14 shared
Viveck R. Cadambe
- 11 shared
Flávio P. Calmon
Harvard University
- 9 shared
Sanghamitra Dutta
North East Institute of Science and Technology
- 7 shared
Yaoqing Yang
- 6 shared
Ateet Devulapalli
- 6 shared
Tze Meng Low
Carnegie Mellon University
- 4 shared
Farzin Haddadpour
Labs
Develops generative and trustworthy AI methods for physics, cosmology, and complex systems, bridging machine learning and the natural sciences.