
Wenxin Jiang
Professor of Statistics and Data Science · Northwestern University · Statistics
Active 1997–2025
About
Wenxin Jiang is a Professor of Statistics and Data Science at Northwestern University. He earned his Ph.D. in 1996 from Cornell University. His research interests include mathematical statistics, biostatistics, data mining, Bayesian statistics, econometrics, and statistical applications in social sciences. Jiang has contributed to the field through various publications, including work on ecological regression with partial identification, asymptotic distributions and confidence intervals for LIFT measures in data mining, and inequalities for Gibbs posterior with nonadditive empirical risk. His academic and research activities focus on advancing statistical methodologies and their applications across diverse domains.
Research topics
- Computer Science
- Genetics
- Artificial Intelligence
- Biology
- Medicine
- Pathology
- Algorithm
- Chemistry
- Molecular biology
- Microbiology
- Virology
- Mathematics
- Chromatography
- Mathematical optimization
- Theoretical computer science
Selected publications
ArXiv.org · 2025-07-28
preprint · Open access · 1st author · Corresponding
Valley polarization and altermagnetism are two emerging fundamental phenomena in condensed matter physics, offering unprecedented opportunities for information encoding and processing in novel energy-efficient devices. By coupling valley and spin degrees of freedom with ferroic orders such as ferroelectricity, nonvolatile memory functionalities can be achieved. Here, we propose a way to realize ferroelectric-valley (FE-valley) and FE-altermagnetic coupling in bilayer antiferromagnetic (AFM) honeycomb lattices based on an effective four-band spin-full $k\cdot p$ model. Our proposal is validated in bilayer MnPTe$_3$ through first-principles calculations. A spontaneous out-of-plane electric polarization occurs in the AB- (BA-) stacking configuration, which is reversibly switchable via interlayer sliding. Remarkably, polarization reversal simultaneously inverts both layer-resolved valley polarization and altermagnetic spin splitting. This dual control enables tunable layer-spin-locked anomalous valley Hall effects and an unprecedented magnetoelectric response in 2D antiferromagnets. Our work establishes a general paradigm for electrically programmable valleytronic and spintronic functionalities of 2D AFM materials.
PickleBall: Secure Deserialization of Pickle-based Machine Learning Models (Extended Report)
ArXiv.org · 2025-08-21
preprint · Open access
Machine learning model repositories such as the Hugging Face Model Hub facilitate model exchanges. However, bad actors can deliver malware through compromised models. Existing defenses such as safer model formats, restrictive (but inflexible) loading policies, and model scanners have shortcomings: 44.9% of popular models on Hugging Face still use the insecure pickle format, 15% of these cannot be loaded by restrictive loading policies, and model scanners have both false positives and false negatives. Pickle remains the de facto standard for model exchange, and the ML community lacks a tool that offers transparent safe loading. We present PickleBall to help machine learning engineers load pickle-based models safely. PickleBall statically analyzes the source code of a given machine learning library and computes a custom policy that specifies a safe load-time behavior for benign models. PickleBall then dynamically enforces the policy during load time as a drop-in replacement for the pickle module. PickleBall generates policies that correctly load 79.8% of benign pickle-based models in our dataset, while rejecting all (100%) malicious examples in our dataset. In comparison, evaluated model scanners fail to identify known malicious models, and the state-of-the-art loader loads 22% fewer benign models than PickleBall. PickleBall removes the threat of arbitrary function invocation from malicious pickle-based models, raising the bar for attackers to depend on code reuse techniques.
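The idea of enforcing a load-time policy can be illustrated with the allowlist pattern from the Python standard library, which overrides `pickle.Unpickler.find_class`. This is a minimal sketch and not PickleBall's implementation: the `ALLOWED_GLOBALS` set, `AllowlistUnpickler`, and `safe_loads` are hypothetical names, and the policy here is written by hand rather than computed by static analysis.

```python
import io
import pickle

# Hypothetical hand-written policy: the only (module, name) globals the
# loader may resolve. A real policy would be derived per library.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allowlist."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize bytes while enforcing the allowlist above."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

A payload whose `__reduce__` points at a function outside the allowlist raises `pickle.UnpicklingError` instead of invoking it, while an allowed `OrderedDict` loads normally.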
Nano Letters · 2025-10-31 · 2 citations
article
The in-plane anomalous Hall effect (IPAHE) and magneto-optical Kerr effect (MOKE) have emerged as crucial functionalities in spintronics, yet their realization and control in two-dimensional (2D) magnetic systems remain challenging due to stringent symmetry constraints. In this study, based on symmetry analysis and first-principles calculations, we explore a general framework to achieve and modulate IPAHE and MOKE in 2D magnetic bilayers via interlayer sliding and spin-orientation engineering. Using ferromagnetic (FM) CrPSe4 and antiferromagnetic (AFM) MPSe3 (M = Mn and Cr) as prototype systems, we demonstrate that the modification of the stacking order and spin orientation can selectively manipulate symmetries, controlling the presence and sign of IPAHE and MOKE. Our findings establish a symmetry-protected coupling between spin, stacking order, and electronic response, providing a practical approach to achieve tunable IPAHE/MOKE. This work opens promising avenues for the development of next-generation magneto-optical devices and spintronic memory applications with enhanced functionality.
A chain mediation model of parent child relationship and academic burnout of adolescents
Scientific Reports · 2025-03-03 · 5 citations
article · Open access
This paper studied the relationships and mechanisms linking parent-child relationship, interpersonal relationship on campus, academic self-efficacy, and academic burnout among adolescents. A study of 913 Chinese junior high school students from Fujian province (47.20% males, mean age = 13.99 years, SD = 0.81) was conducted using the Junior Middle School Students' Learning Weariness Scale, the Chinese version of parent-child affinity scale, the Loso Wellbeing Questionnaire, and the Academic Self-efficacy Questionnaire. (1) Academic burnout was negatively and significantly correlated with parent-child relationship (r = -0.13, p < 0.01), interpersonal relationship on campus (r = -0.11, p < 0.01), and academic self-efficacy (r = -0.13, p < 0.01). Parent-child relationship was positively and significantly correlated with interpersonal relationship on campus (r = 0.23, p < 0.01) and academic self-efficacy (r = 0.38, p < 0.01). Interpersonal relationship and academic self-efficacy were positively and significantly correlated (r = 0.29, p < 0.01). (2) Parent-child relationship can significantly and negatively predict academic burnout (β = -0.082, p < 0.05). (3) Parent-child relationship affected academic burnout of adolescents via three significant indirect effects: the single mediating effect of interpersonal relationship on campus (effect = -0.011) and academic self-efficacy (effect = -0.019), and the chain mediating effect of interpersonal relationship on campus and academic self-efficacy (effect = -0.003). Stronger parent-child relationship predicts lower levels of academic burnout. Moreover, parent-child relationship can indirectly affect academic burnout not only through the single mediating effect of interpersonal relationship on campus and academic self-efficacy but also through the chain mediating effect of interpersonal relationship on campus and academic self-efficacy.
Confidence Intervals for Evaluation of Data Mining
ArXiv.org · 2025-02-10
preprint · Open access · Senior author
In data mining, when binary prediction rules are used to predict a binary outcome, many performance measures are used in a vast array of literature for the purposes of evaluation and comparison. Some examples include classification accuracy, precision, recall, F measures, and Jaccard index. Typically, these performance measures are only approximately estimated from a finite dataset, which may lead to findings that are not statistically significant. In order to properly quantify such statistical uncertainty, it is important to provide confidence intervals associated with these estimated performance measures. We consider statistical inference about general performance measures used in data mining, with both individual and joint confidence intervals. These confidence intervals are based on asymptotic normal approximations and can be computed quickly, without the need for bootstrap resampling. We study the finite sample coverage probabilities for these confidence intervals and also propose a 'blurring correction' on the variance to improve the finite sample performance. This 'blurring correction' generalizes the plus-four method from binomial proportion to general performance measures used in data mining. Our framework allows multiple performance measures of multiple classification rules to be inferred simultaneously for comparisons.
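For the simplest performance measure, classification accuracy, the normal-approximation interval and the classical plus-four adjustment mentioned in the abstract can be sketched as follows. This shows only the textbook binomial case, not the paper's 'blurring correction' for general measures; the function names and the example counts are illustrative assumptions.

```python
import math

Z95 = 1.959963984540054  # 97.5% quantile of the standard normal


def wald_interval(correct, total, z=Z95):
    """Plain asymptotic-normal interval for a proportion such as accuracy."""
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p - half, p + half


def plus_four_interval(correct, total, z=Z95):
    """Plus-four interval: add two successes and two failures, then apply
    the same normal approximation; the result is clipped to [0, 1]."""
    n = total + 4
    p = (correct + 2) / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)
```

With 50 correct predictions out of 50, the plain interval collapses to the single point 1.0, whereas the plus-four interval keeps a nonzero width, which is the finite-sample problem the adjustment targets.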
AgentHub: A Registry for Discoverable, Verifiable, and Reproducible AI Agents
ArXiv.org · 2025-10-03
preprint · Open access
LLM-based agents are rapidly proliferating, yet the infrastructure for discovering, evaluating, and governing them remains fragmented compared to mature ecosystems like software package registries (e.g., npm) and model hubs (e.g., Hugging Face). Existing efforts typically address naming, distribution, or protocol descriptors, but stop short of providing a registry layer that makes agents discoverable, comparable, and governable under automated reuse. We present AgentHub, a registry layer and accompanying research agenda for agent sharing that targets discovery and workflow integration, trust and security, openness and governance, ecosystem interoperability, lifecycle transparency, and capability clarity with evidence. We describe a reference prototype that implements a canonical manifest with publish-time validation, version-bound evidence records linked to auditable artifacts, and an append-only lifecycle event log whose states are respected by default in search and resolution. We also provide initial discovery results using an LLM-as-judge recommendation pipeline, showing how structured contracts and evidence improve intent-accurate retrieval beyond keyword-driven discovery. AgentHub aims to provide a common substrate for building reliable, reusable agent ecosystems.
Deacetylation-induced aggregates of konjac glucomannan: Evaluating its potential as food emulsifier
Food Hydrocolloids · 2025-06-19
article · 1st author
Lubricating performance of pectin: Influence from the colloidal structures
Food Hydrocolloids · 2025-11-20 · 2 citations
article · 1st author
Frontiers in Genetics · 2024-05-23 · 3 citations
article · Open access
Background: Pre-eclampsia is a pregnancy-related disorder characterized by hypertension and proteinuria, severely affecting the health and quality of life of patients. However, the molecular mechanism of macrophages in pre-eclampsia is not well understood.
Methods: In this study, the key biomarkers during the development of pre-eclampsia were identified using bioinformatics analysis. The GSE75010 and GSE74341 datasets from the GEO database were obtained and merged for differential analysis. A weighted gene co-expression network analysis (WGCNA) was constructed based on macrophage content, and machine learning methods were employed to identify key genes. Immune infiltration analysis was completed using the CIBERSORT method, the R package "ClusterProfiler" was used to explore functional enrichment of the intersection genes, and potential drug predictions were conducted using the CMap database. Lastly, independent analysis of protein levels, localization, and quantitative analysis was performed on placental tissues collected from both pre-eclampsia patients and healthy control groups.
Results: We identified 70 differentially expressed NETs genes and found 367 macrophage-related genes through WGCNA analysis. Machine learning identified three key genes: FNBP1L, NMUR1, and PP14571. These three key genes were significantly associated with immune cell content and enriched in multiple signaling pathways. Specifically, these genes were upregulated in PE patients. These findings establish the expression patterns of three key genes associated with M2 macrophage infiltration, providing potential targets for understanding the pathogenesis and treatment of PE. Additionally, CMap results suggested four potential drugs, including Ttnpb, Doxorubicin, Tyrphostin AG 825, and Tanespimycin, which may have the potential to reverse pre-eclampsia.
Conclusion: Studying the expression levels of three key genes in pre-eclampsia provides valuable insights into the prevention and treatment of this condition. We propose that these genes play a crucial role in regulating the maternal-fetal immune microenvironment in PE patients, and the pathways associated with these genes offer potential avenues for exploring the molecular mechanisms underlying preeclampsia and identifying therapeutic targets. Additionally, by utilizing the Connectivity Map database, we identified drug targets like Ttnpb, Doxorubicin, Tyrphostin AG 825, and Tanespimycin as potential clinical treatments for preeclampsia.
Automatic Parallel Tempering Markov Chain Monte Carlo with Nii-C
arXiv (Cornell University) · 2024-07-13
preprint · Open access
Due to the high dimensionality or multimodality that is common in modern astronomy, sampling Bayesian posteriors can be challenging. Several publicly available codes based on different sampling algorithms can solve these complex models, but the execution of the code is not always efficient or fast enough. The article introduces a C language general-purpose code, Nii-C (https://github.com/shengjin/nii-c.git), that implements a framework of Automatic Parallel Tempering Markov Chain Monte Carlo. Automatic in this context means that the parameters that ensure an efficient parallel tempering process can be set by a control system during the initial stages of a sampling process. The auto-tuned parameters consist of two parts, the temperature ladders of all parallel tempering Markov chains and the proposal distributions for all model parameters across all parallel tempering chains. In order to reduce dependencies in the compilation process and increase the code's execution speed, Nii-C code is constructed entirely in the C language and parallelised using the Message-Passing Interface protocol to optimise the efficiency of parallel sampling. These implementations facilitate rapid convergence in the sampling of high-dimensional and multi-modal distributions, as well as expeditious code execution time. The Nii-C code can be used in various research areas to trace complex distributions due to its high sampling efficiency and quick execution speed. This article presents a few applications of the Nii-C code.
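The basic parallel tempering scheme that such a code automates can be sketched in a few lines. This is a generic single-process illustration in Python with a fixed, hand-chosen temperature ladder and proposal scale; it is not Nii-C's implementation and does none of the auto-tuning the abstract describes. The bimodal `log_target` and all function names are assumptions made for the example.

```python
import math
import random


def log_target(x):
    """Hypothetical bimodal target: unnormalised mixture of N(-4,1) and N(4,1)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def swap_log_ratio(beta_i, beta_j, logp_i, logp_j):
    """Log acceptance ratio for exchanging the states of chains i and j."""
    return (beta_i - beta_j) * (logp_j - logp_i)


def parallel_tempering(log_prob, betas, n_steps, step=1.0, seed=0):
    """Run tempered chains with inverse temperatures `betas` (betas[0] = 1 is
    the cold chain) and return the cold chain's samples."""
    rng = random.Random(seed)
    states = [0.0] * len(betas)
    logps = [log_prob(x) for x in states]
    cold = []
    for _ in range(n_steps):
        # Random-walk Metropolis update of each chain on log_prob * beta.
        for k, beta in enumerate(betas):
            prop = states[k] + rng.gauss(0.0, step / math.sqrt(beta))
            lp = log_prob(prop)
            if math.log(1.0 - rng.random()) < beta * (lp - logps[k]):
                states[k], logps[k] = prop, lp
        # Propose exchanging one randomly chosen adjacent pair of chains.
        k = rng.randrange(len(betas) - 1)
        ratio = swap_log_ratio(betas[k], betas[k + 1], logps[k], logps[k + 1])
        if math.log(1.0 - rng.random()) < ratio:
            states[k], states[k + 1] = states[k + 1], states[k]
            logps[k], logps[k + 1] = logps[k + 1], logps[k]
        cold.append(states[0])
    return cold
```

Calling `parallel_tempering(log_target, [1.0, 0.5, 0.25], 5000)` returns the cold-chain samples; the hotter chains see a flattened target and pass states down the ladder through the swap step, which is what helps the cold chain move between separated modes.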
Frequent coauthors
- 55 shared
Pingzhao Hu
Western University
- 25 shared
Qin Kuang
Tan Kah Kee Innovation Laboratory
- 24 shared
Svetlana Frenkel
George & Fay Yee Centre for Healthcare Innovation
- 23 shared
Martin A. Tanner
- 17 shared
Stephen W. Scherer
SickKids Foundation
- 15 shared
Charles N. Bernstein
- 13 shared
Gary King
Harvard University Press
- 12 shared
Malik Peiris
University of Hong Kong
Education
- 2009
Ph.D., Statistics
University of California, Berkeley
- 2005
M.S., Statistics
University of California, Berkeley
- 2003
B.A., Mathematics
University of California, Los Angeles