Joseph Chen
Verified · University of California, San Diego · Astronomy and Astrophysics
Active 1991–2026
Research topics
- Immunology
- Cell biology
- Genetics
- Biology
- Endocrinology
Selected publications
CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis
arXiv (Cornell University) · 2026-02-12
article · Open access
Single-cell RNA-seq (scRNA-seq) enables atlas-scale profiling of complex tissues, revealing rare lineages and transient states. Yet, assigning biologically valid cell identities remains a bottleneck because markers are tissue- and state-dependent, and novel states lack references. We present CellMaster, an AI agent that mimics expert practice for zero-shot cell-type annotation. Unlike existing automated tools, CellMaster leverages LLM-encoded knowledge (e.g., GPT-4o) to perform on-the-fly annotation with interpretable rationales, without pre-training or fixed marker databases. Across 9 datasets spanning 8 tissues, CellMaster improved accuracy by 7.1% over best-performing baselines (including CellTypist and scTab) in automatic mode. With human-in-the-loop refinement, this advantage increased to 18.6%, with a 22.1% gain on subtype populations. The system demonstrates particular strength in rare and novel cell states where baselines often fail. Source code and the web application are available at https://github.com/AnonymousGym/CellMaster.
scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
arXiv (Cornell University) · 2026-02-12
article · Open access
We present scPilot, the first systematic framework to practice omics-native reasoning: a large language model (LLM) converses in natural language while directly inspecting single-cell RNA-seq data and on-demand bioinformatics tools. scPilot converts core single-cell analyses, i.e., cell-type annotation, developmental-trajectory reconstruction, and transcription-factor targeting, into step-by-step reasoning problems that the model must solve, justify, and, when needed, revise with new evidence. To measure progress, we release scBench, a suite of 9 expertly curated datasets and graders that faithfully evaluate the omics-native reasoning capability of scPilot with respect to various LLMs. Experiments with o1 show that iterative omics-native reasoning lifts average accuracy by 11% for cell-type annotation and Gemini-2.5-Pro cuts trajectory graph-edit distance by 30% versus one-shot prompting, while generating transparent reasoning traces that explain marker gene ambiguity and regulatory logic. By grounding LLMs in raw omics data, scPilot enables auditable, interpretable, and diagnostically informative single-cell analyses. Code, data, and package are available at https://github.com/maitrix-org/scPilot
CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis
Open MIND · 2026-02-12
preprint
scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
Open MIND · 2026-02-12
preprint
bioRxiv (Cold Spring Harbor Laboratory) · 2025-08-29
preprint · Open access · Senior author · Corresponding
BACKGROUND: Activating Transcription Factor 4 (ATF4) functions as a transcriptional regulator in various cell types and tissues under both physiological and pathological conditions. While previous studies have linked ATF4 activation with promoting cardiomyocyte (CM) death in dilated cardiomyopathy (DCM), atrial fibrillation, and heart failure, its role in developing CMs remains unexplored. METHODS: We generated multiple distinct CM-specific (Atf4 cKO(e2/3/pA) and Atf4 cKO(e2)) and global Atf4 knockout (Atf4 7del/7del and Atf4 1ins/1ins) mouse models targeting different Atf4 regions, as well as cardiomyocyte-specific deletion of Rps19bp1 to study cardiac phenotypes. Detailed morphological and molecular analyses were performed. RESULTS: Atf4 cKO(e2/3/pA) (targeting exon 2-3 including the polyadenylation signal (polyA)) mice exhibited severe cardiac defects and died before E17.5, likely due to ectopic activation of the p53 signaling pathway resulting from downregulation of Rps19bp1, a potent suppressor of p53. Further investigation revealed that deleting the polyA signal of Atf4 in Atf4 cKO(e2/3/pA) mice led to transcriptional readthrough, resulting in the formation of an Atf4-Cacna1i fusion transcript and Rps19bp1 downregulation. To avoid readthrough while abolishing ATF4 function, we introduced small indels into exon 3 of Atf4 in mice (Atf4 7del/7del and Atf4 1ins/1ins), which showed normal Rps19bp1 expression and cardiac morphology. Importantly, CM-specific deletion of Rps19bp1 recapitulated the cardiac defects and transcriptional change seen in Atf4 cKO(e2/3/pA) mice. CONCLUSIONS: We found that the downregulation of Rps19bp1, not loss of ATF4 function, underlies the cardiac phenotypes in Atf4 cKO(e2/3/pA) mice. The reduced expression of Rps19bp1 in Atf4 cKO(e2/3/pA) mice is likely due to the unintentional deletion of the Atf4 polyA signal and subsequent transcriptional readthrough, underscoring the essential role of RPS19BP1, not ATF4, in cardiac development. Consistent Rps19bp1 downregulation has been observed in other tissue-specific Atf4 knockout models utilizing the Atf4 fl(e2/3/pA) allele, suggesting that previously reported Atf4 KO phenotypes may result from Atf4 transcriptional readthrough effects. These findings reveal a locus-dependent transcriptional interference mechanism and emphasize the importance of avoiding confounding cis effects in genetically engineered models. TRANSLATIONAL PERSPECTIVE: Our findings clarify ATF4’s role in heart development by showing that cardiac defects in cardiomyocyte-specific ATF4 knockout mice, generated using a widely employed floxed ATF4 line, result from unintended downregulation of RPS19BP1 caused by transcriptional readthrough. This shifts the focus from ATF4 to RPS19BP1, a key regulator of p53 activity, as a potential driver of cardiac developmental abnormalities. Clinically, these insights caution against misinterpretation of genetic knockout models and highlight RPS19BP1 as a promising target for congenital heart disease and related cardiac dysfunctions, with potential implications for future therapies.
Cardiovascular Research · 2025-11-14 · 2 citations
article · Open access · Senior author
AIMS: Activating transcription factor 4 (ATF4) functions as a transcriptional regulator in various cell types and tissues under both physiological and pathological conditions. While previous studies have linked ATF4 activation with promoting cardiomyocyte (CM) death in dilated cardiomyopathy (DCM), atrial fibrillation, and heart failure, its role in developing CMs remains unexplored. METHODS AND RESULTS: We generated multiple distinct CM-specific (Atf4 cKO(e2/3/pA) and Atf4 cKO(e2)) and global Atf4 knockout (KO; Atf4 7del/7del and Atf4 1ins/1ins) mouse models targeting different Atf4 regions, as well as CM-specific deletion of Rps19bp1 to study cardiac phenotypes. Detailed morphological and molecular analyses were performed. Atf4 cKO(e2/3/pA) [targeting exon 2-3 including the polyadenylation signal (polyA)] mice exhibited severe cardiac defects and died before E17.5, likely due to ectopic activation of the p53 signaling pathway resulting from Rps19bp1 downregulation, a potent suppressor of p53. Further investigation revealed that deleting the polyA signal of Atf4 in Atf4 cKO(e2/3/pA) mice led to transcriptional readthrough, resulting in the formation of an Atf4-Cacna1i fusion transcript and Rps19bp1 downregulation. To avoid readthrough while abolishing ATF4 function, we introduced small indels into exon 3 of Atf4 in mice (Atf4 7del/7del and Atf4 1ins/1ins), which showed normal Rps19bp1 expression and cardiac morphology. Importantly, CM-specific deletion of Rps19bp1 recapitulated the cardiac defects and transcriptional change seen in Atf4 cKO(e2/3/pA) mice. CONCLUSION: We found that the downregulation of Rps19bp1, not the loss of ATF4 function, underlies the cardiac phenotypes in Atf4 cKO(e2/3/pA) mice. The reduced expression of Rps19bp1 in Atf4 cKO(e2/3/pA) mice is likely due to the unintentional deletion of Atf4 polyA signal and subsequent transcriptional readthrough, underscoring the essential role of RPS19BP1, not ATF4, in cardiac development. Consistent Rps19bp1 downregulation has been observed in other tissue-specific Atf4 KO models utilizing the Atf4 fl(e2/3/pA) allele, suggesting that previously reported Atf4 KO phenotypes may result from Atf4 transcriptional readthrough effects. These findings reveal a locus-dependent transcriptional interference mechanism and emphasize the importance of avoiding confounding cis effects in genetically engineered models.
Separation and Purification Technology · 2025-11-06 · 3 citations
article · 1st author
FOLDER: Accelerating Multi-Modal Large Language Models with Enhanced Performance
2025-10-19
preprint · Open access
Recently, Multi-modal Large Language Models (MLLMs) have shown remarkable effectiveness for multi-modal tasks due to their abilities to generate and understand cross-modal data. However, processing long sequences of visual tokens extracted from visual backbones poses a challenge for deployment in real-time applications. To address this issue, we introduce FOLDER, a simple yet effective plug-and-play module designed to reduce the length of the visual token sequence, mitigating both computational and memory demands during training and inference. Through a comprehensive analysis of the token reduction process, we analyze the information loss introduced by different reduction strategies and develop FOLDER to preserve key information while removing visual redundancy. We showcase the effectiveness of FOLDER by integrating it into the visual backbone of several MLLMs, significantly accelerating the inference phase. Furthermore, we evaluate its utility as a training accelerator or even performance booster for MLLMs. In both contexts, FOLDER achieves comparable or even better performance than the original models, while dramatically reducing complexity by removing up to 70% of visual tokens.
Clinical and Experimental Medicine · 2025-08-11 · 3 citations
article · Open access
Presently, no specific therapies have been recognized for immunoglobulin A nephropathy (IgAN). Mycophenolate mofetil (MMF) has been shown to be effective for Chinese patients with IgAN. Telitacicept is a fully human TACI-Fc fusion protein that prevents B-cell maturation and activation, and it has been proven to be beneficial for IgAN in a phase II clinical trial. This study was designed to observe the efficacy and safety of telitacicept plus low-dose MMF for IgAN treatment. This retrospective cohort study included 24 patients with IgAN who were treated with telitacicept plus MMF. The primary outcome was defined as the change in proteinuria and estimated glomerular filtration rate (eGFR). The secondary outcome was the change in hematuria. The mean follow-up time was 23 months. The median baseline proteinuria was 2.5 (1.74, 6.58) g/d, and eGFR was 94.97 (56.8, 120.67) mL/min/1.73 m². There were noteworthy reductions in proteinuria at 3, 6, 9, 12, 15, 18, 21 and 24 months when compared to the baseline levels [1.45 (0.78, 1.8) g/d [p = 0.0122], 0.505 (0.26, 0.99) g/d [p < 0.0001], 0.48 (0.28, 0.76) g/d [p < 0.0001], 0.3 (0.17, 0.85) g/d [p < 0.0001], 0.23 (0.18, 0.575) g/d [p < 0.0001], 0.18 (0.12, 0.325) g/d [p < 0.0001], 0.14 (0.105, 0.22) g/d [p < 0.0001] and 0.14 (0.103, 0.278) g/d [p < 0.0001]]. All patients maintained stable eGFR throughout follow-up. In addition, telitacicept plus MMF remarkably alleviated hematuria. Telitacicept plus MMF treatment led not only to clinically significant reductions in proteinuria and hematuria but also to stable serum creatinine values in patients with IgAN, without adverse side effects.
Journal of Building Engineering · 2025-08-05 · 1 citation
article · Corresponding
Recent grants
BAG3 in Cardiac function and disease
NIH · $1.6M · 2016–2019
Training in Cardiovascular Physiology and Pharmacology
NIH · $13.3M · 1979–2028
NIH · $36.2M · 2014
NIH · $5.3M · 2016
NIH · $1.3M · 2017
Frequent coauthors
- 119 shared
Sylvia M. Evans
University of California, San Diego
- 117 shared
Nancy D. Dalton
- 106 shared
Kirk L. Peterson
University of California, San Diego
- 85 shared
Yusu Gu
University of California, San Diego
- 62 shared
Julius Bogomolovas
- 57 shared
Kenneth R. Chien
Karolinska Institutet
- 54 shared
Kirk U. Knowlton
Intermountain Medical Center
- 53 shared
Paola Cattaneo