
Jayaram K. Udupa
Verified · University of Pennsylvania · Rehabilitation Medicine
Active 1980–2026
About
Jayaram K. Udupa, Ph.D., is a Professor of Radiologic Science in Radiology and the Chief of the Medical Imaging Section within the Department of Radiology at the University of Pennsylvania. His research expertise encompasses medical imaging, image processing and analysis, 3D visualization, machine learning, deep learning, and hybrid intelligence. He focuses on the application of image analysis to a range of disease conditions and clinical tasks, including cancer, restrictive respiratory conditions, orthopedic, neurological, and cardiovascular disorders, radiation therapy planning, and joint kinematics. Dr. Udupa is associated with the Medical Image Processing Group at the University of Pennsylvania, where he contributes to advancing medical imaging technologies and their clinical applications.
Research topics
- Computer science
- Artificial intelligence
- Computer vision
- Medicine
- Computer graphics (images)
Selected publications
arXiv (Cornell University) · 2026-03-26
Preprint · Open access
Accurate lesion segmentation is essential in medical image analysis, yet most existing methods are designed for specific anatomical sites or imaging modalities, limiting their generalizability. Recent vision-language foundation models enable concept-driven segmentation in natural images, offering a promising direction for more flexible medical image analysis. However, concept-prompt-based lesion segmentation, particularly with the latest Segment Anything Model 3 (SAM3), remains underexplored. In this work, we present a systematic evaluation of SAM3 for lesion segmentation. We assess its performance using geometric bounding boxes and concept-based text and image prompts across multiple modalities, including multiparametric MRI, CT, ultrasound, dermoscopy, and endoscopy. To improve robustness, we incorporate additional prior knowledge, such as adjacent-slice predictions, multiparametric information, and prior annotations. We further compare different fine-tuning strategies, including partial module tuning, adapter-based methods, and full-model optimization. Experiments on 13 datasets covering 11 lesion types demonstrate that SAM3 achieves strong cross-modality generalization, reliable concept-driven segmentation, and accurate lesion delineation. These results highlight the potential of concept-based foundation models for scalable and practical medical image segmentation. Code and trained models will be released at: https://github.com/apple1986/lesion-sam3
2026-04-03
Article
Early relapse of diffuse large B-cell lymphoma (DLBCL) within 12 months of diagnosis confers a poor prognosis, underscoring the critical need for accurate pretreatment prediction. Current approaches relying on radiomics and clinical indicators are limited by scarce medical imaging data, time-consuming manual feature extraction (susceptible to human subjectivity), and insufficient use of multimodal deep learning frameworks. To address these challenges, we propose MMDLBCL, a novel multimodal transfer learning framework for pretreatment 12-month disease-free survival (DFS12) prediction. It leverages the biomedical pre-trained PMC-CLIP model to mitigate small-sample constraints, uses pretreatment FDG-PET/CT scans to generate coronal and sagittal Maximum Intensity Projection (MIP) images (focusing on metabolically active lesions), and converts key clinical covariates (e.g., International Prognostic Index (IPI) score, disease stage, serum lactate dehydrogenase (LDH) levels) into structured text embeddings. The architecture integrates an enhanced ResNet50 visual encoder (for dual-plane MIP feature extraction), a PubMedBERT text encoder (for clinical data processing), a Medical-Statistical Feature Fusion (MSFF) module (to quantify tumor heterogeneity via higher-order statistics), a Multi-Scaled Cooperative Attention (MSCA) module (to align imaging and clinical features), and a Cross-Modal Contrastive Head (CMCH) module (to optimize discriminative power via triple contrastive loss). Trained and validated on a retrospective cohort of 338 adult DLBCL patients, MM-DLBCL achieved an end-to-end AUC of 0.8316 ± 0.06 and accuracy of 0.8044 ± 0.05 for DFS12 prediction, demonstrating significant potential for multimodal transfer learning in automated DLBCL disease-free survival risk stratification.
2026-04-01
Article · Senior author
Exploiting DINOv3-Based Self-Supervised Features for Robust Few-Shot Medical Image Segmentation
ArXiv.org · 2026-01-12
Article · Open access
Deep learning-based automatic medical image segmentation plays a critical role in clinical diagnosis and treatment planning but remains challenging in few-shot scenarios due to the scarcity of annotated training data. Recently, self-supervised foundation models such as DINOv3, which were trained on large natural image datasets, have shown strong potential for dense feature extraction that can help with the few-shot learning challenge. Yet, their direct application to medical images is hindered by domain differences. In this work, we propose DINO-AugSeg, a novel framework that leverages DINOv3 features to address the few-shot medical image segmentation challenge. Specifically, we introduce WT-Aug, a wavelet-based feature-level augmentation module that enriches the diversity of DINOv3-extracted features by perturbing frequency components, and CG-Fuse, a contextual information-guided fusion module that exploits cross-attention to integrate semantic-rich low-resolution features with spatially detailed high-resolution features. Extensive experiments on six public benchmarks spanning five imaging modalities, including MRI, CT, ultrasound, endoscopy, and dermoscopy, demonstrate that DINO-AugSeg consistently outperforms existing methods under limited-sample conditions. The results highlight the effectiveness of incorporating wavelet-domain augmentation and contextual fusion for robust feature representation, suggesting DINO-AugSeg as a promising direction for advancing few-shot medical image segmentation. Code and data will be made available on https://github.com/apple1986/DINO-AugSeg.
2026-02-12
Article
2026-02-13
Anatomy orientation‑guided auto‑segmentation of bones from hip 3T magnetic resonance imaging
2026-02-12
Article
2026-04-01
Article · Senior author
We present a technical framework for integrating breathing frequency into regional volumetric analysis of free-breathing 4D dynamic MRI in pediatric populations. Regional quantitative MRI endpoints have traditionally focused on tidal volume amplitudes but do not capture breathing rate, which varies substantially with age and respiratory state. We describe the implementation of frequency-normalized regional metrics (RR/TV ratios) alongside standard regional volumetric measures, using Z-score standardization against age-matched reference data. This approach is demonstrated in 47 pediatric thoracic insufficiency syndrome (TIS) patients with paired pre/post-VEPTR surgery scans, compared to 200 healthy controls. Technical components include respiratory rate derived during 4D reconstruction via the OFx method, regional volumetry of lung compartments, and reference space normalization using the Virtual Growing Child (VGC) normative database. We show that frequency-weighted outcomes exhibit stronger developmental trends (Spearman ρ = -0.54 for lung RR/TV vs. +0.40 to +0.44 for lung volumes) and larger effect sizes when tracking surgical changes (r = 0.54–0.64 for RR/TV vs. r = 0.66–0.83 for TV). The framework can be integrated into existing 4D-MRI pipelines with minimal overhead and provides a more complete characterization of breathing mechanics in pediatric chest wall disorders.
Recent grants
NIH · $2.3M · 2021–2026
Virtual growing child 5-dimensional functional models for treating respiratory anomalies
NIH · $2.3M · 2020–2024
NIH · $600k · 1995
Automated Object Contouring Methods & Software for Radiotherapy Planning
NIH · $1.9M · 2016–2021
NIH · $461k · 2018
Frequent coauthors
- Drew A. Torigian · University of Pennsylvania · 162 shared
- Yubing Tong · California University of Pennsylvania · 132 shared
- Dewey Odhner · California University of Pennsylvania · 100 shared
- Robert I. Grossman · Hospital of the University of Pennsylvania · 49 shared
- Punam K. Saha · University of Iowa · 48 shared
- Joseph M. McDonough · Children's Hospital of Philadelphia · 48 shared
- George J. Grevera · Saint Joseph's University · 48 shared
- Caiyun Wu · First Affiliated Hospital of Anhui Medical University · 46 shared