Alexander Schwing

Associate Professor, Electrical and Computer Engineering

University of Illinois Urbana-Champaign · Computer Science

Active 2007–2026

h-index: 54
Citations: 12.3k
Papers: 302 (141 in the last 5 years)

About

Alexander Schwing is an Associate Professor in the Electrical and Computer Engineering department at the University of Illinois Urbana-Champaign. His research focuses on machine learning and computer vision. He has taught courses such as Machine Learning and Pattern Recognition. His work includes developing advanced microscopy techniques to improve in vitro fertilization (IVF) processes and exploring computer vision approaches aimed at advancing AI development and democratizing new solutions. Schwing has received recognition for his research and collaborates on interdisciplinary projects, including joint research initiatives with colleagues from the Hebrew University of Jerusalem to accelerate economic development through innovative technologies.

Research signals

Five dimensions sourced from public faculty and publication signals.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Computer vision
  • Geography
  • Machine Learning
  • Multimedia
  • Engineering
  • Pure mathematics
  • Programming language
  • Mathematics
  • Algorithm

Selected publications

  • Crop Residue Coverage Estimation using Classification from Multi-Season On-Machine Imagery

    SSRN Electronic Journal · 2026-01-01

    preprint · Open access
  • Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective

    ArXiv.org · 2025-03-13

    preprint · Open access · Senior author

    Classifier-free guidance has become a staple for conditional generation with denoising diffusion models. However, a comprehensive understanding of classifier-free guidance is still missing. In this work, we carry out an empirical study to provide a fresh perspective on classifier-free guidance. Concretely, instead of solely focusing on classifier-free guidance, we trace back to the root, i.e., classifier guidance, pinpoint the key assumption for the derivation, and conduct a systematic study to understand the role of the classifier. On 1D data, we find that both classifier guidance and classifier-free guidance achieve conditional generation by pushing the denoising diffusion trajectories away from decision boundaries, i.e., areas where conditional information is usually entangled and is hard to learn. To validate this classifier-centric perspective on high-dimensional data, we assess whether a flow-matching postprocessing step that is designed to narrow the gap between a pre-trained diffusion model's learned distribution and the real data distribution, especially near decision boundaries, can improve the performance. Experiments on various datasets verify our classifier-centric understanding.

  • Variational Rectified Flow Matching

    ArXiv.org · 2025-02-13

    preprint · Open access · Senior author

    We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity vector-fields. At inference time, classic rectified flow matching 'moves' samples from a source distribution to the target distribution by solving an ordinary differential equation via integration along a velocity vector-field. At training time, the velocity vector-field is learnt by linearly interpolating between coupled samples, one drawn from the source and one drawn from the target distribution randomly. This leads to "ground-truth" velocity vector-fields that point in different directions at the same location, i.e., the velocity vector-fields are multi-modal/ambiguous. However, since training uses a standard mean-squared-error loss, the learnt velocity vector-field averages "ground-truth" directions and isn't multi-modal. In contrast, variational rectified flow matching learns and samples from multi-modal flow directions. We show on synthetic data, MNIST, CIFAR-10, and ImageNet that variational rectified flow matching leads to compelling results.

  • MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds

    2025-06-10 · 11 citations

    article

    Recent sparse multi-view scene reconstruction advances like DUSt3R and MASt3R no longer require camera calibration and camera pose estimation. However, they only process a pair of views at a time to infer pixel-aligned pointmaps. When dealing with more than two views, a combinatorial number of error-prone pairwise reconstructions are usually followed by an expensive global optimization, which often fails to rectify the pairwise reconstruction errors. To handle more views, reduce errors, and improve inference time, we propose the fast single-stage feed-forward network MV-DUSt3R. At its core are multi-view decoder blocks which exchange information across any number of views while considering one reference view. To make our method robust to reference view selection, we further propose MV-DUSt3R+, which employs cross-reference-view blocks to fuse information across different reference view choices. To further enable novel view synthesis, we extend both by adding and jointly training Gaussian splatting heads. Experiments on multi-view stereo reconstruction, multi-view pose estimation, and novel view synthesis confirm that our methods improve significantly upon prior art. Code released.

  • The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

    2025-10-19

    article · Open access · Senior author

    Minibatch optimal transport coupling straightens paths in unconditional flow matching. This leads to computationally less demanding inference as fewer integration steps and less complex numerical solvers can be employed when numerically solving an ordinary differential equation at test time. However, in the conditional setting, minibatch optimal transport falls short. This is because the default optimal transport mapping disregards conditions, resulting in a conditionally skewed prior distribution during training. In contrast, at test time, we have no access to the skewed prior, and instead sample from the full, unbiased prior distribution. This gap between training and testing leads to a subpar performance. To bridge this gap, we propose conditional optimal transport C^2OT that adds a conditional weighting term in the cost matrix when computing the optimal transport assignment. Experiments demonstrate that this simple fix works with both discrete and continuous conditions in 8gaussians-to-moons, CIFAR-10, ImageNet-32x32, and ImageNet-256x256. Our method performs better overall compared to the existing baselines across different function evaluation budgets. Code is available at https://hkchengrex.github.io/C2OT

  • 3D-Fixup: Advancing Photo Editing with 3D Priors

    2025-07-23 · 1 citation

    preprint · Open access

    Despite significant advances in modeling image priors via diffusion models, 3D-aware image editing remains challenging, in part because the object is only specified via a single image. To tackle this challenge, we propose 3D-Fixup, a new framework for editing 2D images guided by learned 3D priors. The framework supports difficult editing situations such as object translation and 3D rotation. To achieve this, we leverage a training-based approach that harnesses the generative power of diffusion models. As video data naturally encodes real-world physical dynamics, we turn to video data for generating training data pairs, i.e., a source and a target frame. Rather than relying solely on a single trained model to infer transformations between source and target frames, we incorporate 3D guidance from an Image-to-3D model, which bridges this challenging task by explicitly projecting 2D information into 3D space. We design a data generation pipeline to ensure high-quality 3D guidance throughout training. Results show that by integrating these 3D priors, 3D-Fixup effectively supports complex, identity-coherent 3D-aware edits, achieving high-quality results and advancing the application of diffusion models in realistic image manipulation. The code is provided at https://3dfixup.github.io/

  • MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

    2025-06-10 · 13 citations

    article

    We propose to synthesize high-quality and synchronized audio, given video and optional text conditions, using a novel multimodal joint training framework (MMAudio). In contrast to single-modality training conditioned on (limited) video data only, MMAudio is jointly trained with larger-scale, readily available text-audio data to learn to generate semantically aligned high-quality audio samples. Additionally, we improve audio-visual synchrony with a conditional synchronization module that aligns video conditions with audio latents at the frame level. Trained with a flow matching objective, MMAudio achieves new video-to-audio state-of-the-art among public models in terms of audio quality, semantic alignment, and audio-visual synchronization, while having a low inference time (1.23s to generate an 8s clip) and just 157M parameters. MMAudio also achieves surprisingly competitive performance in text-to-audio generation, showing that joint training does not hinder single-modality performance. Code, models, and demo are available at: hkchengrex.github.io/MMAudio.

  • NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses

    ArXiv.org · 2025-11-20

    preprint · Open access

    We tackle the task of recovering an animatable 3D human avatar from a single or a sparse set of images. For this task, beyond a set of images, many prior state-of-the-art methods use accurate "ground-truth" camera poses and human poses as input to guide reconstruction at test-time. We show that pose-dependent reconstruction degrades results significantly if pose estimates are noisy. To overcome this, we introduce NoPo-Avatar, which reconstructs avatars solely from images, without any pose input. By removing the dependence of test-time reconstruction on human poses, NoPo-Avatar is not affected by noisy human pose estimates, making it more widely applicable. Experiments on challenging THuman2.0, XHuman, and HuGe100K data show that NoPo-Avatar outperforms existing baselines in practical settings (without ground-truth poses) and delivers comparable results in lab settings (with ground-truth poses).

  • LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh

    ArXiv.org · 2025-02-13

    preprint · Open access

    Generalizable rendering of an animatable human avatar from sparse inputs relies on data priors and inductive biases extracted from training on large data to avoid scene-specific optimization and to enable fast reconstruction. This raises two main challenges: First, unlike iterative gradient-based adjustment in scene-specific optimization, generalizable methods must reconstruct the human shape representation in a single pass at inference time. Second, rendering is preferably computationally efficient yet of high resolution. To address both challenges we augment the recently proposed dual shape representation, which combines the benefits of a mesh and Gaussian points, in two ways. To improve reconstruction, we propose an iterative feedback update framework, which successively improves the canonical human shape representation during reconstruction. To achieve computationally efficient yet high-resolution rendering, we study a coupled-multi-resolution Gaussians-on-Mesh representation. We evaluate the proposed approach on the challenging THuman2.0, XHuman and AIST++ data. Our approach reconstructs an animatable representation from sparse inputs in less than 1s, renders views with 95.1 FPS at 1024 × 1024, and achieves PSNR/LPIPS*/FID of 24.65/110.82/51.27 on THuman2.0, outperforming the state-of-the-art in rendering quality.

  • Relocate: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

    2025-06-10 · 1 citation

    article

    We present Relocate, a simple training-free baseline designed to perform the challenging task of visual query localization in long videos. To eliminate the need for task-specific training and efficiently handle long videos, Relocate leverages a region-based representation derived from pretrained vision models. At a high level, it follows the classic object localization approach: (1) identify all objects in each video frame, (2) compare the objects with the given query and select the most similar ones, and (3) perform bidirectional tracking to get a spatio-temporal response. However, we propose some key enhancements to handle small objects, cluttered scenes, partial visibility, and varying appearances. Notably, we refine the selected objects for accurate localization and generate additional visual queries to capture visual variations. We evaluate Relocate on the challenging Ego4D Visual Query 2D Localization dataset, establishing a new baseline that outperforms prior task-specific methods by 49% (relative improvement) in spatio-temporal average precision.

Frequent coauthors

  • Raquel Urtasun

    31 shared
  • Unnat Jain

    Carnegie Mellon University

    30 shared
  • Raymond A. Yeh

    30 shared
  • Svetlana Lazebnik

    24 shared
  • Zhongzheng Ren

    24 shared
  • Tamir Hazan

    22 shared
  • Yuan-Ting Hu

    Heilongjiang Electric Power Workers University

    19 shared
  • Colin Graber

    18 shared

Labs

  • Siebel School of Computing and Data Science · PI

Education

  • Ph.D., Computer Science

    University of Illinois at Urbana-Champaign

    2005
  • M.S., Computer Science

    University of Illinois at Urbana-Champaign

    2001
  • B.S., Computer Science

    University of Illinois at Urbana-Champaign

    1999

Awards & honors

  • NIH awards Illinois $2.5M to improve IVF with advanced micro…