Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Mattia Fazzini

Verified

University of Minnesota · Computer Science and Engineering

Active 2012–2026

h-index: 14
Citations: 545
Papers: 40 (25 in the last 5 years)

About

Mattia Fazzini is an Assistant Professor in the Department of Computer Science & Engineering at the University of Minnesota, which he joined in 2019. His research aims to improve software quality by developing techniques for software testing and maintenance; he also investigates software attacks and ways to secure software systems. He holds a Ph.D. in Computer Science from the Georgia Institute of Technology (2019) and multiple master's degrees in related fields from institutions including the University of Illinois at Chicago, Politecnico di Milano, and Politecnico di Torino. He has served as a panelist at the National Science Foundation and as a program committee member for the IEEE/ACM International Conference on Automated Software Engineering. His awards include the IEEE TCSE Distinguished Paper Award and the Facebook Testing and Verification Research Award.

Research topics

  • Computer Science
  • World Wide Web
  • Data science
  • Artificial Intelligence
  • Software engineering
  • Operating system
  • Human–computer interaction

Selected publications

  • Human-Agent versus Human Pull Requests: A Testing-Focused Characterization and Comparison

    ArXiv.org · 2026-01-29

    article · Open access · Senior author

    AI-based coding agents are increasingly integrated into software development workflows, collaborating with developers to create pull requests (PRs). Despite their growing adoption, the role of human-agent collaboration in software testing remains poorly understood. This paper presents an empirical study of 6,582 human-agent PRs (HAPRs) and 3,122 human PRs (HPRs) from the AIDev dataset. We compare HAPRs and HPRs along three dimensions: (i) testing frequency and extent, (ii) types of testing-related changes (code-and-test co-evolution vs. test-focused), and (iii) testing quality, measured by test smells. Our findings reveal that, although the likelihood of including tests is comparable (42.9% for HAPRs vs. 40.0% for HPRs), HAPRs exhibit a larger extent of testing, nearly doubling the test-to-source line ratio found in HPRs. While test-focused task distributions are comparable, HAPRs are more likely to add new tests during co-evolution (OR=1.79), whereas HPRs prioritize modifying existing tests. Finally, although some test smell categories differ statistically, negligible effect sizes suggest no meaningful differences in quality. These insights provide the first characterization of how human-agent collaboration shapes testing practices.

  • CUDABeaver: Benchmarking LLM-Based Automated CUDA Debugging

    arXiv (Cornell University) · 2026-05-08

    preprint · Open access

    Debugging CUDA programs has long been challenging because failures often arise from subtle interactions among hardware behavior, compiler decisions, memory hierarchy, and asynchronous execution. More importantly, with the rapid expansion of GPU usage across scientific computing, machine learning, graphics, and systems workloads, CUDA debugging has become more challenging than ever. Current evaluations of LLM-based CUDA programming largely miss this setting: a model can pass correctness tests with repair by degeneration, simplifying the CUDA code into a safer but slower program that abandons the original optimization structure. We introduce CUDABEAVER, a benchmark for CUDA debugging from real failing workspaces produced during LLM-based CUDA generation. Each task provides the broken candidate, native build/test commands, raw error evidence, and a single editable file. CUDABEAVER evaluates whether a fixer truly repairs the failing CUDA code or merely finds a slower test-passing replacement, reporting results by failure category, debugging trajectory, stagnation mode, and performance preservation. We further propose pass@k(M,C,A), a protocol-conditional CUDA debugging metric by making the fixer M, corpus C, and protocol axes A explicit. Using this metric across 213 tasks and seven frontier LLMs, we show that protocol-aware evaluation gives a more faithful view of CUDA debugging ability: when performance-loss tolerance is high, fixers appear much stronger, but even a minor stricter performance requirement can sharply reduce measured success, shifting scores by up to 40 percentage points.

  • Replication Package for "Improving LLM-Driven Test Generation by Learning from Mocking Information" (AIST 2026)

    Zenodo (CERN European Organization for Nuclear Research) · 2026-04-10

    dataset · Open access

    Improving LLM-Driven Test Generation by Learning from Mocking Information (AIST 2026): Replication Package

    Authors: Jamie Lee*, Flynn Teh*, Hengcheng Zhu†, Mengzhen Li‡, Mattia Fazzini‡, Valerio Terragni*

    Affiliations:
    * University of Auckland, Auckland, New Zealand
    † The Hong Kong University of Science and Technology, Hong Kong SAR
    ‡ University of Minnesota, Minneapolis, USA

    This repository is the replication package for the paper “Improving LLM-Driven Test Generation by Learning from Mocking Information,” presented at AIST 2026. It contains the MOCKMILL tool implementation, raw experimental results, subject programs, and analysis scripts required to reproduce the study. MOCKMILL is a Java-based approach for unit test generation with large language models that leverages mocking information to improve test quality and coverage. The package includes materials for tool setup, experiment execution, and result analysis, and is intended to support transparency, reproducibility, and follow-up research in AI-based software testing.

    Citation (BibTeX):

    @inproceedings{lee2026mockmill,
      title={Improving LLM-Driven Test Generation by Learning from Mocking Information},
      author={Lee, Jamie and Teh, Flynn and Zhu, Hengcheng and Li, Mengzhen and Fazzini, Mattia and Terragni, Valerio},
      booktitle={Proceedings of the 19th International Conference on Software Testing, Verification and Validation Workshops (AIST)},
      year={2026},
      organization={IEEE}
    }

  • Human-Agent versus Human Pull Requests: A Testing-Focused Characterization and Comparison

    Open MIND · 2026-01-29

    preprint · Senior author

  • CUDABeaver: Benchmarking LLM-Based Automated CUDA Debugging

    ArXiv.org · 2026-05-08

    article · Open access

  • AndroT: A dataset of Android Apps with Tests

    Zenodo (CERN European Organization for Nuclear Research) · 2026-01-23

    dataset · Open access · Senior author

  • Characterizing Installation- and Run-time Compatibility Issues in Android Benign Apps and Malware

    ACM Transactions on Software Engineering and Methodology · 2025-03-25 · 3 citations

    article · Open access

    The Android ecosystem has experienced rapid growth, resulting in a diverse range of platforms and devices. This expansion has also brought about compatibility issues that negatively impact user experiences and hinder app development productivity. Existing relevant studies are focused on and limited to the “static” sense of those issues (in terms of potentialities and proneness), while only addressing compatibility issues that possibly occur during app executions. In this article, we present an extensive and longitudinal study on app compatibility issues that are disparate from yet complementary to prior studies, characterizing the incompatibilities based on actual, exercised observations and evidence at both installation and run-time. With a dataset of 74,545 benign apps and 56,919 malicious apps over a span of 12 years (2010 through 2021) and 10 Android versions, we extensively examine the prevalence and symptoms/effects and causes of, as well as the contributing factors to, installation-time and run-time compatibility issues. Our study reveals 12 major novel findings regarding Android app incompatibilities. First (Findings 1, 2), installation-time incompatibilities persisted significantly over the 12 years, even more so in malware than benign apps. Second (Findings 7, 8), run-time compatibility issues were also seen persistently over time but only on specific Android platforms (such as API 26, 27, etc.) and much less by malware than benign apps. Third (Findings 5, 6, 11, 12), there is a significant (moderate/stronger) correlation between an app’s specified minSdkVersion and its incompatibilities (over all symptoms and/or with respect to one of its dominating symptoms), with stronger correlations seen in malware than in benign apps, for both installation-time and run-time incompatibilities. Similar observations hold (although with much stronger correlation in absolute terms) when considering, instead of the minSdkVersion itself, the gap between the app’s minSdkVersion and the SDK API level of the platform the app is installed to or runs on. Last (Findings 3, 4, 9, 10), installation-time incompatibilities are primarily caused by the utilization of architecture-incompatible native libraries within apps, while run-time incompatibilities are mainly attributed to API changes during the evolution of the Android SDK; the symptoms of run-time failures seen by malware are much more diverse than by benign apps. In addition to these insights, we provide practical recommendations for both app developers and end users on how to effectively address compatibility issues in Android apps, as well as how to devise effective defenses against malware from the compatibility perspectives.

  • Human-Agent versus Human Pull Requests: A Testing-Focused Characterization and Comparison

    Zenodo (CERN European Organization for Nuclear Research) · 2025-12-31

    dataset · Open access · Senior author

    Replication package

  • X2J Object Reconstruction Replication Package

    Zenodo (CERN European Organization for Nuclear Research) · 2025-12-22

    other · Open access

    This replication package provides complete implementations to fully reproduce the experiments from our paper Reconstructing Objects for Software Testing.
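
The CUDABeaver abstract above notes that measured debugging success depends strongly on how much performance loss is tolerated. As a rough illustration only, the sketch below uses the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), and counts an attempt as a success only if it passes the tests and stays within a slowdown budget. The exact definition of pass@k(M,C,A) is not given on this page, so the function names, the slowdown threshold, and the sample attempts are all hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard unbiased pass@k estimate from n attempts with c successes."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains at least one success.
    return 1.0 - comb(n - c, k) / comb(n, k)


def count_successes(attempts, max_slowdown: float) -> int:
    """Count attempts that pass the tests AND stay within the slowdown budget.

    Each attempt is (tests_passed, slowdown), where slowdown is the fixed
    program's runtime divided by a reference runtime (hypothetical measure).
    """
    return sum(1 for passed, slowdown in attempts
               if passed and slowdown <= max_slowdown)


# Hypothetical attempts by one fixer on one task (not data from the paper).
attempts = [(True, 1.02), (True, 3.5), (False, 1.0), (True, 1.8), (False, 2.0)]
n = len(attempts)

# Loose protocol: tolerate up to a 4x slowdown.
loose = pass_at_k(n, count_successes(attempts, max_slowdown=4.0), k=2)
# Strict protocol: tolerate at most a 10% slowdown.
strict = pass_at_k(n, count_successes(attempts, max_slowdown=1.1), k=2)

print(f"loose  pass@2 = {loose:.2f}")   # prints "loose  pass@2 = 0.90"
print(f"strict pass@2 = {strict:.2f}")  # prints "strict pass@2 = 0.40"
```

In this toy setup the same five attempts score 0.90 under the loose budget and 0.40 under the strict one, which mirrors the abstract's point that a stricter performance requirement can sharply reduce measured success.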

Frequent coauthors

  • Alessandra Gorla

    IMDEA Software

    27 shared
  • Konstantin Kuznetsov

    25 shared
  • Rui Abreu

    25 shared
  • Daniel Dominguez Alvarez

    University of Minnesota

    25 shared
  • Alessandro Orso

    Georgia Institute of Technology

    22 shared
  • John Grundy

    10 shared
  • Tyler Wendland

    University of Minnesota

    8 shared
  • Kevin Moran

    University of Central Florida

    8 shared

Awards & honors

  • Moka: Improving App Testing with Automated Mocking Fazzini,…