Matthew Turk

Professor Emeritus · Verified

University of Illinois Urbana-Champaign · Interdisciplinary Computing and the Arts

Active 1985–2026

h-index: 42
Citations: 9.9k
Papers: 237 · 55 last 5y
Funding: $3.6M

About

The research in our lab uses advanced data science techniques to understand how water, plants, geology and climate interact in a tightly coupled system – and how humans are changing this system.

Research topics

  • Astrophysics
  • Physics
  • Astronomy

Selected publications

  • libyt: An In Situ Interface Connecting Simulations with yt, Python, and Jupyter Workflows

    The Astrophysical Journal Supplement Series · 2026-05-01

    article · Open access · Senior author

    In the exascale computing era, handling and analyzing massive data sets have become extremely challenging. In situ analysis, which processes data during simulation runtime and bypasses costly intermediate disk input and output steps, offers a promising solution. We present libyt (https://github.com/yt-project/libyt), an open-source C library that enables astrophysical simulations to analyze and visualize data in parallel computation with yt or other Python packages. libyt can invoke Python routines automatically or provide interactive entry points via a Python prompt or a Jupyter Notebook. It requires minimal intervention in researchers’ workflows, allowing users to reuse job submission scripts and Python routines. We describe libyt’s architecture for parallel computing in high-performance computing environments, including its bidirectional connection between simulation codes and Python, and its integration into the Jupyter ecosystem. We detail its methods for reading patch-based adaptive mesh refinement simulations and handling in-memory data with minimal overhead, and procedures for yielding data when requested by Python. We describe how libyt maps simulation data to yt front ends, allowing postprocessing scripts to be converted into in situ analysis with just two lines of change. We document libyt’s application programming interface (API) and demonstrate its integration into two astrophysical simulation codes, GAMER and Enzo, using examples including core-collapse supernovae, isolated dwarf galaxies, fuzzy dark matter, the Sod shock tube test, Kelvin–Helmholtz instability, and the AGORA galaxy simulation. Finally, we discuss libyt’s performance, limitations related to data redistribution, extensibility, architecture, and comparisons with traditional postprocessing approaches.

  • Now More Than Ever, Foundational AI Research and Infrastructure Depends on the Federal Government

    ArXiv.org · 2025-06-17

    preprint · Open access

    Leadership in the field of AI is vital for our nation's economy and security. Maintaining this leadership requires investments by the federal government. The federal investment in foundation AI research is essential for U.S. leadership in the field. Providing accessible AI infrastructure will benefit everyone. Now is the time to increase the federal support, which will be complementary to, and help drive, the nation's high-tech industry investments.

  • Spezi Data Pipeline: Streamlining FHIR-based Interoperable Digital Health Data Workflows

    ArXiv.org · 2025-09-17

    preprint · Open access

    The increasing adoption of digital health technologies has amplified the need for robust, interoperable solutions to manage complex healthcare data. We present the Spezi Data Pipeline, an open-source Python toolkit designed to streamline the analysis of digital health data, from secure access and retrieval to processing, visualization, and export. The Pipeline is integrated into the larger Stanford Spezi open-source ecosystem for developing research and translational digital health software systems. Leveraging HL7 FHIR-based data representations, the pipeline enables standardized handling of diverse data types--including sensor-derived observations, ECG recordings, and clinical questionnaires--across research and clinical environments. We detail the modular system architecture and demonstrate its application using real-world data from the PAWS at Stanford University, in which the pipeline facilitated efficient extraction, transformation, and clinician-driven review of Apple Watch ECG data, supporting annotation and comparative analysis alongside traditional monitors. By reducing the need for bespoke development and enhancing workflow efficiency, the Spezi Data Pipeline advances the scalability and interoperability of digital health research, ultimately supporting improved care delivery and patient outcomes.

  • Developing Library and Data Storytelling Toolkits: Scenarios and Personas

    Lecture notes in computer science · 2024-01-01

    book-chapter · Senior author

  • Why does the Milky Way have a metallicity floor?

    Monthly Notices of the Royal Astronomical Society · 2024-07-19 · 4 citations

    article · Open access

    The prevalence of light element enhancement in the most metal-poor stars is potentially an indication that the Milky Way has a metallicity floor for star formation around $\sim 10^{-3.5}$ Z$_{\odot }$. We propose that this metallicity floor has its origins in metal-enriched star formation in the minihaloes present during the Galaxy’s initial formation. To arrive at this conclusion, we analyse a cosmological radiation hydrodynamics simulation that follows the concurrent evolution of multiple Population III star-forming minihaloes. The main driver for the central gas within minihaloes is the steady increase in hydrostatic pressure as the haloes grow. We incorporate this insight into a hybrid one-zone model that switches between pressure-confined and modified free-fall modes to evolve the gas density with time according to the ratio of the free-fall and sound-crossing time-scales. This model is able to accurately reproduce the density and chemo-thermal evolution of the gas in each of the simulated minihaloes up to the point of runaway collapse. We then use this model to investigate how the gas responds to the absence of H$_{2}$. Without metals, the central gas becomes increasingly stable against collapse as it grows to the atomic cooling limit. When metals are present in the halo at a level of $\sim 10^{-3.7}$ Z$_{\odot }$, however, the gas is able to achieve gravitational instability while still in the minihalo regime. Thus, we conclude that the Galaxy’s metallicity floor is set by the balance within minihaloes of gas-phase metal cooling and the radiation background associated with its early formation environment.

  • Libyt: A Tool for Parallel In Situ Analysis with yt, Python, and Jupyter

    2024-05-15 · 3 citations

    article · Open access · Senior author

    In the era of extreme-scale computing, large-scale data storage and analysis have become more critical and challenging. For postprocessing, the simulation first needs to dump snapshots on a hard disk before processing any data. This becomes a bottleneck for high spatial and temporal resolution simulation. In situ analysis provides a viable solution for analyzing extreme scale simulations by processing data in memory, which skips the step of storing data on disk. We present libyt, an open-source C library that allows researchers to analyze and visualize data using yt or other Python packages in parallel computing during simulation runtime. We describe the code method for connecting simulation runtime data to Python, handling data transition and redistribution between Python and simulation processes with minimal memory overhead, and supporting interactive Python prompt and Jupyter Notebook for users to probe the ongoing simulation data at the current time step. We demonstrate how it solves the problem of visualizing large-scale astrophysical simulations, improving disk usage efficiency, and monitoring simulations closely. We conclude it with discussions and compare libyt to post-processing.

  • Teaching data storytelling as data literacy

    Information and Learning Sciences · 2024-04-29 · 6 citations

    article · Senior author

    Purpose: Data storytelling courses position students as agents in creating stories interpreted from data about a social problem or social justice issue. The purpose of this study is to explore two research questions: What themes characterized students’ iterative development of data story topics? Looking back at six years of iterative feedback, what categories of data literacy pedagogy did instructors engage for these themes? Design/methodology/approach: This project examines six years of data storytelling final projects using thematic analysis and three years of instructor feedback. Ten themes in final projects align with patterns in feedback. Reflections on pedagogical approaches to students’ topic development suggest extending data literacy pedagogy categories – formal, personal and folk (Pangrazio and Sefton-Green, 2020). Findings: Data storytelling can develop students’ abilities to move from being consumers to creators of data and interpretations. The specific topic of personal data exposure or risk has presented some challenges for data literacy instruction (Bowler et al., 2017). What “personal” means in terms of data should be defined more broadly. Extending the data literacy pedagogy categories of formal, personal and folk (Pangrazio and Sefton-Green, 2020) could more effectively center social justice in data literacy instruction. Practical implications: Implications for practice include positioning students as producers of data interpretation, such as role-playing data analysis or decision-making scenarios. Social implications: Data storytelling has the potential to address current challenges in data literacy pedagogy and in teaching critical data literacy. Originality/value: Course descriptions provide a template for future data literacy pedagogy involving data storytelling, and findings suggest implications for expanding definitions and applications of personal and folk data literacies.

  • Why does the Milky Way have a metallicity floor?

    arXiv (Cornell University) · 2024-06-12

    preprint · Open access

    The prevalence of light element enhancement in the most metal-poor stars is potentially an indication that the Milky Way has a metallicity floor for star formation around $\sim$10$^{-3.5}$ Z$_{\odot}$. We propose that this metallicity floor has its origins in metal-enriched star formation in the minihalos present during the Galaxy's initial formation. To arrive at this conclusion, we analyze a cosmological radiation hydrodynamics simulation that follows the concurrent evolution of multiple Population III star-forming minihalos. The main driver for the central gas within minihalos is the steady increase in hydrostatic pressure as the halos grow. We incorporate this insight into a hybrid one-zone model that switches between pressure-confined and modified free-fall modes to evolve the gas density with time according to the ratio of the free-fall and sound-crossing timescales. This model is able to accurately reproduce the density and chemo-thermal evolution of the gas in each of the simulated minihalos up to the point of runaway collapse. We then use this model to investigate how the gas responds to the absence of H$_{2}$. Without metals, the central gas becomes increasingly stable against collapse as it grows to the atomic cooling limit. When metals are present in the halo at a level of $\sim$10$^{-3.7}$ Z$_{\odot}$, however, the gas is able to achieve gravitational instability while still in the minihalo regime. Thus, we conclude that the Galaxy's metallicity floor is set by the balance within minihalos of gas-phase metal cooling and the radiation background associated with its early formation environment.

  • Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement

    arXiv (Cornell University) · 2024-04-18

    preprint · Open access · Senior author

    We propose a novel approach to mitigate biases in computer vision models by utilizing counterfactual generation and fine-tuning. While counterfactuals have been used to analyze and address biases in DNN models, the counterfactuals themselves are often generated from biased generative models, which can introduce additional biases or spurious correlations. To address this issue, we propose using adversarial images, that is images that deceive a deep neural network but not humans, as counterfactuals for fair model training. Our approach leverages a curriculum learning framework combined with a fine-grained adversarial loss to fine-tune the model using adversarial examples. By incorporating adversarial images into the training data, we aim to prevent biases from propagating through the pipeline. We validate our approach through both qualitative and quantitative assessments, demonstrating improved bias mitigation and accuracy compared to existing methods. Qualitatively, our results indicate that post-training, the decisions made by the model are less dependent on the sensitive attribute and our model better disentangles the relationship between sensitive attributes and classification variables.

  • CAVLI - Using image associations to produce local concept-based explanations

    2023-06-01 · 4 citations

    article · Senior author

    While explainability is becoming increasingly crucial in computer vision and machine learning, producing explanations that can link decisions made by deep neural networks to concepts that are easily understood by humans still remains a challenge. To address this challenge, we propose a framework that produces local concept-based explanations for the classification decisions made by a deep neural network. Our framework is based on the intuition that if there is a high overlap between the regions of the image that are associated with a human-defined concept and regions of the image that are useful for decision-making, then the decision is highly dependent on the concept. Our proposed CAVLI framework combines a global approach (TCAV) with a local approach (LIME). To test the effectiveness of the approach, we conducted experiments on both the ImageNet and CelebA datasets. These experiments validate the ability of our framework to quantify the dependence of individual decisions on predefined concepts. By providing local concept-based explanations, our framework has the potential to improve the transparency and interpretability of deep neural networks in a variety of applications.

Frequent coauthors

  • Tom Abel

    412 shared
  • John Wise

    392 shared
  • Britton Smith

    348 shared
  • G. Desvignes

    133 shared
  • Ue‐Li Pen

    99 shared
  • Shiro Ikeda

    The University of Tokyo

    94 shared
  • Jonathan Weintroub

    Center for Astrophysics Harvard & Smithsonian

    94 shared
  • Jordy Davelaar

    79 shared

Awards & honors

  • John Simon Guggenheim Fellowship in Visual Arts (2016)
  • Making Visible the Invisible (permanent installation at Seat…
  • Creative Capital Foundation support
  • Daniel Langlois Foundation for the Arts, Science and Technol…
  • Canada Council for the Arts support