Daniel Apley

Professor of Industrial Engineering and Management Sciences · Verified

Northwestern University · Chemical Engineering

Active 1994–2026

h-index: 33
Citations: 6.2k
Papers: 194 (52 in the last 5 years)
Funding: $2.0M
About

Daniel Apley is a Professor of Industrial Engineering and Management Sciences at Northwestern University. His research interests include the statistical modeling and analysis of engineering, industrial, and enterprise systems; machine learning and predictive analytics; quality engineering and six sigma; and manufacturing process diagnosis and control. He holds a PhD in Mechanical Engineering, as well as MS degrees in Electrical and Mechanical Engineering, and a BS in Mechanical Engineering, all from the University of Michigan, Ann Arbor. His work involves developing frameworks for supervised and unsupervised segmentation and classification of materials microstructure images, optimizing hyperparameters of supervised learning algorithms, and creating interpretable neural network architectures for function visualization. He has received recognition for his contributions, including the Wilcoxon Prize for best practical application paper in Technometrics.

Research topics

  • Computer Science
  • Machine Learning
  • Artificial Intelligence
  • Mathematics
  • Chemical physics
  • Physics
  • Engineering
  • Statistics
  • Materials science
  • Quantum mechanics
  • Database
  • Mechanical engineering
  • Condensed matter physics
  • Mathematical optimization
  • Chemistry

Selected publications

  • SELDON: Supernova Explosions Learned by Deep ODE Networks

    arXiv (Cornell University) · 2026-03-04

    article · Open access

    The discovery rate of optical transients will explode to 10 million public alerts per night once the Vera C. Rubin Observatory's Legacy Survey of Space and Time comes online, overwhelming the traditional physics-based inference pipelines. A continuous-time forecasting AI model is of interest because it can deliver millisecond-scale inference for thousands of objects per day, whereas legacy MCMC codes need hours per object. In this paper, we propose SELDON, a new continuous-time variational autoencoder for panels of sparse and irregularly time-sampled (gappy) astrophysical light curves that are nonstationary, heteroscedastic, and inherently dependent. SELDON combines a masked GRU-ODE encoder with a latent neural ODE propagator and an interpretable Gaussian-basis decoder. The encoder learns to summarize panels of imbalanced and correlated data even when only a handful of points are observed. The neural ODE then integrates this hidden state forward in continuous time, extrapolating to future unseen epochs. This extrapolated time series is further encoded by deep sets to a latent distribution that is decoded to a weighted sum of Gaussian basis functions, the parameters of which are physically meaningful. Such parameters (e.g., rise time, decay rate, peak flux) directly drive downstream prioritization of spectroscopic follow-up for astrophysical surveys. Beyond astronomy, the architecture of SELDON offers a generic recipe for interpretable and continuous-time sequence modeling in any time domain where data are multivariate, sparse, heteroscedastic, and irregularly spaced.

  • SELDON: Supernova Explosions Learned by Deep ODE Networks

    Open MIND · 2026-03-04

    preprint

  • A Framework for Supervised and Unsupervised Segmentation and Classification of Materials Microstructure Images

    SSRN Electronic Journal · 2025-01-01 · 1 citation

    preprint · Open access

  • End-to-End Automated Segmentation Framework for Four-Dimensional Scanning Transmission Electron Microscopy Data

    Microscopy and Microanalysis · 2025-09-03

    article · Senior author

    Four-dimensional scanning transmission electron microscopy (4D-STEM) is powerful for rapidly characterizing arrays of nanoparticles produced via high-throughput synthesis. However, such 4D-STEM datasets typically contain thousands of nanoparticles, each characterized by thousands of diffraction patterns spatially distributed across the nanoparticle, necessitating efficient and comprehensive analysis. We propose an end-to-end segmentation framework to automatically segment each nanoparticle into regions with distinct composition/orientation of crystal grains, using only the 4D-STEM data. Bragg disk information is extracted in a physics-informed manner from the diffraction patterns at each spatial location and combined with the real space coordinates to form feature vectors. These feature vectors are then used as inputs to a Gaussian mixture model (GMM) to segment the nanoparticle into distinct regions. We also develop two visualization tools based on the GMM outputs to infer the interface transition and the degree of superposition. Our framework comprehensively integrates machine learning tools and physics knowledge, and provides a basis for substantially compressing enormous 4D-STEM datasets, e.g., by replacing the full 4D-STEM dataset for each nanoparticle with only a single set of Bragg disk features for each distinct crystal grain identified in the nanoparticle. We demonstrate the power of our framework by presenting results for real, complex datasets.

  • A Framework for Supervised and Unsupervised Segmentation and Classification of Materials Microstructure Images

    ArXiv.org · 2025-02-10

    preprint · Open access · Senior author

    Microstructure of materials is often characterized through image analysis to understand processing-structure-properties linkages. We propose a largely automated framework that integrates unsupervised and supervised learning methods to classify micrographs according to microstructure phase/class and, for multiphase microstructures, segments them into different homogeneous regions. With the advance of manufacturing and imaging techniques, the ultra-high resolution of imaging that reveals the complexity of microstructures and the rapidly increasing quantity of images (i.e., micrographs) enables and necessitates a more powerful and automated framework to extract materials characteristics and knowledge. The framework we propose can be used to gradually build a database of microstructure classes relevant to a particular process or group of materials, which can help in analyzing and discovering/identifying new materials. The framework has three steps: (1) segmentation of multiphase micrographs through a recently developed score-based method so that different microstructure homogeneous regions can be identified in an unsupervised manner; (2) identification and classification of homogeneous regions of micrographs through an uncertainty-aware supervised classification network trained using the segmented micrographs from Step 1 with their identified labels verified via the built-in uncertainty quantification and minimal human inspection; (3) supervised segmentation (more powerful than the segmentation in Step 1) of multiphase microstructures through a segmentation network trained with micrographs and the results from Steps 1-2 using a form of data augmentation. This framework can iteratively characterize/segment new homogeneous or multiphase materials while expanding the database to enhance performance. The framework is demonstrated on various sets of materials and texture images.

  • Evaluating Acute Stroke Diagnosis Using Simulation Scenarios

    Annals of Emergency Medicine · 2025-04-08 · 1 citation

    article · Open access

  • One-at-a-time knockoffs: controlled false discovery rate with higher power

    ArXiv.org · 2025-02-26

    preprint · Open access · Senior author

    We propose one-at-a-time knockoffs (OATK), a new methodology for detecting important explanatory variables in linear regression models while controlling the false discovery rate (FDR). For each explanatory variable, OATK generates a knockoff design matrix that preserves the Gram matrix by replacing one-at-a-time only the single corresponding column of the original design matrix. OATK is a substantial relaxation and simplification of the knockoff filter by Barber and Candès (BC), which simultaneously generates all columns of the knockoff design matrix to satisfy a much larger set of constraints. To test each variable's importance, statistics are then constructed by comparing the original vs. knockoff coefficients. Under a mild correlation assumption on the original design matrix, OATK asymptotically controls the FDR at any desired level. Moreover, OATK consistently achieves (often substantially) higher power than BC and other approaches across a variety of simulation examples and a real genetics dataset. Generating knockoffs one-at-a-time also has substantial computational advantages and facilitates additional enhancements, such as conditional calibration or derandomization, to further improve power and consistency of FDR control. OATK can be viewed as the conditional randomization test (CRT) generalized to fixed-design linear regression problems, and can generate fine-grained p-values for each hypothesis.

  • Emerging Microelectronic Materials by Design: Navigating Combinatorial Design Space with Scarce and Dispersed Data

    Accounts of Materials Research · 2025-05-05 · 3 citations

    article

    Conspectus: The increasing demands of sustainable energy, electronics, and biomedical applications call for next-generation functional materials with unprecedented properties. Of particular interest are emerging materials that display exceptional physical properties, making them promising candidates for energy-efficient microelectronic devices. As the conventional Edisonian approach becomes significantly outpaced by growing societal needs, emerging computational modeling and machine learning methods have been employed for the rational design of materials. However, the complex physical mechanisms, cost of first-principles calculations, and the dispersity and scarcity of data pose challenges to both physics-based and data-driven materials modeling. Moreover, the combinatorial composition–structure design space is high-dimensional and often disjoint, making design optimization nontrivial. In this Account, we review a team effort toward establishing a framework that integrates data-driven and physics-based methods to address these challenges and accelerate material design. We begin by presenting our integrated material design framework and its three components in a general context. (1) Using text mining and natural language processing techniques, our framework first extracts and organizes relevant information dispersed in the literature. (2) From this initial database of relevant materials, data-driven models can be trained and subsequently employed to perform virtual screening of the unknown materials space. This virtual screening process can identify promising materials families for further investigation, thus narrowing down the candidate space. (3) Within the identified materials families, a Bayesian optimization-based adaptive discovery workflow is applied to search for materials with optimal properties. To extend the capability of Bayesian optimization, which was previously restricted to small data and numerical variables, we developed a family of uncertainty-aware machine learning methods for mixed numerical and categorical variables. We then provide an example of applying this materials design framework to metal–insulator transition (MIT) materials, a specific type of emerging material with practical importance in next-generation memory technologies. We identify multiple new materials that may display this property in the lacunar spinel and Ruddlesden–Popper perovskite families and propose pathways for their synthesis. The classifiers used to identify new possible MIT materials also identified previously unknown features that may be used for predictive theory for this class of materials. For example, we have identified descriptors derived from ionicity and atom sizes as indicators to MIT behavior. Finally, we identify some outstanding challenges in data-driven materials design, such as material data quality issues, property–performance mismatch, and validation and deployment. We seek to raise awareness of these overlooked issues hindering material design, thus stimulating efforts toward developing methods to mitigate the gaps.

  • Measuring Variable Importance via Accumulated Local Effects

    arXiv (Cornell University) · 2025-12-24

    preprint · Open access · Senior author

    A shortcoming of black-box supervised learning models is their lack of interpretability or transparency. To facilitate interpretation, post-hoc global variable importance measures (VIMs) are widely used to assign to each predictor or input variable a numerical score that represents the extent to which that predictor impacts the fitted model's response predictions across the training data. It is well known that the most common existing VIMs, namely marginal Shapley and marginal permutation-based methods, can produce unreliable results if the predictors are highly correlated, because they require extrapolation of the response at predictor values that fall far outside the training data. Conditional versions of Shapley and permutation VIMs avoid or reduce the extrapolation but can substantially deflate the importance of correlated predictors. For the related goal of visualizing the effects of each predictor when strong predictor correlation is present, accumulated local effects (ALE) plots were recently introduced and have been widely adopted. This paper presents a new VIM approach based on ALE concepts that avoids both the extrapolation and the VIM deflation problems when predictors are correlated. We contrast, both theoretically and numerically, ALE VIMs with Shapley and permutation VIMs. Our results indicate that ALE VIMs produce similar variable importance rankings as Shapley and permutation VIMs when predictor correlations are mild and more reliable rankings when correlations are strong. An additional advantage is that ALE VIMs are far less computationally expensive.

  • Fractional Cross-Validation for Optimizing Hyperparameters of Supervised Learning Algorithms

    Technometrics · 2025-06-09 · 2 citations

    article · Senior author · Corresponding
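
The SELDON abstract in the list above describes a decoder whose output is a weighted sum of Gaussian basis functions. As a rough illustration of that one idea only, the sketch below evaluates such a sum and reads a peak off a time grid. It is not the authors' implementation: the function name `gaussian_basis_curve`, the parameter values, and the grid are all hypothetical, and nothing here reproduces the encoder, the neural ODE, or how the paper ties basis parameters to rise time or decay rate.

```python
import math


def gaussian_basis_curve(t, weights, centers, widths):
    """Evaluate a weighted sum of Gaussian basis functions at time t.

    Each basis function is w * exp(-0.5 * ((t - c) / s) ** 2).
    """
    return sum(
        w * math.exp(-0.5 * ((t - c) / s) ** 2)
        for w, c, s in zip(weights, centers, widths)
    )


# Hypothetical decoder output for one light curve (illustrative values only).
weights = [1.0, 0.5]   # amplitude of each basis function
centers = [10.0, 20.0]  # time at which each basis function peaks
widths = [3.0, 6.0]    # spread of each basis function

# Evaluate on a regular time grid (0 to 40 in steps of 0.5) and
# read off the peak value and its time numerically.
grid = [0.5 * i for i in range(81)]
flux = [gaussian_basis_curve(t, weights, centers, widths) for t in grid]
peak_flux = max(flux)
peak_time = grid[flux.index(peak_flux)]
```

Because the curve is an explicit sum of a few bumps, quantities such as the peak value can be computed directly from the parameters or from a cheap grid evaluation, which is the sense in which a basis-function decoder of this kind is easy to inspect.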

Frequent coauthors

  • Wei Chen

    65 shared
  • Anh Tuan Bui

    Virginia Commonwealth University

    15 shared
  • Siyu Tao

    Southwest Jiaotong University

    12 shared
  • Fugee Tsung

    12 shared
  • Jianjun Shi

    Fudan University

    12 shared
  • Paul D. Arendt

    11 shared
  • Zhen Ming Jiang

    11 shared
  • Akshay Iyer

    Indian Institute of Technology Bombay

    10 shared

Awards & honors

  • Wilcoxon Prize for best practical application paper appearin…