Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Natalia de Leon

Professor · Verified

University of Wisconsin-Madison · Plant and Agroecosystem Sciences

Active 1978–2025

h-index: 50
Citations: 10.3k
Papers: 177 (56 in the last 5 years)

About

Natalia de Leon is a researcher prominently involved in maize genetics and breeding, with a focus on quantitative trait dissection, genomic analyses, and the genetic basis of agronomic and compositional traits in maize. Her work includes the development and application of methods for identifying genomic traces of selection and the genetic analysis of traits relevant to maize silage yield, quality, and cell wall composition. She has contributed to studies on the genetic diversity of maize populations, the effects of artificial selection on seed size, and the genetic architecture underlying developmental timing and phenotypic variation in maize. De Leon's research also encompasses the evaluation of maize traits beneficial for bioenergy production, including cellulosic ethanol, and the genetic factors influencing maize endosperm vitreousness and hardness. Her collaborations with graduate students and postdoctoral researchers have resulted in significant advancements in understanding maize genomics, breeding strategies, and the integration of high-throughput sequencing technologies for genotyping and genomic selection. Through her work, de Leon has contributed to the improvement of maize germplasm and the elucidation of genetic mechanisms that support enhanced crop performance and biofeedstock quality.

Research topics

  • Computer Science
  • Biology
  • Genetics
  • Machine Learning
  • Artificial Intelligence
  • Ecology
  • Engineering
  • Data science
  • Pulp and paper industry
  • Organic chemistry
  • Agronomy
  • Environmental science
  • Waste management
  • Geography
  • Medicine
  • Chemistry
  • World Wide Web

Selected publications

  • Designing a nitrogen-efficient cold-tolerant maize for modern agricultural systems

    The Plant Cell · 2025-07-01 · 6 citations

    review · Open access

    Maize (Zea mays L.) is the world's most productive grain crop and a cornerstone of global food supply. However, in temperate agricultural systems, maize exhibits 2 key anomalies. First, as a tropical species, maize cannot be planted in the cold conditions of early spring when light and natural soil nitrogen are available, resulting in a shorter growing season and creating a seasonal mismatch between nitrogen accessibility and demand. Second, maize kernel protein is a major nitrogen sink, driving fertilizer demand because of the scale of cultivation. This inefficient mismatch stems from modern maize's uses and the modest nutritional value of storage proteins. To address these anomalies, we established the Circular Economy that Reimagines Corn Agriculture initiative. Our vision requires advances in 3 research areas: (ⅰ) developing cold and frost tolerance during germination and early growth to enable the use of spring nitrogen and light resources; (ⅱ) reducing nitrogen allocation to grain by reducing low-quality storage proteins and developing alternative nitrogen sinks; and (ⅲ) stabilizing soil nitrogen by enhancing biological nitrification inhibition. We present blueprints for a nitrogen-efficient, cold-tolerant maize designed to utilize the full growing season, enabling farmers in temperate regions to fully leverage maize's C4 photosynthesis, reduce fertilizer inputs, increase yields, and minimize environmental impact.

  • Comparing Conditional Diffusion Models for Synthesizing Contrast-Enhanced Breast MRI from Pre-Contrast Images

    ArXiv.org · 2025-08-19

    preprint · Open access

    Dynamic contrast-enhanced (DCE) MRI is essential for breast cancer diagnosis and treatment. However, its reliance on contrast agents introduces safety concerns, contraindications, increased cost, and workflow complexity. To this end, we present pre-contrast conditioned denoising diffusion probabilistic models to synthesize DCE-MRI, introducing, evaluating, and comparing a total of 22 generative model variants in both single-breast and full breast settings. Towards enhancing lesion fidelity, we introduce both tumor-aware loss functions and explicit tumor segmentation mask conditioning. Using a public multicenter dataset and comparing to respective pre-contrast baselines, we observe that subtraction image-based models consistently outperform post-contrast-based models across five complementary evaluation metrics. Apart from assessing the entire image, we also separately evaluate the region of interest, where both tumor-aware losses and segmentation mask inputs improve evaluation metrics. The latter notably enhance qualitative results capturing contrast uptake, albeit assuming access to tumor localization inputs that are not guaranteed to be available in screening settings. A reader study involving 2 radiologists and 4 MRI technologists confirms the high realism of the synthetic images, indicating an emerging clinical potential of generative contrast-enhancement. We share our codebase at https://github.com/sebastibar/conditional-diffusion-breast-MRI.

  • Comparing Conditional Diffusion Models for Synthesizing Contrast-Enhanced Breast MRI from Pre-contrast Images

    Lecture notes in computer science · 2025-09-21

    book-chapter

  • Mitigating NDVI saturation in imagery of dense and healthy vegetation

    ISPRS Journal of Photogrammetry and Remote Sensing · 2025-06-18 · 18 citations

    article

  • Uncrewed Aerial Vehicle (UAV)-Based High-Throughput Phenotyping of Maize Silage Yield and Nutritive Values Using Multi-Sensory Feature Fusion and Multi-Task Learning with Attention Mechanism

    Remote Sensing · 2025-11-06 · 1 citation

    article · Open access

    Maize (Zea mays L.) silage’s forage quality significantly impacts dairy animal performance and the profitability of the livestock industry. Recently, using uncrewed aerial vehicles (UAVs) equipped with advanced sensors has become a research frontier in maize high-throughput phenotyping (HTP). However, extensive existing studies only consider a single sensor modality and models developed for estimating forage quality are single-task ones that fail to utilize the relatedness between each quality trait. To fill the research gap, we propose MUSTA, a MUlti-Sensory feature fusion model that utilizes MUlti-Task learning and the Attention mechanism to simultaneously estimate dry matter yield and multiple nutritive values for silage maize breeding hybrids in the field environment. Specifically, we conducted UAV flights over maize breeding sites and extracted multi-temporal optical- and LiDAR-based features from the UAV-deployed hyperspectral, RGB, and LiDAR sensors. Then, we constructed an attention-based feature fusion module, which included an attention convolutional layer and an attention bidirectional long short-term memory layer, to combine the multi-temporal features and discern the patterns within them. Subsequently, we employed multi-head attention mechanism to obtain comprehensive crop information. We trained MUSTA end-to-end and evaluated it on multiple quantitative metrics. Our results showed that it is capable of practical quality estimation results, as evidenced by the agreement between the estimated quality traits and the ground truth data, with weighted Kendall’s tau coefficients (τw) of 0.79 for dry matter yield, 0.74 for MILK2006, 0.68 for crude protein (CP), 0.42 for starch, 0.39 for neutral detergent fiber (NDF), and 0.51 for acid detergent fiber (ADF). Additionally, we implemented a retrieval-augmented method that enabled comparable prediction performance, even without certain costly features available. 
    The comparison experiments showed that the proposed approach is effective in estimating maize silage yield and nutritional values, providing a digitized alternative to traditional field-based phenotyping.

  • Distributional Data Analysis Uncovers Hundreds of Novel and Heritable Phenomic Features from Temporal Cotton and Maize Drone Imagery

    bioRxiv (Cold Spring Harbor Laboratory) · 2025-09-07

    preprint · Open access

    Genomic and phenomic analyses suggest additional heritable phenomic features can improve modeling of important end traits like senescence or yield. Field phenotyping generally uses trait values averaged across individual experimental units (plants or numerous plants within plots), ignoring the full distributional pattern of collected measures. Images of plants or plots, as captured by drones (unoccupied aerial vehicles / UAVs / drones), can be viewed as individual distribution functions that capture biological information. This study introduces and validates distributional data analysis in two crops and experiment types: cotton (Gossypium hirsutum L.) single plant vegetation index (VI) analysis and maize (Zea mays L.) plot-level yield predictions. In both crops, the concept of within-day variance decomposition was demonstrated. In cotton, genotypes exerted significant influences on temporal quantile functions of VIs. Maize yield prediction using distributional data with elastic-net regression indicated improvements in yield prediction between 12.7% and 21.6%, with quantiles outside the conventionally used median responsible for added predictive power. A novel data visualization method for per-pixel heritability allowed distributional features to be explainable and interpretable. These results have implications for future plant phenomic studies, indicating that distributional data analysis applied across temporal imagery captures novel, heritable, and interpretable biological signal that is lost when working with conventional measures of central tendency such as mean or median summary values of experimental units.

    Significance: Repeated aerial imaging of agricultural experiments produces image data sets that capture plant development in high spatial and temporal resolutions. Frequently, images are summarized by measures of central tendency, such as mean or median values. Here, functional data distributional methods were applied to cotton (Gossypium hirsutum L.) and maize (Zea mays L.) image data, capturing more information than standard approaches. Cotton genotypes significantly impacted distributional spectral data while in maize, distributional data enabled more accurate predictions of grain yield versus models trained with median data alone. Distributional data were more explainable by genetics, with novel data visualization techniques able to shine light on specific parts of plant imagery with high and low genetic variance.

  • Global genotype by environment prediction competition reveals that diverse modeling strategies can deliver satisfactory maize yield estimates

    Genetics · 2024-11-22 · 17 citations

    article · Open access · Senior author

    Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023, the first open-to-the-public Genomes to Fields initiative Genotype by Environment prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements, and field management notes gathered by the project over 9 years. The competition attracted registrants from around the world with representation from academic, government, industry, and nonprofit institutions as well as unaffiliated. These participants came from diverse disciplines, including plant science, animal science, breeding, statistics, computational biology, and others. Some participants had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner's strategy involved 2 models combining machine learning and traditional breeding tools: 1 model emphasized environment using features extracted by random forest, ridge regression, and least squares, and 1 focused on genetics. Other high-performing teams' methods included quantitative genetics, machine learning/deep learning, mechanistic models, and model ensembles. The dataset factors used, such as genetics, weather, and management data, were also diverse, demonstrating that no single model or strategy is far superior to all others within the context of this competition.

  • Deep Learning-Based High-Throughput Phenotyping of Maize (Zea mays L.) Tasseling from UAS Imagery Across Environments

    bioRxiv (Cold Spring Harbor Laboratory) · 2024-06-27 · 2 citations

    preprint · Open access · Senior author

    Flowering time is a critical phenological trait in maize (Zea mays L.) breeding programs. Traditional measurements for assessing flowering time involve semi-subjective and labor-intensive manual observation, limiting the scale and efficiency of genetics and breeding improvement. Leveraging unoccupied aerial system (UAS, also known as UAVs or drones) technology coupled with convolutional neural networks (CNNs) presents a promising approach for high-throughput phenotyping of tasseling in maize. Most CNN image analysis is overly complicated for simple tasks relevant to plant scientists. Here a methodology for extracting tasseling from RGB imagery using a CNN-based approach was applied to 220 hybrids and 30 test lines grown in eight diverse environments (Wisconsin and Texas, U.S.A.) then validated through an unrelated set of hybrids. Overall accuracies of .946, .911, .985, and .988 were obtained for classifying maize images with or without tassels from College Station, TX in 2020; College Station, TX in 2021; Arlington, WI in 2021; and Madison, WI in 2021 respectively. By employing deep learning techniques, larger volumes of phenotypic data can be processed enabling high-throughput phenotyping in breeding programs. Although large datasets are required to train CNN models, the proposed methodology prioritizes simplicity in computational architecture while maintaining effectiveness in identifying flowered maize across diverse genotypes and environments.

  • Impact of genotype × environment interaction and selection history on genomic prediction in maize (Zea mays L.)

    Crop Science · 2024-10-15 · 3 citations

    article · Open access

    Breeders made remarkable progress in improving productivity and stability of cultivars. Breeding progress relies on selecting favorable alleles for performance and stability to produce productive varieties across diverse environments. In this study, we analyzed the Genomes to Fields Initiative 2018–2019 genotype by environment interaction (G × E) dataset, focusing on three populations of double haploid (DH) lines derived from crossing the expired Plant Variety Protection (ex‐PVP) inbred line PHW65 with inbred lines PHN11, Mo44, and MoG. PHW65 is an Iodent/Lancaster‐type inbred; PHN11 is an Iodent type ex‐PVP line; Mo44 is a tropical‐derived inbred; and MoG is an agronomically poor line derived from the variety Mastadon. Hybrids were produced by crossing the resulting DHs with Stiff Stalk testers PHT69 and LH195. The study's objective was to determine the donor inbreds' relative value and understand the impact of selection history on genomic prediction. We conducted a two‐stage analysis to compare hybrid performance and G × E variance of the populations. G × E variance for yield was significantly lower in the PHW65 × PHN11 population relative to the PHW65 × MoG population. The reduced G × E variance of the PHN11 population led to increased indirect prediction accuracy (when training and testing data are drawn from the same population but different environments). In cross‐validation, the PHN11 population had the greatest indirect prediction accuracy 45% of the time, followed by the Mo44 population (30%) and the MoG population (25%). Results demonstrate that prediction accuracy was greater in the population with the longest history of selection for favorable alleles (PHN11), contributing to greater yield stability.

  • Global Genotype by Environment Prediction Competition Reveals That Diverse Modeling Strategies Can Deliver Satisfactory Maize Yield Estimates

    bioRxiv (Cold Spring Harbor Laboratory) · 2024-09-20 · 3 citations

    preprint · Open access · Senior author · Corresponding

    Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023 the first open-to-the-public Genomes to Fields (G2F) initiative Genotype by Environment (GxE) prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements and field management notes, gathered by the project over nine years. The competition attracted registrants from around the world with representation from academic, government, industry, and non-profit institutions as well as unaffiliated. These participants came from diverse disciplines, including plant science, animal science, breeding, statistics, computational biology and others. Some participants had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner's strategy involved two models combining machine learning and traditional breeding tools: one model emphasized environment using features extracted by Random Forest, Ridge Regression and Least-squares, and one focused on genetics. Other high-performing teams' methods included quantitative genetics, classical machine learning/deep learning, mechanistic models, and model ensembles. The dataset factors used, such as genetics, weather, and management data, were also diverse, demonstrating that no single model or strategy is far superior to all others within the context of this competition.
