
Louis-Marie Jean Fabrice Bobay

Assistant Professor · Verified

North Carolina State University · Plant and Microbial Biology

Active 2011–2026

h-index: 33
Citations: 4.2k
Papers: 66 (23 in the last 5 years)
Funding: $1.5M (1 active)

Research topics

  • Evolutionary biology
  • Biology
  • Genetics
  • Computational biology
  • Ecology
  • Computer Science
  • Artificial Intelligence
  • Astrobiology
  • Virology
  • Programming language

Selected publications

  • DEX: a consensus-based amino acid exchangeability measure for improved codon substitution modelling

    bioRxiv (Cold Spring Harbor Laboratory) · 2026-03-12

    article · Open access · Senior author

    Abstract: Physicochemically similar amino acids undergo more frequent substitutions compared to dissimilar amino acid pairs. Despite their clear potential, amino acid similarity matrices remain underused in molecular evolution, partially due to the high number of proposed amino acid distance measures and the lack of agreement on which are most accurate. In this study, we assessed the performance of 30 amino acid distance measures, including a new amino acid distance measure we developed based on recent deep mutational scanning data. We compared these measures across codon substitution models fit to alignments spanning Streptococcus, Drosophila, and mammalian lineages, as well as segregating variants across Escherichia coli strains and human genotypes. We further constructed consensus measures from combinations of top-performing measures in this analysis using the DISTATIS approach and retested these matrices. Our results show that experimentally-derived measures, particularly our new measure and the existing experimental exchangeability (EX) measure, best fit codon substitution patterns across diverse lineages. We found that a consensus measure based on these two approaches, which we named DEX, performed best overall. In addition, although site-specific variant effect predictors are intended to identify deleterious mutations, the representative tools we tested did not outperform amino acid distance measures for predicting mean substitution frequencies. They were however substantially more informative for identifying individual highly deleterious mutations. Overall, we provide a systematic comparison of the performance of existing measures, and we introduce an improved general-purpose amino acid distance measure for molecular evolution models.

    Significance: Protein-coding genes have long been a focus for researchers studying the strength and direction of selection. By studying non-synonymous substitutions, those that change amino acids, it is possible to estimate the relative strength of selection. Despite widespread interest in such approaches, information on which amino acids are exchanged is underused in molecular evolution models. This is partly because many different measures exist for quantifying amino acid distances, particularly those based on physicochemical properties. A newer class of amino acid distance measures is derived from deep mutational scanning datasets, where virtually every possible substitution is tested for its impact on protein function. We characterised and compared 30 amino acid distance measures, including a novel measure based on deep mutational scanning data. We highlight differences in how well these measures fit real substitution and polymorphism datasets. Overall, we find that DEX, which is a consensus of our new measure and an existing experimental exchangeability measure, represents the best available amino acid distance measure to incorporate into molecular evolution models.

  • Investigating amino acid distance measures in molecular evolution models, and proposing a new consensus measure

    Open MIND · 2026-03-09

    dataset · Senior author

    Key datafiles for the manuscript entitled "DEX: a consensus-based amino acid exchangeability measure for improved codon substitution modelling" by Gavin Douglas and Louis-Marie Bobay. The key table most readers will be interested in is "all_distance_measures_symmetric.tsv.gz". This is a tab-delimited table with the pairwise amino acid distances based on all the measures we evaluated. Each row corresponds to a different amino acid pair, but note that the distances are symmetric for all measures (i.e., those with asymmetric distances between amino acids were averaged to be symmetric). The final columns in this table that indicate combined measures with "+" are non-focal DISTATIS consensus measures. The following subdirectories are in the compressed folder called "workflow_files" and contain the key files for running our analyses:

      • aa_metrics: Working files for processing and analyzing AA distance/similarity measures. Note that those interested in the final measures should use "all_distance_measures_symmetric.tsv.gz".
      • allele_freq_vs_predicted_effects: Key files used for analyzing segregating non-synonymous polymorphisms.
      • PAML_workflow: Files for fitting codon substitution models with PAML.
      • proteinGym: Files from the proteinGym database used for producing the custom DMS-EX measure.

  • MetaStrainer: Accurate reconstruction of bacterial strain genotypes from short-read metagenomic samples

    bioRxiv (Cold Spring Harbor Laboratory) · 2026-03-03

    article · Open access · Senior author · Corresponding

    Summary: Metagenomics provides broad insights from microbial communities, but more biologically relevant phenotypes are attributed to subtle changes at the strain level rather than the species level. Despite the development of several tools using different algorithms, resolving individual strains from short-read paired-end sequencing data remains challenging. We developed MetaStrainer, a tool capable of reconstructing strain genotypes from metagenomic data. Compared with existing approaches, MetaStrainer substantially increases genotype accuracy, correctly identifies the number of strains, and accurately estimates their relative abundances. Accuracy of reconstructed genotypes is robust to choice of mapping reference.

    Availability and implementation: MetaStrainer is implemented in Python 3. Source code and instructions are available on GitHub at https://www.github.com/lbobay/MetaStrainer and on Zenodo: https://doi.org/10.5281/zenodo.17872331

    Contact: ljbobay@ncsu.edu

    Supplementary information: Supplementary data is available at Bioinformatics online.

  • Investigating amino acid distance measures in molecular evolution models, and proposing a new consensus measure

    Zenodo (CERN European Organization for Nuclear Research) · 2026-03-09

    dataset · Open access · Senior author


  • Prevalence and Evolutionary Implications of Genome Rearrangements in Bacteria

    Genome Biology and Evolution · 2026-01-14 · 1 citation

    article · Open access · Senior author

    The genetic material of bacteria and archaea is organized into various structures and setups, attesting that genome architecture is dynamic in these organisms. However, strong selective pressures are also acting to preserve genome organization, and it remains unclear how frequently genomes experience rearrangements and what mechanisms lead to these processes. Here, we assessed the dynamics and the drivers of genomic rearrangements across 121 microbial species. We show that synteny is highly conserved within most species, although several species present exceptionally flexible genomic layouts. Our results show that genomic rearrangements occur at a variable pace across bacteria and archaea, pointing to different selective constraints driving the accumulation of genomic changes across species. Importantly, we found that not only inversions but also translocations are highly enriched near the origin of replication (Ori), which suggests that many rearrangements may confer an adaptive advantage to the cell through the relocation of genes that benefit from gene dosage effects. Finally, our results confirm the view that mobile genetic elements, in particular transposable elements, are the main drivers of genomic translocations and inversions. Overall, our study shows that microbial species present largely stable genomic layouts and identifies key patterns and drivers of genome rearrangements in prokaryotes.

  • Co-occurrence is associated with horizontal gene transfer across marine bacteria independent of phylogeny

    bioRxiv (Cold Spring Harbor Laboratory) · 2025-03-28

    preprint · Open access

    Understanding the drivers and consequences of horizontal gene transfer (HGT) is a key goal of microbial evolution research. Although co-occurring taxa have long been appreciated to undergo HGT more often, this association is confounded with other factors, most notably their phylogenetic relatedness. To disentangle these factors, we analyzed 15,339 marine prokaryotic genomes (mainly bacteria) and their distribution in the global ocean. We identified HGT events across these genomes and enrichments for functions previously shown to be prone to HGT. By mapping metagenomic reads from 1,862 ocean samples to these genomes, we also identified co-occurrence patterns and environmental associations. Although we observed an expected negative association between HGT rates and phylogenetic distance, we only detected an association between co-occurrence and phylogenetic distance for closely related taxa. This observation refines the previously reported trend to closely related taxa, rather than a consistent pattern across all taxonomic levels, at least here within marine environments. In addition, we identified a significant association between co-occurrence and HGT, which remains even after controlling for phylogenetic distance and measured environmental variables. In a subset of samples with extended environmental data, we identified higher HGT levels associated with particle-attached bacteria and associations of varying directions with specific environmental variables, such as chlorophyll a and photosynthetically available radiation. Overall, our findings demonstrate the significant influence of ecological associations in shaping marine bacterial evolution through HGT.

  • Applying the Classic Test dN/dS to Detect Selection in Archaea

    Methods in molecular biology · 2025-01-01 · 1 citation

    article · Senior author
  • Introgression impacts the evolution of bacteria, but species borders are rarely fuzzy

    Nature Communications · 2025-11-13 · 1 citation

    article · Open access · Senior author

    Most bacteria engage in gene flow through homologous recombination, and this mechanism may play a crucial role in maintaining species cohesiveness, much like sexual reproduction does in eukaryotes. However, introgression has been reported in bacteria and is associated with fuzzy species borders in some lineages, but its prevalence and impact on the delimitation of bacterial species have not been systematically characterized. Here, we use the term “introgression” to describe gene flow between the genomic backbone of distinct species (i.e., their core genomes), an analogy to the classical usage in sexual organisms, but distinct in mechanism. We quantified the patterns of introgression across 50 major bacterial lineages. Our results reveal that bacteria present various levels of introgression, with an average of 2% of introgressed core genes and up to 14% in Escherichia–Shigella. Furthermore, our results show that some species are more prone to introgression than others within the same genus, and introgression is most frequent between highly related species. We found evidence that the various levels of introgression across lineages are likely associated with sequence relatedness, but the impact of ecology on this process was less clear. Introgression can occasionally lead to fuzzy species borders, although many of these cases are likely instances of ongoing speciation. Overall, our results indicate that introgression has substantially shaped the evolution and the diversification of bacteria, but this process does not substantially blur species borders.

    Editor's summary: It is commonly thought that bacterial species borders tend to be fuzzy, due to frequent exchange of DNA. Here, Diop et al. quantify the patterns of gene flow between core genomes across 50 major bacterial lineages, showing that defining species using a framework inspired by the Biological Species Concept makes it possible to identify clear species borders in most lineages.

  • Co-occurrence is associated with horizontal gene transfer across marine bacteria independent of phylogeny

    The ISME Journal · 2025-12-10 · 1 citation

    article · Open access

    Understanding the drivers and consequences of horizontal gene transfer (HGT) is a key goal of microbial evolution research. Although co-occurring taxa have long been appreciated to undergo HGT more often, this association is confounded with other factors, most notably their phylogenetic relatedness. To disentangle these factors, we analyzed 15 339 marine prokaryotic genomes (mainly bacteria) and their distribution in the global ocean. We identified HGT events across these genomes and enrichments for functions previously shown to be prone to HGT. By mapping metagenomic reads from 1862 ocean samples to these genomes, we also identified co-occurrence patterns and environmental associations. Although we observed an expected negative association between HGT rates and phylogenetic distance, we only detected an association between co-occurrence and phylogenetic distance for closely related taxa. This observation refines the previously reported trend to closely related taxa, rather than a consistent pattern across all taxonomic levels, at least here within marine environments. In addition, we identified a significant association between co-occurrence and HGT, which remains even after controlling for phylogenetic distance and measured environmental variables. In a subset of samples with extended environmental data, we identified higher HGT levels associated with particle-attached prokaryotes and associations of varying directions with specific environmental variables, such as chlorophyll a and photosynthetically available radiation. Overall, our findings demonstrate the significant influence of ecological associations in shaping marine prokaryotic evolution through HGT.

  • Introgression impacts the evolution of bacteria, but species borders are rarely fuzzy

    bioRxiv (Cold Spring Harbor Laboratory) · 2024-05-09

    preprint · Open access · Senior author · Corresponding

    Most bacteria engage in gene flow, and this may act as a force maintaining species cohesiveness, as it does in sexual organisms. However, introgression (gene flow between the genomic backbone of distinct species) has been reported in bacteria and is associated with fuzzy species borders in some lineages, but its prevalence and impact on the delimitation of bacterial species have not been systematically characterized. Here, we quantified the patterns of introgression across 50 major bacterial lineages. Our results reveal that bacteria present various levels of introgression, with an average of 2% of introgressed core genes and up to 12% in Campylobacter. Furthermore, our results show that some species are more prone to introgression than others within the same genus, and introgression is most frequent between highly related species. We found evidence that the various levels of introgression across lineages are likely related to ecological proximity between species. Introgression can occasionally lead to fuzzy species borders, although many of these cases are likely instances of ongoing speciation. Overall, our results indicate that introgression has substantially shaped the evolution and the diversification of bacteria, but this process does not substantially blur species borders.
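The DEX dataset entries listed above note that measures with asymmetric distances between amino acids were averaged to be symmetric. A minimal sketch of that averaging step follows, assuming distances are held in a dictionary keyed by ordered residue pairs. The function name `symmetrize`, the data layout, and the toy values are illustrative assumptions, not the authors' code or data.

```python
from itertools import combinations


def symmetrize(dist):
    """Average d(a, b) and d(b, a) so each unordered pair gets one distance.

    `dist` maps ordered (residue, residue) pairs to a distance. This layout is
    an assumption made for illustration, not the format of the published files.
    """
    residues = sorted({aa for pair in dist for aa in pair})
    sym = {}
    for a, b in combinations(residues, 2):
        sym[(a, b)] = (dist[(a, b)] + dist[(b, a)]) / 2
    return sym


# Toy, made-up distances for three residues (not real measurements).
asymmetric = {
    ("A", "G"): 0.25, ("G", "A"): 0.75,
    ("A", "W"): 1.0, ("W", "A"): 0.5,
    ("G", "W"): 0.5, ("W", "G"): 1.5,
}

symmetric = symmetrize(asymmetric)

# One row per unordered amino acid pair, tab-delimited.
for (a, b), d in symmetric.items():
    print(f"{a}\t{b}\t{d}")
```

Each unordered pair ends up with a single value, which matches the one-row-per-pair layout the dataset description gives for the symmetric table.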


Frequent coauthors

  • Marie Touchon

    Centre National de la Recherche Scientifique

    32 shared
  • Eduardo P. C. Rocha

    Université Paris Cité

    32 shared
  • Anne Chevallereau

    Inserm

    20 shared
  • Caroline M. Stott

    North Carolina State University

    15 shared
  • Florian Douam

    Boston University

    15 shared
  • Awa Diop

    North Carolina State University

    15 shared
  • François‐Loïc Cosset

    École Normale Supérieure de Lyon

    15 shared
  • Howard Ochman

    The University of Texas at Austin

    15 shared