Benjamin John Callahan

Associate Professor · Verified

North Carolina State University · Plant and Microbial Biology

Active 2003–2026

h-index: 31
Citations: 38.2k
Papers: 76 (34 in the last 5 years)
Funding: $1.9M (1 active)
About

Benjamin Callahan is an Associate Professor in the Department of Population Health and Pathobiology at NC State University and a member of the Bioinformatics Research Center. He joined NC State in January 2017 as part of the Chancellor’s Faculty Excellence Program cluster hire in Microbiomes and Complex Microbial Communities.

His research focuses on microbiomes, the complex microbial communities that inhabit and interact with nearly every part of the environment. Callahan has developed new computational methods to more accurately, precisely, and reproducibly characterize these communities, particularly through the use of metagenomic sequencing methods. His work aims to understand the microbial contribution to important health and environmental issues, including the pathophysiology of preterm birth, the development of antimicrobial resistance following antibiotic use, and the epidemiology of C. difficile. Additionally, he is interested in microbial adaptation, which he studies through analytics, simulation, and experimental evolution.

Callahan holds a bachelor’s degree in physics and mathematics from Iowa State University and earned his Ph.D. in physics from the University of California, Santa Barbara, under the mentorship of Boris Shraiman. His postdoctoral training includes positions at Stanford University, where he worked with Daniel Fisher on adaptive evolution and with Susan Holmes on quantitative microbiome analysis.

Research topics

  • Biology
  • Computer Science
  • Ecology
  • Genetics
  • Computational biology
  • Microbiology
  • Bioinformatics
  • Evolutionary biology
  • Medicine
  • Internal medicine
  • Computer vision
  • Physiology
  • Pharmacology
  • Geography
  • Veterinary medicine
  • Archaeology
  • Database
  • Nanotechnology

Selected publications

  • Identify contaminants with decontam on the QIIME 2 Framework

    Microbiology Resource Announcements · 2026-04-27

    article · Open access · Senior author

    Here, we present the integration of the decontam method for contaminant identification and a supplemental approach for identifying the source of contaminants in sequencing data within the QIIME 2 Framework for microbiome data science. We demonstrate its use in a tutorial based on the QIIME 2 "Moving Pictures Tutorial" data.

  • Pathogenic bacterial species and the microbiome of cat fleas (Ctenocephalides felis) inhabiting flea-infested homes

    PLoS ONE · 2026-01-30

    article · Open access · Senior author

    BACKGROUND: Ctenocephalides felis is a common ectoparasite of dogs and cats and can transmit a variety of pathogens including Bartonella and Rickettsia species. These bacteria, along with the known endosymbiont Wolbachia, are well-documented members of the C. felis microbiome, but species-level information is limited. Additionally, little is known about the variation in the C. felis microbiome in fleas from different sources and when different sequencing methods are applied to the same samples. OBJECTIVE: This study aimed to characterize the flea microbiome using both short-read (V3/V4) and long-read (full-length) 16S rRNA gene sequencing, determine whether long-read sequencing improves species-level identification especially in known pathogenic genera, and evaluate differences in microbial composition between fleas collected from cats, dogs, and environmental traps. METHODS: Fleas were collected from cats, dogs, and traps in flea-infested homes in Florida, pooled by source, and sequenced using short- (V3/V4) and long-read (full-length) 16S rRNA gene sequencing. Microbial prevalence and abundance were compared across sequencing approaches. Community composition was evaluated for differences between sources and houses. Candidate members of the flea microbiome were identified based on a combination of prevalence, abundance, and statistical signatures of potential contaminant origin. For Rickettsia and Bartonella, species-level taxonomic assignments were refined using a phylogenetic approach. RESULTS: Wolbachia, Rickettsia, and Bartonella were the most prevalent and abundant taxa. Spiroplasma was identified as a fourth core member of the flea microbiome. Long-read sequencing enabled better, but not perfect, species-level classification of Bartonella and Rickettsia compared to short-read sequencing. Important relationships between specific ASVs and flea sources were identified; for example, fleas from cats harbored higher abundances of B. clarridgeiae and B. henselae than fleas from traps.

  • Generating, curating, and evaluating trnL reference sequence databases: Benchmarking OBITools3/ecoPCR, RESCRIPt, and MetaCurator

    bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-10

    article · Open access · Senior author

    Plant DNA metabarcoding enables the identification of plant taxa in mixed samples, with the trnL (UAA) intron and its P6 loop mini-barcode region performing as well as or better than other commonly used markers. Reliable metabarcoding requires high-quality reference databases, yet a regularly maintained trnL resource is currently lacking. Consequently, most studies use uncurated sequences downloaded directly from public repositories without essential validation. We address these gaps by providing guidance through a systematic comparison of three database curation tools (OBITools3/ecoPCR, RESCRIPt, and MetaCurator) to generate three trnL reference sequence databases and evaluate their classification performance across commonly sequenced trnL regions (CD, CH, and GH). Reference trnL sequences and taxonomy files were retrieved from public sequence repositories and curated using standardized filtering steps to reduce taxonomic errors, sequence ambiguity, and redundancy. Four simulated query datasets (two base sets and their mutated counterparts) were constructed to assess classification performance of the databases using the Naïve Bayesian Classifier implemented in DADA2. The evaluation showed that performance differed by trnL region: MetaCurator and RESCRIPt yielded higher and similar metrics for trnL CD; OBITools3/ecoPCR and RESCRIPt were comparable for trnL CH; and MetaCurator attained the highest performance for the trnL GH region. All reference databases, taxonomy, and evaluation files are available at Zenodo (https://doi.org/10.5281/zenodo.17969450). The complete computational workflow and scripts are available on GitHub (https://github.com/oskuddar/trnL_DB). Although evaluation was focused on plant taxa in the United States, the resulting databases are suitable for use as global trnL reference databases.

  • Assessing contamination in DNA extraction kits commonly used for microbiome research

    bioRxiv (Cold Spring Harbor Laboratory) · 2025-09-23

    preprint · Open access · Senior author · Corresponding

    Sequencing-based measurements are now routinely used to investigate the microbial world; however, contamination by DNA outside the intended sample remains a problem. Contaminants obscure the true microbial signal and can lead to misleading scientific interpretations. Much work has been done to address the effects of these contaminants, including best practices as outlined in Eisenhofer et al. and Fierer et al. Yet, even with best practices in place, the current literature consensus is that contaminants remain impactful, at least in low biomass environments (5, 7, 11, 13, 16). One well-known source of contaminants is DNA extraction kits, as was shown clearly in the pioneering work of Salter et al. 2012 and Karstens et al. 2019. However, given the rapid evolution of DNA sequencing methods, it would be worthwhile to revisit the issue of contaminants in contemporary DNA extraction kits (the “kitome”). Here we provide an updated characterization of the “kitomes” of DNA extraction kits commonly used for microbiome research.

    Importance: Microbial contamination in commonly used DNA extraction kits has not been recently assessed. Here we evaluate the contamination in DNA extraction kits commonly used in microbiome studies over the past several years, and provide actionable guidance on appropriate DNA extraction kits for low biomass microbiome measurements.

  • Guidelines for preventing and reporting contamination in low-biomass microbiome studies

    Nature Microbiology · 2025-06-20 · 67 citations

    review · Open access

  • Practical and cost-effective method for the isolation of pollen grains from various sources

    Acta Palaeobotanica · 2025-06-01

    article · Open access

    Mock standards, with known concentrations and varied characteristics, when analyzed alongside unknown samples, can provide evaluation, optimization, and validation of scientific methods. Due to the scarcity of commercially available pollen grains, this study introduces a practical and cost-effective method for isolating pollen grains from various sources to be used in a mock pollen standard. Our method was tested using 25 diverse species derived from different sources, including herbarium materials (n, 20; dated from 1941 to 2006), commercially sourced (n, 2), and fresh hand-collected (n, 3), representing a wide range of taxonomic diversity and pollen morphology. Isolation with vacuum filtration, which can be completed in a basic laboratory, easily removes inorganic and organic debris while avoiding lysis of the pollen grains. This paper details the key steps in this method, including a) collecting suitable plant materials containing pollen grains from fresh and herbarium specimens and b) isolating, quantifying and storing the pollen grains. This approach is particularly beneficial for researchers in palynology, plant biology, forensic science and environmental monitoring, offering a practical way to isolate pollen grains for inclusion as a mock standard while preserving both morphological features and genetic material.

  • Pseudo-pac site sequences used by phage P22 in generalized transduction of Salmonella

    PLoS Pathogens · 2024-06-24 · 2 citations

    article · Open access · Corresponding

    Salmonella enterica Serovar Typhimurium (Salmonella) and its bacteriophage P22 are a model system for the study of horizontal gene transfer by generalized transduction. Typically, the P22 DNA packaging machinery initiates packaging when a short sequence of DNA, known as the pac site, is recognized on the P22 genome. However, sequences similar to the pac site in the host genome, called pseudo-pac sites, lead to erroneous packaging and subsequent generalized transduction of Salmonella DNA. While the general genomic locations of the Salmonella pseudo-pac sites are known, the sequences themselves have not been determined. We used visualization of P22 sequencing reads mapped to host Salmonella genomes to define regions of generalized transduction initiation and the likely locations of pseudo-pac sites. We searched each genome region for the sequence with the highest similarity to the P22 pac site and aligned the resulting sequences. We built a regular expression (sequence match pattern) from the alignment and used it to search the genomes of two P22-susceptible Salmonella strains (LT2 and 14028S) for sequence matches. The final regular expression successfully identified pseudo-pac sites in both LT2 and 14028S that correspond with generalized transduction initiation sites in mapped read coverages. The pseudo-pac site sequences identified in this study can be used to predict locations of generalized transduction in other P22-susceptible hosts or to initiate generalized transduction at specific locations in P22-susceptible hosts with genetic engineering. Furthermore, the bioinformatics approach used to identify the Salmonella pseudo-pac sites in this study could be applied to other phage-host systems.

  • Impact of florfenicol dosing regimen on the phenotypic and genotypic resistance of enteric bacteria in steers

    Scientific Reports · 2024-02-28 · 4 citations

    article · Open access

    The food animal sector's use of antimicrobials is heavily critiqued for its role in allowing resistance to develop against critically important antimicrobials in human health. The WHO recommends using lower tier antimicrobials such as florfenicol for disease treatment. The primary objective of this study was to assess the differences in resistance profiles of enteric microbes following administration of florfenicol to steers using both FDA-approved dosing regimens and two different detection methods. Our hypothesis was that we would identify an increased prevalence of resistance in the steers administered the repeated, lower dose of florfenicol; additionally, we hypothesized resistance profiles would be similar between both detection methods. Twelve steers were administered either two intramuscular (20 mg/kg q 48 h; n = 6) or a single subcutaneous dose (40 mg/kg, n = 6). Fecal samples were collected for 38 days, and E. coli and Enterococcus were isolated and tested for resistance. Fecal samples were submitted for metagenomic sequencing analysis. Metagenomics revealed genes conferring resistance to aminoglycosides as the most abundant drug class. Most multidrug resistance genes contained phenicols. The genotypic and phenotypic patterns of resistance were not similar between drug classes. Observed increases in resistant isolates and relative abundance of resistance genes peaked after drug administration and returned to baseline by the end of the sampling period. The use of a "lower tier" antimicrobial, such as florfenicol, may cause an increased amount of resistance to critically important antimicrobials for a brief period, but these changes largely resolve by the end of the drug withdrawal period.

  • Serovar-level identification of bacterial foodborne pathogens from full-length 16S rRNA gene sequencing

    mSystems · 2024-02-06 · 10 citations

    article · Open access · Senior author

    The resolution of variation within species is critical for interpreting and acting on many microbial measurements. In the key foodborne pathogens Salmonella and Escherichia coli, the primary subspecies classification scheme used is serotyping: differentiating variants within these species by surface antigen profiles. Serotype prediction from whole-genome sequencing (WGS) of isolates is now seen as comparable or preferable to traditional laboratory methods where WGS is available. However, laboratory and WGS methods depend on an isolation step that is time-consuming and incompletely represents the sample when multiple strains are present. Community sequencing approaches that skip the isolation step are, therefore, of interest for pathogen surveillance. Here, we evaluated the viability of amplicon sequencing of the full-length 16S rRNA gene for serotyping Salmonella enterica and E. coli. We developed a novel algorithm for serotype prediction, implemented as an R package (Seroplacer), which takes as input full-length 16S rRNA gene sequences and outputs serovar predictions after phylogenetic placement into a reference phylogeny. We achieved over 89% accuracy in predicting Salmonella serotypes on in silico test data and identified key pathogenic serovars of Salmonella and E. coli in isolate and environmental test samples. Although serotype prediction from 16S rRNA gene sequences is not as accurate as serotype prediction from WGS of isolates, the potential to identify dangerous serovars directly from amplicon sequencing of environmental samples is intriguing for pathogen surveillance. The capabilities developed here are also broadly relevant to other applications where intraspecies variation and direct sequencing from environmental samples could be valuable.

    Importance: In order to prevent and stop outbreaks of foodborne pathogens, it is important that we can detect when pathogenic bacteria are present in a food or food-associated site and identify connections between specific pathogenic bacteria present in different samples. In this work, we develop a new computational technology that allows the important foodborne pathogens Escherichia coli and Salmonella enterica to be serotyped (a subspecies level classification) from sequencing of a single-marker gene, the 16S rRNA gene often used to surveil bacterial communities. Our results suggest current limitations to serotyping from 16S rRNA gene sequencing alone but set the stage for further progress that we consider likely given the rapid advance in the long-read sequencing technologies and genomic databases our work leverages. If this research direction succeeds, it could enable better detection of foodborne pathogens before they reach the public and speed the resolution of foodborne pathogen outbreaks.

  • Gardnerella diversity and ecology in pregnancy and preterm birth

    mSystems · 2024-05-16 · 16 citations

    article · Open access · Senior author

    The vaginal microbiome has been linked to negative health outcomes including preterm birth. Specific taxa, including Gardnerella spp., have been identified as risk factors for these conditions. Historically, microbiome analysis methods have treated all Gardnerella spp. as one species, but the broad diversity of Gardnerella has become more apparent. We explore the diversity of Gardnerella clades and genomic species in the vaginal microbiome of pregnant women and their associations with microbiome composition and preterm birth. Relative abundance of Gardnerella clades and genomic species and other taxa was quantified in shotgun metagenomic sequencing data from three distinct cohorts of pregnant women. We also assessed the diversity and abundance of Gardnerella variants in 16S rRNA gene amplicon sequencing data from seven previously conducted studies in differing populations. Individual microbiomes often contained multiple Gardnerella variants, and the number of clades was associated with increased microbial load, or the ratio of non-human reads to human reads. Taxon co-occurrence patterns were largely consistent across Gardnerella clades and among cohorts. Some variants previously described as rare were prevalent in other cohorts, highlighting the importance of surveying a diverse set of populations to fully capture the diversity of Gardnerella. The diversity of Gardnerella both across populations and within individual vaginal microbiomes has long been unappreciated, as has been the intra-species diversity of many other members of the vaginal microbiome. The broad genomic diversity of Gardnerella has led to its reclassification as multiple species; here we demonstrate the diversity of Gardnerella found within and between vaginal microbiomes.

    Importance: The present study shows that single microbiomes can contain all currently known species of Gardnerella and that multiple similar species can exist within the same environment. Furthermore, surveys of demographically distinct populations suggest that some species appear more commonly in certain populations. Further studies in broad and diverse populations will be necessary to fully understand the ecological roles of each Gardnerella sp., how they can co-exist, and their distinct impacts on microbial communities, preterm birth, and other health outcomes.

Education

  • Ph.D., Physics

    University of California, Santa Barbara

    2009
  • Bachelor of Science, Physics and Mathematics

    Iowa State University

    2002