
Todd Treangen
Assistant Professor of Computer Science · Rice University · Computer Science
Active 2004–2026
About
Todd Treangen is an Associate Professor of Computer Science at Rice University and a member of the Ken Kennedy Institute. He received his BSc in Computer Science from the University of Nebraska, and both MSc and Ph.D. degrees in Computer Science from the Polytechnic University of Catalonia in Spain, where he was awarded the Europeus Doctor Distinction, the highest honor. He joined Rice University as an Assistant Professor in July 2018. Prior to his current position, he was a Research Scientist at the Center for Bioinformatics and Computational Biology at the University of Maryland, College Park, and a Principal Investigator at the National Biodefense Analysis and Countermeasures Center. His primary research interests lie at the intersection of computer science and genomics, focusing on developing novel computational methods and software tools relevant to real-time monitoring of microbial community dynamics, infectious diseases, and biothreats. His research group aims to address computational challenges posed by the metagenomic data deluge, especially in contexts such as tracking pandemics and synthetic DNA screening, by designing efficient algorithms, heuristics, and data structures. The lab is also dedicated to disseminating open-source bioinformatics methods and providing research opportunities for Rice undergraduates. He received the NSF CAREER Award in 2023.
Research topics
- Computer Science
- Biology
- Computational biology
- Artificial Intelligence
- Genetics
- Internal medicine
- Virology
- Medicine
- Bioinformatics
- Data Mining
- Sociology
- Immunology
- Environmental science
- Data science
- Demography
- Physics
- Environmental engineering
- Physiology
- Telecommunications
Selected publications
arXiv (Cornell University) · 2026-03-25
preprint · Open access
We study the problem of building space-efficient, in-memory indexes for massive key-value datasets with highly skewed value distributions. This challenge arises in many data-intensive domains and is particularly acute in computational genomics, where $k$-mer count tables can contain billions of entries dominated by a single frequent value. While recent work has proposed to address this problem by augmenting compressed static functions (CSFs) with pre-filters, existing approaches rely on complex heuristics and lack formal guarantees. In this paper, we introduce a principled algorithm, called AutoCSF, for combining CSFs with pre-filtering to provably handle skewed distributions with near-optimal space usage. We improve upon prior CSF pre-filtering constructions by (1) deriving a mathematically rigorous decision criterion for when filter augmentation is beneficial; (2) presenting a general algorithmic framework for integrating CSFs with modern set membership data structures beyond the classic Bloom filter; and (3) establishing theoretical guarantees on the overall space usage of the resulting indexes. Our open-source implementation of AutoCSF demonstrates space savings over baseline methods while maintaining low query latency.
medRxiv · 2026-02-02
article · Open access
Candida auris is a multidrug-resistant fungal pathogen that presents substantial challenges for healthcare facilities due to its high mortality rates among vulnerable populations. Six C. auris clades have been identified based on their susceptibility to antifungal treatment and environmental stressors. Identifying the circulating C. auris clade(s) is critical for understanding transmission and selecting a disease control strategy. To inform targeted implementation of community wastewater monitoring for C. auris, samples were collected over 34 weeks from 8 nursing homes and 6 downstream wastewater treatment plants (WWTPs). Detection rates and concentrations of C. auris DNA were significantly higher in samples from nursing homes compared to those from WWTPs. Amplicon sequencing methods were developed and applied to characterize the circulating C. auris clade in a nursing home wastewater sample. This study demonstrates the utility of wastewater monitoring as a resource-efficient approach for detecting and subtyping C. auris in vulnerable communities.
Antibiotic-induced gut microbiome remodeling reduces neuroinflammation in traumatic brain injury
Communications Biology · 2026-02-25 · 2 citations
article · Open access
Traumatic brain injury induces neuroinflammation and gut microbiome dysbiosis, yet the effects of short-term antibiotic treatment on these processes remain poorly understood. To address this, male mice received controlled brain injuries followed by a brief course of oral antibiotics. Antibiotic treatment reduced bacterial abundance in feces and altered microbial diversity, with more pronounced shifts after two injuries. Despite this disruption, antibiotic-treated mice exhibited smaller lesion volumes, reduced cell death, attenuated microglial and macrophage activation, lower pro-inflammatory cytokine levels, and decreased astrogliosis and peripheral immune cell infiltration compared with vehicle-treated mice after two injuries. In the gut, increasing injury severity was associated with villus shortening and loss of mucus-producing cells, and antibiotic treatment further modified these injury-related changes. Circulating levels of short-chain fatty acids and associated microbial metabolic functions were reduced by antibiotic exposure. In contrast, germ-free mice showed increased lesion volumes and exacerbated gliosis following brain injury. Long-read metagenomic sequencing identified Parasutterella excrementihominis and Lactobacillus johnsonii as taxa that persisted despite antibiotic treatment. Collectively, these results suggest that antibiotics can reduce brain damage after injury through mechanisms not explained by short-chain fatty acids, while also highlighting potential drawbacks of altering the gut microbiome.
Ultra-high-throughput mapping of genetic design space
Nature · 2026-01-14 · 4 citations
article
Min-frame transformation enables more sensitive viral genome alignment
bioRxiv (Cold Spring Harbor Laboratory) · 2026-05-22
article · Open access · Senior author · Corresponding
Motivation: Maximal unique matches (MUMs) are a fundamental primitive in genome comparison, where they serve as high-confidence anchors for downstream multiple genome alignment. However, because MUMs rely on exact string matching, their effectiveness degrades with increased genome divergence and larger sets of genomes, inhibiting their ability to recover long homologous regions and reducing the number of base pairs covered by the multiple genome alignment. Additionally, existing approaches that improve robustness to mutation, such as spaced seeds or translated alignment methods, introduce trade-offs in specificity, scalability, or computational complexity. Methods: To address this gap, we introduce the Min-Frame Transformation (MFT), a deterministic encoding of nucleotide sequences to sequences over a transformed alphabet that preserves the coordinate structure of the original sequence. At each position, the MFT selects a $k$-mer from a local window according to a fixed global ordering and assigns it a character in the transformed alphabet via a predefined mapping. This process captures local sequence context and can mask the impact of mutations, increasing the likelihood that homologous regions remain detectable as exact matches. The resulting transformed sequences can be indexed using standard string data structures, such as suffix arrays and suffix trees, enabling efficient extraction of MUMs without modifying existing algorithms. Impact: The MFT is a novel computational approach for improving the robustness of MUM-based seeding for genome alignment by producing longer and more contiguous matches that span a greater fraction of the genome, leading to improved alignment coverage and SNP recall. Altogether, these improvements have the potential to result in improvements for downstream viral genome analysis applications such as phylogenetic inference and transmission analysis.
Journal of Neuroinflammation · 2025-04-20 · 22 citations
article · Open access
BACKGROUND: Recent studies have highlighted the potential influence of gut dysbiosis on traumatic brain injury (TBI) outcomes. Alterations in the abundance and diversity of Lactobacillus species may affect immune dysregulation, neuroinflammatory responses, anxiety- and depressive-like behaviors, and neuroprotective mechanisms activated in response to TBI. OBJECTIVE: This study aims to evaluate the protective and preventive effects of Pan-probiotic (PP) treatment on the inflammatory response during both the acute and chronic phases of TBI. METHODS: Male and female mice underwent controlled cortical impact (CCI) injury or sham. They received a PP mixture in drinking water containing strains of Lactobacillus plantarum, L. reuteri, L. helveticus, L. fermentum, L. rhamnosus, L. gasseri, and L. casei. In the acute group, mice received PP or vehicle (VH) treatment for 7 weeks before TBI, continuing until 3 days post-injury (dpi). In the chronic group, treatment began 2 weeks before TBI and was extended through 35 dpi. The taxonomic microbiome profiles of fecal samples were evaluated using 16S rRNA V1-V3 sequencing analysis, and short-chain fatty acids (SCFAs) were measured. Immunohistochemical, in situ hybridization, and histological analyses were performed to assess neuroinflammation post-TBI, while behavioral assessments were conducted to evaluate sensorimotor and cognitive functions. RESULTS: Our findings suggest that a 7-week PP administration induces specific microbial changes, including increased abundance of beneficial bacteria such as Lactobacillaceae, Limosilactobacillus, and Lactiplantibacillus. PP treatment reduces lesion volume and cell death at 3 dpi, elevates SCFA levels at 35 dpi, and decreases microglial activation at both time points, particularly in males. Additionally, PP treatment improved motor recovery in males and alleviated depressive-like behaviors in females. CONCLUSION: Our findings indicate that PP administration modulates microbiome composition, reduces neuroinflammation, and improves motor deficits following TBI, with these effects being particularly pronounced in male mice.
Strainify: Strain-Level Microbiome Profiling for Low-Coverage Short-Read Metagenomic Datasets
bioRxiv (Cold Spring Harbor Laboratory) · 2025-10-13 · 1 citation
preprint · Open access · Senior author · Corresponding
Motivation: Strain-level microbiome profiling has revealed key insights into microbial community composition and strain dynamics. However, accurate strain-level analysis remains challenging due to limited linkage information, ambiguous read mapping, and complicating factors such as genome similarity, sequencing depth, and community complexity. These challenges are especially pronounced for short-read metagenomic data when estimating the relative abundances of multiple strains, a task critical for genotype-phenotype association studies. Results: To address this gap, we present Strainify, which enables accurate strain-level abundance estimation from short-read metagenomes with as little as 1% genome coverage. Specifically, Strainify combines (1) identification of informative variants via core genome alignment, (2) filtering of confounding variants via a window-based test, and (3) maximum likelihood estimation of strain abundances. A Shannon entropy-weighted version of the model further improves robustness in noisy, low-coverage settings by downweighting sites with low information content. Across simulated communities of varying complexity, Strainify consistently outperformed existing approaches. On mock community sequencing data, Strainify’s estimates aligned more closely with reference abundances. When applied to a longitudinal gut microbiome dataset, Strainify successfully recapitulated the reported temporal dynamics of Bacteroides ovatus strain groups, demonstrating its ability to recover biologically meaningful patterns from real-world metagenomes. Together, these results establish Strainify as a robust and versatile solution for accurate strain-level abundance estimation in short-read, low-coverage microbiome studies. Availability: The Strainify code and results are available at: https://github.com/treangenlab/Strainify
Inter-tool Analysis of a NIST Dataset for Assessing Baseline Nucleic Acid Sequence Screening
Applied Biosafety · 2025-12-26
article
bronko: ultrafast, alignment-free detection of viral genome variation
bioRxiv (Cold Spring Harbor Laboratory) · 2025-12-02
article · Open access · Senior author · Corresponding
As viral sequencing datasets continue to grow, traditional alignment-based variant calling pipelines are becoming computationally prohibitive. To address these challenges, we developed bronko, an ultrafast alignment-free framework for detecting viral variation directly from sequencing data. The novel computational approach implemented in bronko allows scaling to massive viral sequencing datasets and has three key components: i) a locality-sensitive bucketing function to rapidly identify single-nucleotide polymorphisms (SNPs) relative to reference(s), ii) a direct k-mer count pseudo-mapping approach that approximates a pileup without alignment, and iii) a streaming-based sliding window outlier test to estimate baseline noise across the genome and precisely differentiate real minor variants from noise. Together, these components yield near-linear computational complexity with respect to sequencing depth, enabling bronko to process thousands of viral samples rapidly on modest hardware. Our results are threefold: 1) on simulated amplicon sequencing, bronko recovers variants with higher precision and comparable recall to existing tools while running one to three orders of magnitude faster; 2) bronko generates sequence alignments directly from sequencing data, with SNP content similar to that of whole-genome alignment while also running in a fraction of the time; and 3) applying bronko to longitudinal sequencing data from chronically infected SARS-CoV-2 patients revealed consistent patterns of intrahost diversification and adaptive mutations over time. Altogether, these results demonstrate bronko's potential as a scalable tool for large-scale viral genomic analyses, overcoming longstanding computational barriers for intrahost and interhost characterization of viral variation.
Frequent coauthors
- Mihai Pop · University of Maryland, College Park · 42 shared
- Sergey Koren · National Human Genome Research Institute · 37 shared
- R. A. Leo Elworth · Rice University · 36 shared
- Nicolae Sapoval · Rice University · 35 shared
- Sonia Villapol · Houston Methodist · 31 shared
- Kjersti M. Aagaard · Kamuzu Central Hospital · 30 shared
- Yunxi Liu · Rice University · 30 shared
- Michael D. Jochum · Baylor College of Medicine · 30 shared
Labs
Treangen lab at Rice University · PI
Treangen lab members
Awards & honors
- NSF CAREER Award, 2023