Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to start · No credit card · Cancel anytime
Sergei L Kosakovsky Pond

Verified

University of California, San Diego · Software Engineering

Active 2002–2026

h-index 22
Citations 3.2k
Papers402 last 5y

About

Sergei L Kosakovsky Pond is an Associate Dean for Research & Innovation at the College of Science & Technology, Temple University. He served as Associate Dean from 1998 to 2003, and his research concerns modeling the evolution of protein-coding DNA sequences. He holds a PhD; his dissertation, advised by Joseph C. Watkins, focused on modeling the evolution of protein-coding DNA sequences. His work emphasizes the development and application of mathematical and computational methods to understand biological evolution, particularly at the molecular level.

Research topics

  • Chemistry
  • Photochemistry
  • Biology
  • Materials science
  • Computational biology

Selected publications

  • Beyond Invariable Sites: Using Evolutionary Stasis to Map Multi-Layered Constraints on the Evolution of Viral and Mammalian Genomes

    bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-10

    article · Open access · 1st author

    The quantification of genomic conservation has progressed from foundational statistical modeling of evolutionary rates to state-of-the-art phylogeny-aware deep learning architectures. Yet, a fundamental resolution gap remains whenever evolutionary rates closely approach the "zero-rate origin," where standard selection inference tools will essentially ignore signals of extreme purifying selection at invariant genome sites. We present B-STILL (Bayesian Significance Test of Invariant Low Likelihoods), a hierarchical Bayesian framework designed to resolve the selective landscape of protein-coding data by leveraging gene-level calibration and codon-site specific evolutionary opportunity. This framework is based on computationally efficient approximations using codon-substitution models which are scalable to alignments with thousands of sequences. By explicitly tuning the stasis radius around the near-zero evolutionary-rate regime, B-STILL distinguishes between stochastic invariance and functional constraint, identifying Evolutionary Stasis Anchors (ESAs) where the upper bound on permitted evolutionary change is statistically anomalous relative to the background of the gene. This hierarchical approach provides a signature of functional or structural constraint that is often difficult to detect using other tools. Validation against extensive pathogen and clinical databases confirms that ESAs are predictors of biological fitness and disease potential. Collectively, we identified thousands of significantly clustered ESAs that precisely footprint both known functional domains and currently uncharacterized structural motifs in mammalian and viral genomes. These findings establish B-STILL as a scalable statistical framework for high-resolution genomic annotation, transforming formerly ignored invariant genome and protein sites into informative markers of extreme purifying selection across both well-characterized and uncharacterized protein-coding genes from different domains of life.

  • Changing the Optics: Comparing Traditional and Retrieval-Augmented GenAI E-Tutorials in Interdisciplinary Learning

    Open MIND · 2026-02-24

    preprint

    Understanding information-seeking behaviors in e-learning is critical, as learners must often make sense of complex and fragmented information, a challenge compounded in interdisciplinary fields with diverse prior knowledge. Compared to traditional e-tutorials, GenAI e-tutorials offer new ways to navigate information spaces, yet how they shape learners' information-seeking behaviors remains unclear. To address this gap, we characterized behavioral differences between traditional and GenAI-mediated e-tutorial learning using the three search modes of orienteering. We conducted a between-subject study in which learners engaged with either a traditional e-tutorial or a GenAI e-tutorial accessing the same underlying information content. We found that the traditional users maintained greater awareness and focus of the information space, whereas GenAI users exhibited more proactive and exploratory behaviors with lower cognitive load due to the querying-driven interaction. These findings offer guidance for designing tutorials in e-learning.

  • Dynamics of natural selection preceding human viral epidemics and pandemics

    Cell · 2026-03-06 · 2 citations

    article · Open access

    Using a phylogenetic framework to characterize natural selection, we investigate the hypothesis that zoonotic viruses require adaptation prior to zoonosis to sustain human-to-human transmission. Examining the zoonotic emergence of Ebola virus, Marburg virus, mpox virus, influenza A virus, and SARS-CoV-2, we find no evidence of a change in selection intensity immediately prior to outbreaks in humans compared with typical selection within reservoir hosts. We found a change in selection on SARS-CoV in an intermediate host. We conclude that extensive pre-zoonotic adaptation is not necessary for human-to-human transmission of zoonotic viruses. In contrast, the reemergence of H1N1 influenza A virus in 1977 was preceded by a shift in selection intensity, consistent with the hypothesis of passage in a laboratory setting. Holistic phylogenetic analysis of selection regimes can be used to detect evolutionary signals of host switching or laboratory passage, providing insight into the circumstances of past and future viral emergence.

  • Changing the Optics: Comparing Traditional and Retrieval-Augmented GenAI E-Tutorials in Interdisciplinary Learning

    arXiv (Cornell University) · 2026-02-24

    article · Open access

  • Viral genome sequence datasets display pervasive evidence of strand-specific substitution biases that are best described using non-reversible nucleotide substitution models

    eLife · 2025-08-28

    article · Open access

    Most phylogenetic trees are inferred using time-reversible evolutionary models that assume that the relative rates of substitution for any given pair of nucleotides are the same regardless of the direction of the substitutions. However, there is no reason to assume that the underlying biochemical mutational processes that cause substitutions are similarly symmetrical. We consider two non-reversible nucleotide substitution models: (1) a 6-rate non-reversible model (NREV6) that is applicable to analysing mutational processes in double-stranded genomes, in that complementary substitutions occur at identical rates and (2) a 12-rate non-reversible model (NREV12) that is applicable to analysing mutational processes in single-stranded (ss) genomes, in that all substitution types are free to occur at different rates. Using likelihood ratio and Akaike information criterion-based model tests, we show that, surprisingly, NREV12 provided a significantly better fit than the general time reversible (GTR) and NREV6 models to 21/31 dsRNA and 20/30 dsDNA datasets. As expected, however, NREV12 provided a significantly better fit to 24/33 ssDNA and 40/47 ssRNA datasets. We tested how non-reversibility impacts the accuracy with which phylogenetic trees are inferred. As simulated degrees of non-reversibility (DNRs) increased, the tree topology inferences using both NREV12 and GTR became more accurate, whereas inferred tree branch lengths became less accurate. We conclude that while non-reversible models should be helpful in the analysis of mutational processes in most virus species, there is no pressing need to use these models for routine phylogenetic inference.

  • A New Comparative Framework for Estimating Selection on Synonymous Substitutions

    Molecular Biology and Evolution · 2025-03-23

    article · Open access · Senior author

    Selection on synonymous codon usage is a well-known and widespread phenomenon, yet existing models often do not account for it or its effect on synonymous substitution rates. In this article, we develop and expand the capabilities of multiclass synonymous substitution (MSS) models, which account for such selection by partitioning synonymous substitutions into 2 or more classes and estimating a relative substitution rate for each class, while accounting for important confounders like mutation bias. We identify extensive heterogeneity among relative synonymous substitution rates in an empirical dataset of ∼12,000 gene alignments from 12 Drosophila species. We validate model performance using data simulated under a forward population genetic simulation, demonstrating that MSS models are robust to model misspecification. MSS rates are significantly correlated with other covariates of selection on codon usage (population-level polymorphism data and tRNA abundance data), suggesting that models can detect weak signatures of selection on codon usage. With the MSS model, we can now study selection on synonymous substitutions in diverse taxa, independent of any a priori assumptions about the forces driving that selection.

  • Minus the Error: Testing for Positive Selection in the Presence of Residual Alignment Errors

    eLife · 2025-06-26 · 1 citation

    preprint · Open access · Senior author

    Positive selection is an evolutionary process which increases the frequency of advantageous mutations because they confer a fitness benefit. Inferring the past action of positive selection on protein-coding sequences is fundamental for deciphering phenotypic diversity and the emergence of novel traits. With the advent of genome-wide comparative genomic datasets, researchers can analyze selection not only at the level of individual genes but also globally, delivering systems-level insights into evolutionary dynamics. However, genome-scale datasets are generated with automated pipelines and imperfect curation that does not eliminate all sequencing, annotation, and alignment errors. Positive selection inference methods are highly sensitive to such errors. We present BUSTED-E: a method designed to detect positive selection for amino acid diversification while concurrently identifying some alignment errors. This method builds on the flexible branch-site random effects model (BUSTED) for fitting distributions of dN/dS, with a critical modification: it incorporates an “error-sink” component to represent an abiological evolutionary regime. Using several genome-scale biological datasets that were extensively filtered using state-of-the-art automated alignment tools, we show that BUSTED-E identifies pervasive residual alignment errors, produces more realistic estimates of positive selection, reduces bias, and improves biological interpretation. The BUSTED-E model promises to be a more stringent filter to identify positive selection in genome-wide contexts, thus enabling further characterization and validation of the most biologically relevant cases.

  • MoleRate: comparing molecular relative evolutionary rates to detect convergent evolution

    Evolution · 2025-12-11

    article · Open access · Senior author

    In comparative evolutionary genomics, faster or slower evolution of a particular gene, site, or branch in a phylogenetic tree, when compared to the appropriate average, has been interpreted as evidence of conservation, functional importance, or adaptation. With large consortia generating hundreds of genomes, there is an opportunity to interrogate these datasets for evidence of accelerated or reduced evolutionary rates in protein-coding genes associated with the presence or absence of a given phenotype (e.g., marine vs. terrestrial, nocturnal vs. diurnal). Such rate shifts can reflect the molecular basis of convergent phenotypic adaptation when they occur repeatedly across independent lineages. Here, we introduce an explicit phylogenetic rate test, MoleRate, for acceleration or reduction of nucleotide or protein evolutionary rates in focal lineages vs. the rest of the phylogeny. Compared to existing methods, MoleRate offers execution, explicit likelihood-based hypothesis testing, and the ability to detect and filter out potentially aberrant signal from single lineages. We demonstrate MoleRate's performance on simulated and empirical data, and apply it to several mammalian phenotypes. We also highlight its visualization capabilities, which enable exploration and communication of results. These analyses show that MoleRate detects biologically significant enrichments in selective pressure on specific functions related to the given phenotype, and that these enrichments are absent when random lineages are tested.

  • Author response: Viral genome sequence datasets display pervasive evidence of strand-specific substitution biases that are best described using non-reversible nucleotide substitution models

    2025-09-30

    peer-review · Open access

  • HIV-1 Rebound Virus Consists of a Small Number of Lineages That Entered the Reservoir Close to ART Initiation

    bioRxiv (Cold Spring Harbor Laboratory) · 2025-01-31 · 3 citations

    preprint · Open access

    HIV-1 persists as a latent reservoir during suppressive antiretroviral therapy (ART). Viral rebound occurs upon ART interruption, posing a challenge to cure efforts. Characterizing viral populations fuelling rebound is imperative to curing HIV-1. We used longitudinal samples collected pretherapy from women in the CAPRISA 002 cohort to create an evolutionary timeline to determine the pretherapy timepoint when the rebound virus originally entered the long-lived reservoir. Participants (N=10) were untreated for an average of 5 years then on ART for an average of 2 years before viral rebound (defined as >1000 RNA copies/ml). env sequences were used to characterize the longitudinal pre-ART evolving viral RNA population, the proviral DNA reservoir during ART, and viral RNA in the plasma during rebound. For each participant, between 1 and 3 major viral lineages were identified in the plasma during rebound. A total of 20 rebound virus lineages were examined for the 10 participants, and 19 were found to have entered the reservoir around the time of therapy initiation. The one lineage estimated to enter the reservoir more than a year before therapy was observed in a participant who was untreated for more than 8 years, yet retained moderate CD4 T cell counts. Analysis of the viral DNA reservoir, from which the rebound viruses emanated, revealed that while 95% of rebounding lineages dated to the year before ART initiation, only 61% of unique proviruses dated to that time period. Strikingly, for three participants with DNA reservoirs dominated by viruses from earlier in untreated infection, only 33% of unique proviruses dated to the year before ART initiation, yet 83% of rebounding lineages dated to that time. Our results show that rebound virus almost exclusively comes from the portion of the latent reservoir that formed around the time of therapy initiation, even when the reservoir is composed of diverse sequences from across the pre-ART time period.

    Author Summary: HIV-1 is maintained in a long-lived reservoir during suppressive therapy. Virus rebounds if therapy is discontinued. We found that in most cases rebound virus comes from a pool of viral sequences that entered the long-lived reservoir around the time of therapy initiation. While the viral DNA reservoir is on average also skewed toward sequences replicating around the time of therapy initiation, the rebound virus almost exclusively comes from this portion of the latent reservoir, even when the reservoir contained proviruses from much earlier in untreated infection. Thus, we hypothesize that there are features of the viruses forming the latent reservoir around the time of therapy initiation, or features of the host at that time, that select these viruses as initiators of rebound during therapy discontinuation.

Education

  • PhD, Applied Mathematics

    University of Arizona

    2003