University of Pennsylvania · Genetics
Active 2015–2026
Glennis Logsdon, Ph.D., is an Assistant Professor of Genetics at the University of Pennsylvania's Perelman School of Medicine. She is a core member of the Penn Epigenetics Institute and a member of the Penn Center for Global Genomics and Health Equity. Her research focuses on using long-read sequencing, innovative computational methods, and synthetic biology approaches to investigate the sequence and structure of regions of the human genome that have remained unresolved for the past two decades, with a particular emphasis on centromeres. Her work has contributed to the complete sequencing of all human centromeres and the human genome, enabling the study of their variation, evolution, and role in disease. During her postdoctoral training, Dr. Logsdon developed long-read sequencing methods and novel computational approaches that led to the first complete sequence of a human autosomal centromere. Her laboratory aims to uncover genetic and epigenetic variation of centromeres among human populations and in diseased individuals, develop models of centromere variation, and study their basic biology and function. Additionally, her research includes reconstructing the evolutionary history of centromeres over the last 25 million years using phylogenetic and comparative approaches with human and non-human primate species. She plans to apply her discoveries to design and engineer new centromeres on human artificial chromosomes, which has the potential to revolutionize scientific research and medicine through the design of custom chromosomes and genomes.
A complete human pancreatic cancer genome
bioRxiv (Cold Spring Harbor Laboratory) · 2026-05-06 · 1 citation
Cancer genome sequencing is essential for understanding tumor evolution and advancing precision medicine [1]. However, reference gaps and germline variants obscure detection of small and large somatic variants and methylation in repetitive regions [1–3]. It is common for tumor cells to gain or lose chromosome arms due to somatic structural changes that occur inside highly repetitive satellite DNA sequences in the centromeres [4]. To identify the full spectrum of somatic variants, including complex rearrangements, we construct and curate near-complete, haplotype-resolved assemblies of the most recent common ancestor of an early-passage broadly-consented hypodiploid pancreatic cancer cell line and matched normal tissues. The tumor assembly completely recapitulates all 35 tumor chromosomes observed with karyotyping, with multiple translocation-induced hybrid chromosomes. The hybrid chromosomes contain putative functional dicentric and fused centromeres, nested foldback inversions causing 14 breakpoints with a haplotype switch in a single event, and centromeric satellite tandem duplications up to 136 kbp. Direct comparison of tumor and normal assembly haplotypes uncovers >7,000 variants altering >1 Mbp of sequence in repetitive regions that have been hidden by reference gaps and germline variants. 44% of somatic small variants change representation because they alter germline variants on GRCh38, impacting mutational signatures and kataegis/omikli clusters. Most somatic LINE insertions originate from two hypomethylated non-reference germline LINE insertions, highlighting their impact on insertion mutation burden. These assemblies demonstrate that centromeric, acrocentric, and telomeric regions conventionally excluded from analysis harbor extensive somatic and epigenetic changes. Resolving complete tumor genomes enables a deeper understanding of cancer structural plasticity and the endpoints of breakage-fusion-bridge cycles. These assembled, curated paired normal-tumor benchmarks will serve as a critical foundation for developing future algorithms to characterize the most intractable regions of cancer genomes.
A segmental duplication-mediated deletion leads to neocentromere formation in orangutans
bioRxiv (Cold Spring Harbor Laboratory) · 2026-04-11
Centromeres ensure faithful chromosome segregation, yet how new centromeres arise and replace canonical ones remains poorly understood. Here, we investigate a polymorphic centromere repositioning event on the orangutan chromosome 10 using near-telomere-to-telomere assemblies, epigenetic profiling, and population-scale data. We identify striking heterogeneity in canonical centromeres, ranging from large, higher-order repeat α-satellite arrays to short, monomeric α-satellite tracts, alongside the emergence of neocentromeres lacking α-satellite DNA. We show a segmental duplication-mediated deletion of 3.6 Mbp that removed the higher-order repeat array, promoting centromere repositioning and neocentromere formation. Phylogenetic analyses reveal complex evolutionary dynamics, including introgression and incomplete lineage sorting in orangutan lineages. These findings demonstrate that centromere identity can evolve through structural variation and epigenetic reprogramming, highlighting its remarkable plasticity in primate genomes.
Human de novo mutation rates from a four-generation pedigree reference
Nature · 2025-04-23 · 71 citations
Here, using five complementary short-read and long-read sequencing technologies, we phased and assembled more than 95% of each diploid human genome in a four-generation, twenty-eight-member family (CEPH 1463). We estimate 98–206 de novo mutations (DNMs) per transmission, including 74.5 de novo single-nucleotide variants, 7.4 non-tandem repeat indels, 65.3 de novo indels or structural variants originating from tandem repeats, and 4.4 centromeric DNMs. Among male individuals, we find 12.4 de novo Y chromosome events per generation. Short tandem repeats and variable-number tandem repeats are the most mutable, with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length and sequence identity. We show a strong paternal bias (75–81%) for all forms of germline DNM, yet we estimate that 16% of de novo single-nucleotide variants are postzygotic in origin with no paternal bias, including early germline mosaic mutations. We place all this variation in the context of a high-resolution recombination map (~3.4 kb breakpoint resolution) and find no correlation between meiotic crossover and de novo structural variants. These near-telomere-to-telomere familial genomes provide a truth set to understand the most fundamental processes underlying human genetic variation.
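As a quick sanity check on the figures quoted in this abstract, the per-class averages can be summed and compared with the reported 98–206 range. This is only a reader's sketch: it assumes the four listed classes do not overlap and that the male-only Y chromosome events are counted separately, neither of which the abstract states explicitly. The dictionary keys are shorthand labels introduced here.

```python
# Per-transmission averages quoted in the abstract (labels are shorthand).
dnm_per_transmission = {
    "de novo SNVs": 74.5,
    "non-tandem-repeat indels": 7.4,
    "tandem-repeat indels/SVs": 65.3,
    "centromeric DNMs": 4.4,
}

# Assumption: the four classes are disjoint, so they can simply be added.
total = sum(dnm_per_transmission.values())

# Reported range of DNMs per transmission.
low, high = 98, 206
within_range = low <= total <= high

# Assumption: Y chromosome events (males only) are counted on top of the above.
y_events_per_generation = 12.4
total_with_y = total + y_events_per_generation

print(f"sum of listed classes: {total:.1f}")
print(f"inside reported range: {within_range}")
print(f"with Y chromosome events: {total_with_y:.1f}")
```

Under those assumptions the listed classes add up to about 151.6 events, close to the midpoint of the reported range.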
RAmbler resolves complex repeats in human Chromosomes 8, 19, and X
Genome Research · 2025-03-04 · 3 citations
Repetitive regions in eukaryotic genomes often contain important functional or regulatory elements. Despite significant algorithmic and technological advancements in genome sequencing and assembly over the past three decades, modern de novo assemblers still struggle to accurately reconstruct highly repetitive regions. In this work, we introduce RAmbler (Repeat Assembler), a reference-guided assembler specialized for the assembly of complex repetitive regions exclusively from Pacific Biosciences (PacBio) HiFi reads. RAmbler (1) identifies repetitive regions by detecting unusually high coverage regions after mapping HiFi reads to the draft genome assembly, (2) finds single-copy k-mers from the HiFi reads (i.e., k-mers that are expected to occur only once in the genome), (3) uses the relative location of single-copy k-mers to barcode each HiFi read, (4) clusters HiFi reads based on their shared barcodes, (5) generates contigs by assembling the reads in each cluster, and (6) generates a consensus assembly from the overlap graph of the assembled contigs. Here, we show that RAmbler can reconstruct human centromeres and other complex repeats to a quality comparable to the manually curated Telomere-to-Telomere human genome assembly. Across more than 250 synthetic data sets, RAmbler outperforms hifiasm, LJA, HiCanu, and Verkko across various parameters such as repeat lengths, number of repeats, heterozygosity rates, and depth of sequencing.
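Steps (2) to (4) of the pipeline described above can be illustrated with a toy sketch. This is not RAmbler's code: the function names, the count window `[lo, hi]` used to call a k-mer single-copy, the `min_shared` threshold, and the simple pairwise union-find clustering are all assumptions made for illustration. The barcode here keeps only the order of marker k-mers, whereas the abstract says RAmbler uses their relative locations.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def single_copy_kmers(reads, k, lo, hi):
    """Step 2 (simplified): keep k-mers whose total count across all reads
    lies in [lo, hi], a stand-in for 'seen about once per genome copy'."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    return {km for km, c in counts.items() if lo <= c <= hi}


def barcode(read, k, markers):
    """Step 3 (simplified): the ordered tuple of marker k-mers in a read."""
    return tuple(km for km in kmers(read, k) if km in markers)


def cluster_reads(barcodes, min_shared=2):
    """Step 4 (simplified): merge reads that share at least min_shared
    marker k-mers, using union-find over all read pairs."""
    parent = list(range(len(barcodes)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    sets = [set(b) for b in barcodes]
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if len(sets[i] & sets[j]) >= min_shared:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(sets)):
        groups[find(i)].append(i)
    return sorted(groups.values())
```

The all-pairs comparison is quadratic in the number of reads, which is fine for a toy example but would need an index from marker k-mer to reads at realistic depth.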
bioRxiv (Cold Spring Harbor Laboratory) · 2025-03-02 · 4 citations
KRAB-zinc finger proteins (KZFPs) comprise the largest family of mammalian transcription factors, rapidly evolving within and between species. Most KZFPs repress endogenous retroviruses (ERVs) and other retrotransposons, with KZFP gene numbers correlating with the ERV load across species, suggesting coevolution. How new KZFPs emerge in response to ERV invasions is currently unknown. Using a combination of long-read sequencing technologies and genome assembly, we present a first detailed comparative analysis of young KZFP gene clusters in the mouse lineage, which has undergone recent KZFP gene expansion and ERV infiltration. Detailed annotation of KZFP genes in a cluster on Mus musculus Chromosome 4 revealed parallel expansion and diversification of this locus in different mouse strains (C57BL/6J, 129S1/SvImJ and CAST/EiJ) and species (Mus spretus and Mus pahari). Our data supports a model by which new ERV integrations within young KZFP gene clusters likely promoted recombination events leading to the emergence of new KZFPs that repress them. At the same time, ERVs also increased their numbers by duplication instead of retrotransposition alone, unraveling a new mechanism for ERV enrichment at these loci.
Integrated analysis of the complete sequence of a macaque genome
Nature · 2025-02-26 · 29 citations
Conservation of dichromatin organization along regional centromeres
Cell Genomics · 2025-03-27 · 15 citations
The attachment of the kinetochore to the centromere is essential for genome maintenance, yet the highly repetitive nature of satellite regional centromeres limits our understanding of their chromatin organization. We demonstrate that single-molecule chromatin fiber sequencing (Fiber-seq) can uniquely co-resolve kinetochore and surrounding chromatin architectures along point centromeres, revealing largely homogeneous single-molecule kinetochore occupancy. In contrast, the application of Fiber-seq to regional centromeres exposed marked per-molecule heterogeneity in their chromatin organization. Regional centromere cores uniquely contain a dichotomous chromatin organization (dichromatin) composed of compacted nucleosome arrays punctuated with highly accessible chromatin patches. CENP-B occupancy phases dichromatin to the underlying alpha-satellite repeat within centromere cores but is not necessary for dichromatin formation. Centromere core dichromatin is conserved between humans and primates, including along regional centromeres lacking satellite repeats. Overall, the chromatin organization of regional centromeres is defined by marked per-molecule heterogeneity, buffering kinetochore attachment against sequence and structural variability within regional centromeres.
Complete sequencing of ape genomes
Nature · 2025-04-09 · 119 citations
Consequently, our understanding of the evolution of our species is incomplete. Here we present haplotype-resolved reference genomes and comparative analyses of six ape species: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan and siamang. We achieve chromosome-level contiguity with substantial sequence accuracy (<1 error in 2.7 megabases) and completely sequence 215 gapless chromosomes telomere-to-telomere. We resolve challenging regions, such as the major histocompatibility complex and immunoglobulin loci, to provide in-depth evolutionary insights. Comparative analyses enabled investigations of the evolution and diversity of regions previously uncharacterized or incompletely studied without bias from mapping to the human reference genome. Such regions include newly minted gene families in lineage-specific segmental duplications, centromeric DNA, acrocentric chromosomes and subterminal heterochromatin. This resource serves as a comprehensive baseline for future evolutionary studies of humans and our closest living ape relatives.
bioRxiv (Cold Spring Harbor Laboratory) · 2025-12-25
Human neocentromeres are functional centromeres demarcated by CENP-A nucleosomes that form ectopically at alpha satellite-free loci. How neocentromeres reshape local chromatin and which features of native centromeric chromatin are preserved are unknown. We generated gapless, haplotype-resolved assemblies of native and neocentromeres from three patient-derived cell lines. Integrating CpG methylation, CENP-A profiling, and single-molecule chromatin fiber sequencing, we reveal chromatin features that define the essential centromeric architecture reconstituted during neocentromere establishment. We find that a deletion within the satellite array encompassing the hypo-CpG methylation centromere dip regions (CDRs) led to native centromere inactivation, that neocentromeres harbor CDRs and a dichromatin architecture, recapitulating features of alpha-satellite centromeres, and that LINEs demarcate neocentromere boundaries, implicating transposable elements in restricting CENP-A domain spreading. Moreover, neocentromeric chromatin is incompatible with promoter-like chromatin states, redefining the regulatory landscape within genic regions. Finally, using haplotype-specific chromatin footprinting, we resolve CENP-A nucleosome chromatin architecture of active centromeres.
A complete diploid human genome benchmark for personalized genomics
bioRxiv (Cold Spring Harbor Laboratory) · 2025-09-21 · 20 citations
Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and structurally polymorphic regions of the genome unmapped. Consequently, existing variant benchmarks, generated by the same methods, fail to assess these complex regions. To address this limitation, we present a telomere-to-telomere genome benchmark that achieves near-perfect accuracy (i.e. no detectable errors) across 99.4% of the complete, diploid HG002 genome. This benchmark adds 701.4 Mb of autosomal sequence and both sex chromosomes (216.8 Mb), totaling 15.3% of the genome that was absent from prior benchmarks. We also provide a diploid annotation of genes, transposable elements, segmental duplications, and satellite repeats, including 39,144 protein-coding genes across both haplotypes. To facilitate application of the benchmark, we developed tools for measuring the accuracy of sequencing reads, phased variant call sets, and genome assemblies against a diploid reference. Genome-wide analyses show that state-of-the-art de novo assembly methods resolve 2-7% more sequence and outperform variant calling accuracy by an order of magnitude, yielding just one error per 100 kb across 99.9% of the benchmark regions. Adoption of genome-based benchmarking is expected to accelerate the development of cost-effective methods for complete genome sequencing, expanding the reach of genomic medicine to the entire genome and enabling a new era of personalized genomics.
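The size figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 15.3% is measured against the full diploid genome, which the abstract implies but does not spell out; the variable names are introduced here. The last step converts "one error per 100 kb" into a Phred-scaled quality value using the standard definition QV = -10 · log10(error rate).

```python
import math

# Figures quoted in the abstract.
added_autosomal_mb = 701.4      # newly benchmarked autosomal sequence
added_sex_chrom_mb = 216.8      # both sex chromosomes
added_fraction = 0.153          # "15.3% of the genome"
errors_per_base = 1 / 100_000   # "one error per 100 kb"

added_total_mb = added_autosomal_mb + added_sex_chrom_mb

# Assumption: the 15.3% is taken over the complete diploid genome.
implied_genome_mb = added_total_mb / added_fraction

# Phred-scaled quality value for the quoted error rate.
assembly_qv = -10 * math.log10(errors_per_base)

print(f"newly benchmarked sequence: {added_total_mb:.1f} Mb")
print(f"implied diploid genome size: {implied_genome_mb / 1000:.2f} Gb")
print(f"quality value: QV{assembly_qv:.0f}")
```

Under that assumption the numbers imply a diploid genome of roughly 6 Gb, and the quoted error rate corresponds to QV50.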
Centromere Sequence, Variation, and Function
NIH · $195k · 2019–2022
Evan E. Eichler
University of Washington
Adam M. Phillippy
National Human Genome Research Institute
Sergey Koren
National Human Genome Research Institute
Arang Rhie
National Human Genome Research Institute
David Porubský
University of Washington
Ph.D., Biochemistry and Molecular Biophysics
University of Pennsylvania Perelman School of Medicine
B.A., Biochemistry
University of Pennsylvania
Sergey Nurk
Oxford Nanopore Technologies (United Kingdom)
Mitchell R. Vollger
University of Washington Medical Center
Katherine M. Munson
University of Washington