Resume-aware faculty matching

Find professors who actually fit you

Upload your resume. Four AI agents analyze your background, rank the faculty who fit, inspect their recent research, and help you draft outreach — grounded in their actual work, not templates.

Free to startNo credit cardCancel anytime
Top matches Balanced preset
Dr. Sarah Chen
Stanford · Interpretability · NLP
91
Dr. Marcus Holloway
MIT · Robotics · RL
84
Dr. Aisha Okonkwo
CMU · Fairness · HCI
82
Nova · Professor Researcher · re-ranking top 20…
Olgica  Milenkovic

Olgica Milenkovic

· Professor, Electrical and Computer EngineeringVerified

University of Illinois Urbana-Champaign · Computer Science

Active 1996–2026

h-index44
Citations8.8k
Papers447111 last 5y
Funding$6.5M
See your match with Olgica Milenkovic — sign in to PhdFit.Sign in

About

Olgica Milenkovic is a Professor in the Department of Electrical and Computer Engineering at the University of Illinois Urbana-Champaign. She holds a PhD in Electrical and Computer Engineering and an MSc in Mathematics from the University of Michigan, Ann Arbor. Her research focuses on developing new approaches for studying problems in bioinformatics and bioengineering using coding and information theory, including the design of DNA microarrays with error- and quality-control features and DNA microarrays utilizing compressed sensing principles. She investigates the connections between compressed sensing, superimposed coding, and non-linear compressive sensing with quantization and fault-tolerant algorithms, applying these methods to problems such as RNA folding, gene-regulatory network reverse engineering, and genome reversal distances. Additionally, her work involves constructing and analyzing codes on graphs, studying the combinatorial properties of low-density parity-check codes, and exploring the relationships between network coding, matroid theory, and algebraic coding theory. Milenkovic is also engaged in analyzing the average case complexity of algorithms in coding theory and computer algebra.

Research topics

  • Computer Science
  • Machine Learning
  • Data Mining
  • Mathematics
  • Data science
  • Human–computer interaction
  • Biology
  • Engineering
  • Genetics
  • Econometrics
  • Computational biology
  • Statistics

Selected publications

  • Spatially-Coupled Network RNA Velocities: A Control-Theoretic Perspective

    arXiv (Cornell University) · 2026-01-03

    preprintOpen accessSenior author

    RNA velocity is an important model that combines cellular spliced and unspliced RNA counts to infer dynamical properties of various regulatory functions. Despite its wide applicability and many variants used in practice, the model has not been adequately designed to directly account for both intracellular gene regulatory network interactions and spatial intercellular communications. Here, we propose a new RNA velocity approach that jointly and directly captures two new network structures: an intracellular gene regulatory network (GRN) and an intercellular interaction network that captures interactions between (neighboring) cells, with relevance to spatial transcriptomics. We theoretically analyze this two-level network system through the lens of control and consensus theory. In particular, we investigate network equilibria, stability, cellular network consensus, and optimal control approaches for targeted drug intervention.

  • Correcting Contextual Deletions in DNA Nanopore Readouts

    arXiv (Cornell University) · 2026-02-04

    articleOpen access

    The problem of designing codes for deletion-correction and synchronization has received renewed interest due to applications in DNA-based data storage systems that use nanopore sequencers as readout platforms. In almost all instances, deletions are assumed to be imposed independently of each other and of the sequence context. These assumptions are not valid in practice, since nanopore errors tend to occur within specific contexts. We study contextual nanopore deletion-errors through the example setting of deterministic single deletions following (complete) runlengths of length at least $k$. The model critically depends on the runlength threshold $k$, and we examine two regimes for $k$: a) $k=C\log n$ for a constant $C\in(0,1)$; in this case, we study error-correcting codes that can protect from a constant number $t$ of contextual deletions, and show that the minimum redundancy (ignoring lower-order terms) is between $(1-C)t\log n$ and $2(1-C)t\log n$, meaning that it is a ($1-C$)-fraction of that of arbitrary $t$-deletion-correcting codes. To complement our non-constructive redundancy upper bound, we design efficiently and encodable and decodable codes for any constant $t$. In particular, for $t=1$ and $C>1/2$ we construct efficient codes with redundancy that essentially matches our non-constructive upper bound; b) $k$ equal a constant; in this case we consider the extremal problem where the number of deletions is not bounded and a deletion is imposed after every run of length at least $k$, which we call the extremal contextual deletion channel. This combinatorial setting arises naturally by considering a probabilistic channel that introduces contextual deletions after each run of length at least $k$ with probability $p$ and taking the limit $p\to 1$. We obtain sharp bounds on the maximum achievable rate under the extremal contextual deletion channel for arbitrary constant $k$.

  • Spatially-Coupled Network RNA Velocities: A Control-Theoretic Perspective

    ArXiv.org · 2026-01-03

    articleOpen accessSenior author

    RNA velocity is an important model that combines cellular spliced and unspliced RNA counts to infer dynamical properties of various regulatory functions. Despite its wide applicability and many variants used in practice, the model has not been adequately designed to directly account for both intracellular gene regulatory network interactions and spatial intercellular communications. Here, we propose a new RNA velocity approach that jointly and directly captures two new network structures: an intracellular gene regulatory network (GRN) and an intercellular interaction network that captures interactions between (neighboring) cells, with relevance to spatial transcriptomics. We theoretically analyze this two-level network system through the lens of control and consensus theory. In particular, we investigate network equilibria, stability, cellular network consensus, and optimal control approaches for targeted drug intervention.

  • Correcting Contextual Deletions in DNA Nanopore Readouts

    Open MIND · 2026-02-04

    preprint

    The problem of designing codes for deletion-correction and synchronization has received renewed interest due to applications in DNA-based data storage systems that use nanopore sequencers as readout platforms. In almost all instances, deletions are assumed to be imposed independently of each other and of the sequence context. These assumptions are not valid in practice, since nanopore errors tend to occur within specific contexts. We study contextual nanopore deletion-errors through the example setting of deterministic single deletions following (complete) runlengths of length at least $k$. The model critically depends on the runlength threshold $k$, and we examine two regimes for $k$: a) $k=C\log n$ for a constant $C\in(0,1)$; in this case, we study error-correcting codes that can protect from a constant number $t$ of contextual deletions, and show that the minimum redundancy (ignoring lower-order terms) is between $(1-C)t\log n$ and $2(1-C)t\log n$, meaning that it is a ($1-C$)-fraction of that of arbitrary $t$-deletion-correcting codes. To complement our non-constructive redundancy upper bound, we design efficiently and encodable and decodable codes for any constant $t$. In particular, for $t=1$ and $C>1/2$ we construct efficient codes with redundancy that essentially matches our non-constructive upper bound; b) $k$ equal a constant; in this case we consider the extremal problem where the number of deletions is not bounded and a deletion is imposed after every run of length at least $k$, which we call the extremal contextual deletion channel. This combinatorial setting arises naturally by considering a probabilistic channel that introduces contextual deletions after each run of length at least $k$ with probability $p$ and taking the limit $p\to 1$. We obtain sharp bounds on the maximum achievable rate under the extremal contextual deletion channel for arbitrary constant $k$.

  • Interplay between cohesin and RNA polymerase II in regulating chromatin interactions and gene transcription

    Nature Structural & Molecular Biology · 2026-01-13 · 3 citations

    article
  • Fragmentation in Data Deduplication Systems II: The Jump Metric

    2025-06-22

    articleSenior author

    Data deduplication refers to a collection of data processing strategies that aim to remove repeated data chunks stored by different users. Despite providing excellent storage savings, deduplication can lead to severe file fragmentation issues: data chunks of the same file may be stored at distal locations on the server, making reconstruction time-consuming. Here, we continue our analytical study of uncoded and coded deduplication methods with reduced fragmentation levels. We model files as self-avoiding (simple) paths in specialized graphs whose nodes correspond to data chunks. To measure the level of fragmentation, we introduce the jump metric which captures the worst-case number of times during the reconstruction process of a file that one has to change the readout location on the server. We derive lower and upper bounds on the degree of jump fragmentation, and provide a new algorithm for computing the jump number of hierarchical data structures captured by trees. We also present examples that show how repetition and coded redundancy in chunk stores can reduce jump fragmentation.

  • BPS2025 - Robust RNA tails detection technique using solid-state nanopores for high-density DNA data storage

    Biophysical Journal · 2025-02-01

    article
  • Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

    ArXiv.org · 2025-06-06

    preprintOpen access

    Machine unlearning techniques aim to mitigate unintended memorization in large language models (LLMs). However, existing approaches predominantly focus on the explicit removal of isolated facts, often overlooking latent inferential dependencies and the non-deterministic nature of knowledge within LLMs. Consequently, facts presumed forgotten may persist implicitly through correlated information. To address these challenges, we propose a knowledge unlearning evaluation framework that more accurately captures the implicit structure of real-world knowledge by representing relevant factual contexts as knowledge graphs with associated confidence scores. We further develop an inference-based evaluation protocol leveraging powerful LLMs as judges; these judges reason over the extracted knowledge subgraph to determine unlearning success. Our LLM judges utilize carefully designed prompts and are calibrated against human evaluations to ensure their trustworthiness and stability. Extensive experiments on our newly constructed benchmark demonstrate that our framework provides a more realistic and rigorous assessment of unlearning performance. Moreover, our findings reveal that current evaluation strategies tend to overestimate unlearning effectiveness. Our code is publicly available at https://github.com/Graph-COM/Knowledge_Unlearning.git.

  • Generalized Orthogonal De Bruijn Sequences

    2025-06-22 · 1 citations

    articleSenior author

    A de Bruijn sequence of order <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$k$</tex> over a finite alphabet is a cyclic sequence with the property that it contains every possible <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$k$</tex>-sequence as a substring exactly once. Orthogonal de Bruijn sequences are collections of de Bruijn sequences of the same order, <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$k$</tex>, satisfying the joint constraint that every (<tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$k+1$</tex>) sequence appears as a substring in at most one of the sequences in the collection. Both de Bruijn and orthogonal de Bruijn sequences have found numerous applications in synthetic biology, although the latter topic remains largely unexplored in the coding theory literature. Here we study three relevant practical generalizations of orthogonal de Bruijn sequences where we relax either the constraint that every (<tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$k+1$</tex>) -sequence appears exactly once, or that the sequences themselves are de Bruijn rather than balanced de Bruijn sequences. We also provide lower and upper bounds on the number of fixed-weight orthogonal de Bruijn sequences.

  • Reducing Fragmentation in Data Deduplication Systems via Partial Repetition and Coding

    IEEE Transactions on Information Theory · 2025-09-23

    articleSenior author

    Data deduplication, one of the key features of modern Big Data storage devices, is the process of removing replicas of data chunks stored by different users. Despite the importance of deduplication, several drawbacks of the method, such as storage robustness and file fragmentation, have not been previously analyzed from a theoretical point of view. Storage robustness pertains to ensuring that deduplicated data can be used to reconstruct the original files without service disruptions and data loss. Fragmentation pertains to the problems of placing deduplicated data chunks of different user files in a proximity-preserving linear order, since neighboring chunks of the same file may be stored in sectors far apart on the server. This work proposes a new theoretical model for data fragmentation and introduces novel graph- and coding-theoretic approaches for reducing fragmentation via limited duplication (repetition coding) and coded deduplication (e.g., linear coding). In addition to alleviating issues with fragmentation, limited duplication and coded deduplication can also serve the dual purpose of increasing the robusteness of the system design. The contributions of our work are three-fold. First, we describe a new model for file structures of the form of self-avoiding (simple) paths in specialized graphs. Second, we introduce several new metrics for measuring the fragmentation level in deduplication systems on graph-structured files, including the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">stretch metric</i> that captures the worst-case “spread” of adjacent data chunks within a file when deduplicated and placed on the server; and, the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">jump metric</i> that captures the worst-case number of times during the reconstruction process of a file that one has to change the readout location on the server. For the stretch metric, we establish a connection between the level of fragmentation and the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">bandwidth</i> of the file-graph. In particular, we derive lower and upper bounds on the degree of fragmentation and describe instances of the problem where repetition and coding reduce fragmentation. The key ideas behind our approach are graph folding and information-theoretic arguments coupled with graph algorithms such as matching. For the jump metric, we provide a new algorithm for computing the jump number of hierarchical data structures captured by trees. Third, we describe how controlled repetition and coded redundancy added after deduplication can ensure valuable trade-offs between the storage volume and the degree of fragmentation.

Recent grants

Frequent coauthors

  • Ryan Gabrys

    University of California, San Diego

    75 shared
  • Farzad Farnoud

    42 shared
  • Eli Chien

    40 shared
  • Chao Pan

    Institute of Plasma Physics

    37 shared
  • Han Mao Kiah

    36 shared
  • Gregory J. Puleo

    32 shared
  • Son Hoang Dau

    29 shared
  • Jianhao Peng

    University of Illinois Urbana-Champaign

    27 shared

Education

  • Ph.D., Computer Science

    University of Belgrade

    1994
  • M.S., Computer Science

    University of Belgrade

    1991
  • B.S., Computer Science

    University of Belgrade

    1988

Awards & honors

  • NVMW Persistent Impact Prize, 2022, Portable and Error-Free…
  • ECE Alumni Distinguished Educator Award, University of Michi…
  • Canadian Workshop on Information Theory, Halifax, Canada, 20…
  • Learning and Signal Processing Methods for Emerging Multiomi…
  • Keynote speaker, 2023 Biennial Symposium on Communications,…
  • Resume-aware match score
  • Save to shortlist
  • AI-drafted outreach

See your match with Olgica Milenkovic

PhdFit ranks faculty by your research interests, methods, and publications — grounded in their actual work, not templates.

  • Free to start
  • No credit card
  • 30-second signup