Matteo Riondato

Visiting Scientist in Computer Science · Verified

Brown University · Computer Science

Active 2010–2026

h-index: 24

Citations: 2.1k

Papers: 77 (23 in the last 5 years)

Funding: $857k (1 active grant)



About

Matteo Riondato is an associate professor of computer science at Amherst College, where he leads the Data* Mammoths, a research and learning group composed of undergraduate students. He also serves as the founding director of the Data Science Initiative at Amherst. In addition to his role at Amherst College, he holds an appointment as visiting faculty in Computer Science at Brown University, where he advises PhD students. Prior to his academic positions, he worked as a research scientist in the Labs group at Two Sigma. His research focuses on algorithms for knowledge discovery, data mining, and machine learning. He develops theory and methods aimed at extracting the most information from large datasets as quickly as possible while maintaining statistical soundness. The problems he studies include pattern extraction, graph mining, and time series analysis. His algorithms often incorporate concepts from statistical learning theory and sampling. His research has received support from the National Science Foundation, including an NSF CAREER Award and other NSF awards. Matteo Riondato's academic lineage includes notable mathematicians such as Eli Upfal, Eli Shamir, Jacques Hadamard, Siméon Denis Poisson, and Pierre-Simon Laplace.

Research topics

Computer science
Algorithm
Data mining
Theoretical computer science
Mathematics

Selected publications

DSP: A Statistically-Principled Structural Polarization Measure
2026-02-16
article · Open access
Social and information networks may become polarized, leading to echo chambers and political gridlock. Accurately measuring this phenomenon is a critical challenge. Existing measures often conflate genuine structural division with random topological features, yielding misleadingly high polarization scores on random networks, and failing to distinguish real-world networks from randomized null models. We introduce DSP, a Diffusion-based Structural Polarization measure designed from first principles to correct for such biases. DSP removes the arbitrary concept of 'influencers' used by the popular Random Walk Controversy (RWC) score, instead treating every node as a potential origin for a random walk. To validate our approach, we introduce a set of desirable properties for polarization measures, expressed through reference topologies with known structural properties. We show that DSP satisfies these desiderata, being near-zero for non-polarized structures such as cliques and random networks, while correctly capturing the expected polarization of reference topologies such as monochromatic-splittable networks. Our method applied to U.S. Congress datasets uncovers trends of increasing polarization in recent years. By integrating a null model into its core definition, DSP provides a reliable and interpretable diagnostic tool, highlighting the necessity of statistically-grounded metrics to analyze societal fragmentation.
Source Code and Replication Data for: HomeRun: Performing Curveball Trades quasi in Streaming for Fast Null Modeling of Graphs, Hypergraphs, and Binary Matrices
Harvard Dataverse · 2026-01-20
dataset · Open access · 1st author · Corresponding
See the README.md file.
HomeRun: Performing Curveball Trades quasi in Streaming for Fast Null Modeling of Graphs, Hypergraphs, and Binary Matrices
2026-04-12
article · Open access · Senior author
DSP: A Statistically-Principled Structural Polarization Measure
ArXiv.org · 2025-12-03
preprint · Open access
Source Code and Replication Data for: VaLUH: Fast Algorithms for the Configuration Model of Vertex-Labeled Undirected Hypergraphs
Harvard Dataverse · 2025-12-03
dataset · Open access · 1st author · Corresponding
See the README.md file.
DiNgHy: Null Models for Non-degenerate Directed Hypergraphs
Lecture notes in computer science · 2025-10-03
article
Polaris: Sampling from the Multigraph Configuration Model with Prescribed Color Assortativity
2025-02-26 · 2 citations
article
ClaveNet: Generating Afro-Cuban Drum Patterns through Data Augmentation
2024-09-11
article · Open access · Senior author
We present ClaveNet: a generative MIDI model for Afro-Cuban percussion. We adapt the Monotonic Groove Transformer (MGT) —originally trained on the Groove MIDI Dataset (GMD)— to generate Afro-Cuban-influenced MIDI drum grooves. As Afro-Cuban drum MIDI data is scarce in the GMD and overall, we devise a data augmentation scheme to enrich MIDI percussion datasets with Afro-Cuban-inspired drum grooves by mixing examples with “seed patterns” rudimentary to Afro-Cuban percussion. To validate the effectiveness of our data augmentation algorithm at creating drum grooves infused with Afro-Cuban patterns, we trained MGT models on variants of the Groove MIDI Dataset augmented with our algorithm, and compared them to a baseline model trained on a non-augmented dataset. Our results show that MGT models trained with our augmented datasets are able to generate drum grooves whose rhythmic features are cumulatively closer to those from an evaluation set of real Afro-Cuban examples. We explore the effects of different hyperparameters to our system, discuss individual generated samples of selected models, and assess their faithfulness to Afro-Cuban styles. We hope this project fosters more research on developing music co-creation systems that encompass diverse musical styles outside those found in publicly available datasets.
Polaris: Sampling from the Multigraph Configuration Model with Prescribed Color Assortativity
arXiv (Cornell University) · 2024-09-02
preprint · Open access
We introduce Polaris, a network null model for colored multi-graphs that preserves the Joint Color Matrix. Polaris is specifically designed for studying network polarization, where vertices belong to a side in a debate or a partisan group, represented by a vertex color, and relations have different strengths, represented by an integer-valued edge multiplicity. The key feature of Polaris is preserving the Joint Color Matrix (JCM) of the multigraph, which specifies the number of edges connecting vertices of any two given colors. The JCM is the basic property that determines color assortativity, a fundamental aspect in studying homophily and segregation in polarized networks. By using Polaris, network scientists can test whether a phenomenon is entirely explained by the JCM of the observed network or whether other phenomena might be at play. Technically, our null model is an extension of the configuration model: an ensemble of colored multigraphs characterized by the same degree sequence and the same JCM. To sample from this ensemble, we develop a suite of Markov Chain Monte Carlo algorithms, collectively named Polaris-*. It includes Polaris-B, an adaptation of a generic Metropolis-Hastings algorithm, and Polaris-C, a faster, specialized algorithm with higher acceptance probabilities. This new null model and the associated algorithms provide a more nuanced toolset for examining polarization in social networks, thus enabling statistically sound conclusions.
Impossibility result for Markov chain Monte Carlo sampling from microcanonical bipartite graph ensembles
Physical review. E · 2024-05-13 · 2 citations
article · Senior author
Markov Chain Monte Carlo (MCMC) algorithms are commonly used to sample from graph ensembles. Two graphs are neighbors in the state space if one can be obtained from the other with only a few modifications, e.g., edge rewirings. For many common ensembles, e.g., those preserving the degree sequences of bipartite graphs, rewiring operations involving two edges are sufficient to create a fully connected state space, and they can be performed efficiently. We show that, for ensembles of bipartite graphs with fixed degree sequences and number of butterflies (K_{2,2} bicliques), there is no universal constant c such that a rewiring of at most c edges at every step is sufficient for any such ensemble to be fully connected. Our proof relies on an explicit construction of a family of pairs of graphs with the same degree sequences and number of butterflies, with each pair indexed by a natural number c, and such that any sequence of rewiring operations transforming one graph into the other must include at least one rewiring operation involving at least c edges. Whether rewiring this many edges is sufficient to guarantee the full connectivity of the state space of any such ensemble remains an open question. Our result implies the impossibility of developing efficient, graph-agnostic, MCMC algorithms for these ensembles, as the necessity to rewire an impractically large number of edges may hinder taking a step on the state space.

Recent grants

CAREER: Statistically-Sound Knowledge Discovery from Data
NSF · $483k · 2023–2028
III: Small: RUI: Scalable and Iterative Statistical Testing of Multiple Hypotheses on Massive Datasets
NSF · $373k · 2020–2024

Frequent coauthors

Eli Upfal
61 shared
Mert Akdere
Brown University
14 shared
Fabio Vandin
University of Padua
14 shared
Uğur Çetintemel
14 shared
Stanley B. Zdonik
Brown University
12 shared
Cyrus Cousins
10 shared
Gianmarco De Francisci Morales
10 shared
Giulia Preti
Centre d'Imagerie BioMedicale
8 shared

Labs

Data* Mammoths · PI
Research on Algorithms for Knowledge Discovery, Data Mining, and Machine Learning

Education

Ph.D., Computer Science
Brown University
2014
Sc.M., Computer Science
Brown University
2010
Laurea Specialistica (M.Sc.), Information Engineering
Università degli Studi di Padova
2009
Laurea (B.Sc.), Information Engineering
Università degli Studi di Padova
2007
