Matteo Riondato
Visiting Scientist in Computer Science · Verified · Brown University · Computer Science
Active 2010–2026
About
Matteo Riondato is an associate professor of computer science at Amherst College, where he leads the Data* Mammoths, a research and learning group composed of undergraduate students. He also serves as the founding director of the Data Science Initiative at Amherst. In addition to his role at Amherst College, he holds an appointment as visiting faculty in Computer Science at Brown University, where he advises PhD students. Prior to his academic positions, he worked as a research scientist in the Labs group at Two Sigma. His research focuses on algorithms for knowledge discovery, data mining, and machine learning. He develops theory and methods aimed at extracting the most information from large datasets as quickly as possible while maintaining statistical soundness. The problems he studies include pattern extraction, graph mining, and time series analysis. His algorithms often incorporate concepts from statistical learning theory and sampling. His research has received support from the National Science Foundation, including an NSF CAREER Award and other NSF awards. Matteo Riondato's academic lineage includes notable mathematicians such as Eli Upfal, Eli Shamir, Jacques Hadamard, Siméon Denis Poisson, and Pierre-Simon Laplace.
Research topics
- Computer science
- Algorithm
- Data mining
- Theoretical computer science
- Mathematics
Selected publications
DSP: A Statistically-Principled Structural Polarization Measure
2026-02-16
article · Open access
Social and information networks may become polarized, leading to echo chambers and political gridlock. Accurately measuring this phenomenon is a critical challenge. Existing measures often conflate genuine structural division with random topological features, yielding misleadingly high polarization scores on random networks, and failing to distinguish real-world networks from randomized null models. We introduce DSP, a Diffusion-based Structural Polarization measure designed from first principles to correct for such biases. DSP removes the arbitrary concept of 'influencers' used by the popular Random Walk Controversy (RWC) score, instead treating every node as a potential origin for a random walk. To validate our approach, we introduce a set of desirable properties for polarization measures, expressed through reference topologies with known structural properties. We show that DSP satisfies these desiderata, being near-zero for non-polarized structures such as cliques and random networks, while correctly capturing the expected polarization of reference topologies such as monochromatic-splittable networks. Our method applied to U.S. Congress datasets uncovers trends of increasing polarization in recent years. By integrating a null model into its core definition, DSP provides a reliable and interpretable diagnostic tool, highlighting the necessity of statistically-grounded metrics to analyze societal fragmentation.
Harvard Dataverse · 2026-01-20
dataset · Open access · 1st author · Corresponding
See the README.md file.
2026-04-12
article · Open access · Senior author
DSP: A Statistically-Principled Structural Polarization Measure
ArXiv.org · 2025-12-03
preprint · Open access
Harvard Dataverse · 2025-12-03
dataset · Open access · 1st author · Corresponding
See the README.md file.
DiNgHy: Null Models for Non-degenerate Directed Hypergraphs
Lecture notes in computer science · 2025-10-03
article
2025-02-26 · 2 citations
article
ClaveNet: Generating Afro-Cuban Drum Patterns through Data Augmentation
2024-09-11
article · Open access · Senior author
We present ClaveNet: a generative MIDI model for Afro-Cuban percussion. We adapt the Monotonic Groove Transformer (MGT), originally trained on the Groove MIDI Dataset (GMD), to generate Afro-Cuban-influenced MIDI drum grooves. As Afro-Cuban drum MIDI data is scarce in the GMD and overall, we devise a data augmentation scheme to enrich MIDI percussion datasets with Afro-Cuban-inspired drum grooves by mixing examples with “seed patterns” rudimentary to Afro-Cuban percussion. To validate the effectiveness of our data augmentation algorithm at creating drum grooves infused with Afro-Cuban patterns, we trained MGT models on variants of the Groove MIDI Dataset augmented with our algorithm, and compared them to a baseline model trained on a non-augmented dataset. Our results show that MGT models trained with our augmented datasets are able to generate drum grooves whose rhythmic features are cumulatively closer to those from an evaluation set of real Afro-Cuban examples. We explore the effects of different hyperparameters to our system, discuss individual generated samples of selected models, and assess their faithfulness to Afro-Cuban styles. We hope this project fosters more research on developing music co-creation systems that encompass diverse musical styles outside those found in publicly available datasets.
Polaris: Sampling from the Multigraph Configuration Model with Prescribed Color Assortativity
arXiv (Cornell University) · 2024-09-02
preprint · Open access
We introduce Polaris, a network null model for colored multigraphs that preserves the Joint Color Matrix. Polaris is specifically designed for studying network polarization, where vertices belong to a side in a debate or a partisan group, represented by a vertex color, and relations have different strengths, represented by an integer-valued edge multiplicity. The key feature of Polaris is preserving the Joint Color Matrix (JCM) of the multigraph, which specifies the number of edges connecting vertices of any two given colors. The JCM is the basic property that determines color assortativity, a fundamental aspect in studying homophily and segregation in polarized networks. By using Polaris, network scientists can test whether a phenomenon is entirely explained by the JCM of the observed network or whether other phenomena might be at play. Technically, our null model is an extension of the configuration model: an ensemble of colored multigraphs characterized by the same degree sequence and the same JCM. To sample from this ensemble, we develop a suite of Markov Chain Monte Carlo algorithms, collectively named Polaris-*. It includes Polaris-B, an adaptation of a generic Metropolis-Hastings algorithm, and Polaris-C, a faster, specialized algorithm with higher acceptance probabilities. This new null model and the associated algorithms provide a more nuanced toolset for examining polarization in social networks, thus enabling statistically sound conclusions.
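As a rough illustration of the Joint Color Matrix described in this abstract, the sketch below counts, for each unordered pair of colors, the total multiplicity of edges joining vertices of those colors. It is not the authors' code: the function name `joint_color_matrix`, the edge-triple input format, and the choice to count each monochromatic edge once are all assumptions made for this example.

```python
from collections import Counter

def joint_color_matrix(edges, color):
    """Tally edge multiplicity per unordered color pair (hypothetical helper).

    edges: iterable of (u, v, multiplicity) triples of an undirected multigraph.
    color: dict mapping each vertex to its color label.
    """
    jcm = Counter()
    for u, v, mult in edges:
        # Sort the two colors so (red, blue) and (blue, red) share one entry.
        pair = tuple(sorted((color[u], color[v])))
        jcm[pair] += mult
    return jcm

# Toy multigraph with two colors; the data is made up for illustration.
edges = [("a", "b", 2), ("b", "c", 1), ("c", "d", 3), ("a", "d", 1)]
color = {"a": "red", "b": "red", "c": "blue", "d": "blue"}
jcm = joint_color_matrix(edges, color)
```

A null model in the spirit of the abstract would only accept rewirings that leave both the degree sequence and this tally unchanged.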
Physical review. E · 2024-05-13 · 2 citations
article · Senior author
Markov Chain Monte Carlo (MCMC) algorithms are commonly used to sample from graph ensembles. Two graphs are neighbors in the state space if one can be obtained from the other with only a few modifications, e.g., edge rewirings. For many common ensembles, e.g., those preserving the degree sequences of bipartite graphs, rewiring operations involving two edges are sufficient to create a fully connected state space, and they can be performed efficiently. We show that, for ensembles of bipartite graphs with fixed degree sequences and number of butterflies (K_{2,2} bicliques), there is no universal constant c such that a rewiring of at most c edges at every step is sufficient for any such ensemble to be fully connected. Our proof relies on an explicit construction of a family of pairs of graphs with the same degree sequences and number of butterflies, with each pair indexed by a natural number c, and such that any sequence of rewiring operations transforming one graph into the other must include at least one rewiring operation involving at least c edges. Whether rewiring this many edges is sufficient to guarantee the full connectivity of the state space of any such ensemble remains an open question. Our result implies the impossibility of developing efficient, graph-agnostic, MCMC algorithms for these ensembles, as the necessity to rewire an impractically large number of edges may hinder taking a step on the state space.
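To make the setting of this abstract concrete, the sketch below shows a standard two-edge rewiring on a small bipartite graph and counts butterflies before and after. It only illustrates why such a swap preserves degrees yet can change the butterfly count; it is not the construction used in the paper. The helper names (`count_butterflies`, `two_edge_swap`, `degrees`) and the toy graph are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

def count_butterflies(edges):
    """Count K_{2,2} bicliques in a simple bipartite graph of (left, right) pairs."""
    nbrs = {}
    for left, right in edges:
        nbrs.setdefault(left, set()).add(right)
    total = 0
    for a, b in combinations(sorted(nbrs), 2):
        shared = len(nbrs[a] & nbrs[b])
        # Every pair of shared right-neighbors closes one butterfly.
        total += shared * (shared - 1) // 2
    return total

def two_edge_swap(edges, e1, e2):
    """Replace (a, x), (b, y) with (a, y), (b, x); reject invalid swaps."""
    edges = set(edges)
    if e1 not in edges or e2 not in edges:
        raise ValueError("both edges must be in the graph")
    (a, x), (b, y) = e1, e2
    new = (edges - {e1, e2}) | {(a, y), (b, x)}
    if len(new) != len(edges):
        raise ValueError("swap would create a duplicate edge")
    return new

def degrees(edges):
    """Degree of every vertex (left and right labels are assumed distinct)."""
    return Counter(v for edge in edges for v in edge)

# Toy graph: l1 and l2 share r1 and r2, forming exactly one butterfly.
g = {("l1", "r1"), ("l1", "r2"), ("l2", "r1"), ("l2", "r2"), ("l3", "r3")}
h = two_edge_swap(g, ("l2", "r2"), ("l3", "r3"))
before, after = count_butterflies(g), count_butterflies(h)
```

Here the swap keeps every degree fixed but destroys the only butterfly, so a sampler for the butterfly-preserving ensemble would have to reject it.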
Recent grants
CAREER: Statistically-Sound Knowledge Discovery from Data
NSF · $483k · 2023–2028
NSF · $373k · 2020–2024
Frequent coauthors
- Eli Upfal · 61 shared
- Mert Akdere, Brown University · 14 shared
- Fabio Vandin, University of Padua · 14 shared
- Uğur Çetintemel · 14 shared
- Stanley B. Zdonik, Brown University · 12 shared
- Cyrus Cousins · 10 shared
- Gianmarco De Francisci Morales · 10 shared
- Giulia Preti, Centre d'Imagerie BioMedicale · 8 shared
Labs
Research on Algorithms for Knowledge Discovery, Data Mining, and Machine Learning
Education
- 2014 · Ph.D., Computer Science · Brown University
- 2010 · Sc.M., Computer Science · Brown University
- 2009 · Laurea Specialistica (M.Sc.), Information Engineering · Università degli Studi di Padova
- 2007 · Laurea (B.Sc.), Information Engineering · Università degli Studi di Padova