Erin Conlon

Professor; Director, Stat MS Graduate Admissions (Charles River Campus, Newton) · Verified

University of Massachusetts Amherst · Mathematics and Statistics

Active 1996–2021

h-index: 17
Citations: 2.7k
Papers402 last 5y
Funding: $81k
About

Erin Conlon is a professor in the Department of Mathematics and Statistics at the University of Massachusetts Amherst, affiliated with the Newton Mount Ida Campus in the Boston area. She is involved in the Statistics Graduate Program, which offers a flexible Master's degree in Statistics with in-person evening classes and remote learning options; the degree is awarded by the University of Massachusetts Amherst. She is also connected to the Boston-Area Data Science Certificate, a joint offering between Statistics and Computer Science that can be completed fully online.

Professor Conlon holds a Ph.D. and an M.S. in Biostatistics from the University of Minnesota and a B.S. in Mathematics from the University of Wisconsin, Madison. Her professional history includes postdoctoral fellowships in the Department of Statistics at Harvard University and in Statistical Genetics at the University of Washington, Seattle, as well as a visiting scholar position in Functional Genomics at the Institute for Pure and Applied Mathematics at UCLA.

Her research focuses on developing Bayesian statistical methods for data science, big data, and analytics, an area in which she collaborates with researchers such as Xiaojing Wang, Zheng Wei, and Alexey Miroshnikov. Her interests also extend to statistical methods in genomics and bioinformatics, including gene expression and DNA sequence analysis, Bayesian models for genomic data, and comparative genomics.

Current work includes systems-biology approaches to studying regulatory and metabolic networks of microbes, in collaboration with Kristen DeAngelis's lab, and statistical and bioinformatic methods for breast cancer gene expression studies with Joseph Jerry's lab. Other collaborative projects involve microbial organisms such as Prochlorococcus marinus, Geobacter, and Bacillus subtilis, working with researchers Jeffrey Blanchard, Derek Lovley, and Richard Losick. She has also developed software, including the R package parallelMCMCcombine, which supports Bayesian methods for big data and analytics.

Research topics

  • Computer Science
  • Mathematics
  • Statistics
  • Data Mining
  • Machine Learning
  • Artificial Intelligence
  • Econometrics
  • Algorithm
  • Applied mathematics
  • Economics

Selected publications

  • A Bayesian approach to the analysis of asymmetric association for two-way contingency tables

    Computational Statistics · 2021 · 1 citation

    Senior author · Corresponding
    • Computer Science
    • Data Mining
  • Asymmetric dependence in the stochastic frontier model using skew normal copula

    International Journal of Approximate Reasoning · 2020 · 13 citations

    • Computer Science
    • Econometrics
    • Mathematics
  • Gene expression signature of atypical breast hyperplasia and regulation by SFRP1

    Breast Cancer Research · 2019-06-27 · 27 citations

    article · Open access

    BACKGROUND: Atypical breast hyperplasias (AH) have a 10-year risk of progression to invasive cancer estimated at 4-7%, with the overall risk of developing breast cancer increased by ~4-fold. AH lesions are estrogen receptor alpha positive (ERα+) and represent risk indicators and/or precursor lesions to low grade ERα+ tumors. Therefore, molecular profiles of AH lesions offer insights into the earliest changes in the breast epithelium, rendering it susceptible to oncogenic transformation.

    METHODS: In this study, women were selected who were diagnosed with ductal or lobular AH, but no breast cancer prior to or within the 2-year follow-up. Paired AH and histologically normal benign (HNB) tissues from patients were microdissected. RNA was isolated, amplified linearly, labeled, and hybridized to whole transcriptome microarrays to determine gene expression profiles. Genes that were differentially expressed between AH and HNB were identified using a paired analysis. Gene expression signatures distinguishing AH and HNB were defined using AGNES and PAM methods. Regulation of gene networks was investigated using breast epithelial cell lines, explant cultures of normal breast tissue and mouse tissues.

    RESULTS: A 99-gene signature discriminated the histologically normal and AH tissues in 81% of the cases. Network analysis identified coordinated alterations in signaling through ERα, epidermal growth factor receptors, and androgen receptor which were associated with the development of both lobular and ductal AH. Expression of SFRP1 was also consistently lower in AH. Knockdown of SFRP1 in 76N-Tert cells resulted in altered expression of 13 genes similarly to that observed in AH. An SFRP1-regulated network was also observed in tissues from mice lacking Sfrp1. Re-expression of SFRP1 in MCF7 cells provided further support for the SFRP1-regulated network. Treatment of breast explant cultures with rSFRP1 dampened estrogen-induced progesterone receptor levels.

    CONCLUSIONS: The alterations in gene expression were observed in both ductal and lobular AH suggesting shared underlying mechanisms predisposing to AH. Loss of SFRP1 expression is a significant regulator of AH transcriptional profiles driving previously unidentified changes affecting responses to estrogen and possibly other pathways. The gene signature and pathways provide insights into alterations contributing to AH breast lesions.

  • Additional file 2: of Gene expression signature of atypical breast hyperplasia and regulation by SFRP1

    Figshare · 2019-01-01

    dataset · Open access

    Table S1. Probesets that are differentially expressed (1039 probesets). Table S2. Probesets selected by p < 0.005 used for hierarchical clustering by AGNES (99 genes). Table S3. Probesets selected by PAM (139 genes). Table S4. Zero-order gene network. Table S5. Primers for RT-qPCR. (XLSX 204 kb)

  • Parallel Markov chain Monte Carlo for Bayesian hierarchical models with big data, in two stages

    Journal of Applied Statistics · 2019-01-29 · 4 citations

    article · Senior author · Corresponding

    Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.

  • Genome Sequence of Verrucomicrobium sp. Strain GAS474, a Novel Bacterium Isolated from Soil

    Genome Announcements · 2018-01-24 · 11 citations

    article · Open access

    Verrucomicrobium sp. strain GAS474 was isolated from the mineral soil of a temperate deciduous forest in central Massachusetts. Here, we present the complete genome sequence of this phylogenetically novel organism, which consists of a total of 3,763,444 bp on a single scaffold, with a 65.8% GC content and 3,273 predicted open reading frames.

  • Parallel Markov chain Monte Carlo for Bayesian dynamic item response models in educational testing

    Stat · 2017-01-01 · 3 citations

    article · Senior author · Corresponding

    Bayesian dynamic item response models have been successfully used for educational testing data; these models are especially useful for individually varying and irregularly spaced longitudinal testing data. However, because of the complexity of the models and the large size of the data sets, computation time is excessive for carrying out full data analyses in practice. Here, we introduce a parallel Markov chain Monte Carlo method to speed the implementation of these Bayesian models. Using both simulation data and real educational testing data for reading ability, we demonstrate that computation time is greatly reduced for our parallel computing method versus full data analyses. The estimated error of our method is shown to be small, using common distance metrics. Our parallel computing approach can be used for other models in the Educational and Psychometric fields, including Bayesian item response theory models. Copyright © 2017 John Wiley & Sons, Ltd.

  • Asymptotic properties and approximation of Bayesian logspline density estimators for communication-free parallel computing methods

    arXiv (Cornell University) · 2017-10-25

    preprint · Open access · Senior author

    In this article we perform an asymptotic analysis of parallel Bayesian logspline density estimators. Such estimators are useful for the analysis of datasets that are partitioned into subsets and stored in separate databases without the capability of accessing the full dataset from a single computer. The parallel estimator we introduce is in the spirit of a kernel density estimator introduced in recent studies. We provide a numerical procedure that produces the normalized density estimator itself in place of the sampling algorithm. We then derive an error bound for the mean integrated squared error of the full dataset posterior estimator. The error bound depends upon the parameters that arise in logspline density estimation and the numerical approximation procedure. In our analysis, we identify the choices for the parameters that result in the error bound scaling optimally in relation to the number of samples. This provides our method with increased estimation accuracy, while also minimizing the computational cost.

  • Asymptotic properties and approximation of Bayesian logspline density estimators for communication-free parallel computing methods

    arXiv (Cornell University) · 2017-10-25

    preprint · Open access · Senior author

    In this article we perform an asymptotic analysis of Bayesian parallel density estimators which are based on logspline density estimation. The parallel estimator we introduce is in the spirit of a kernel density estimator introduced in recent studies. We provide a numerical procedure that produces the density estimator itself in place of the sampling algorithm. We then derive an error bound for the mean integrated squared error for the full data posterior density estimator. We also investigate the parameters that arise from logspline density estimation and the numerical approximation procedure. Our investigation identifies specific choices of parameters for logspline density estimation that result in the error bound scaling appropriately in relation to these choices.

  • Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages

    arXiv (Cornell University) · 2017-12-16

    preprint · Open access · Senior author

    Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group-specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.
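Several of the abstracts above refer to parallel MCMC methods that partition a large data set by observations into subsets, sample each subset posterior separately, and then combine the results. The following is a minimal sketch of that general idea only, not the two-stage method from the papers and not the API of the parallelMCMCcombine package. It assumes a toy model (a normal mean with known noise standard deviation and a flat prior) so that each subset posterior can be sampled directly, and it combines draws by precision-weighted averaging in the style of consensus Monte Carlo. All function and variable names are hypothetical, and the subsets are processed sequentially here rather than on separate machines.

```python
import random
import statistics


def subposterior_draws(subset, sigma, n_draws, rng):
    """Draw from the posterior of a normal mean given one data subset.

    Assumes a known noise sd `sigma` and a flat prior, so the subset
    posterior is Normal(mean(subset), sigma**2 / len(subset)).
    """
    center = statistics.fmean(subset)
    spread = sigma / len(subset) ** 0.5
    return [rng.gauss(center, spread) for _ in range(n_draws)]


def combine_weighted(draws_by_subset, weights):
    """Combine the t-th draw from every subset by a weighted average."""
    total = sum(weights)
    return [
        sum(w * d for w, d in zip(weights, draws)) / total
        for draws in zip(*draws_by_subset)
    ]


rng = random.Random(0)
sigma = 2.0
data = [rng.gauss(5.0, sigma) for _ in range(6000)]

# Partition the data by observations into three equal-sized subsets.
subsets = [data[i::3] for i in range(3)]
draws_by_subset = [subposterior_draws(s, sigma, 4000, rng) for s in subsets]

# Weight each subset by its posterior precision, n_s / sigma**2.
weights = [len(s) / sigma**2 for s in subsets]
combined = combine_weighted(draws_by_subset, weights)

# Reference values from the full-data posterior, Normal(mean, sigma**2 / n).
full_mean = statistics.fmean(data)
full_sd = sigma / len(data) ** 0.5

print(f"combined mean {statistics.fmean(combined):.4f} vs full {full_mean:.4f}")
print(f"combined sd   {statistics.stdev(combined):.4f} vs full {full_sd:.4f}")
```

In this Gaussian toy case the weighted average reproduces the full-data posterior exactly in distribution, so the two printed lines should agree closely. For non-Gaussian posteriors, or for hierarchical models where most parameters are group specific, this simple combination is only an approximation, which is the setting the abstracts above address.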

Frequent coauthors

  • Jun S. Liu

    13 shared
  • Jason D. Lieb

    University of Chicago

    9 shared
  • X. Shirley Liu

    G1 Therapeutics (United States)

    9 shared
  • Zheng Wei

    Texas A&M University – Corpus Christi

    6 shared
  • Patrick Eichenberger

    New York University

    5 shared
  • Alexey Miroshnikov

    Discover Financial Services (United States)

    5 shared
  • Ellen M. Wijsman

    University of Washington

    3 shared
  • Richard Losick

    Harvard University

    3 shared
