Murali Haran

Professor of Statistics, Graduate Faculty, Social Data Analytics · Verified

Pennsylvania State University · Social Data Analytics

Active 2001–2025

h-index: 31

Citations: 4.1k

Papers: 147 (30 in the last 5 years)

Funding: $501k



About

Murali Haran is a Professor of Statistics and a Graduate Faculty member in Social Data Analytics at Pennsylvania State University, affiliated with the departments of Statistics and Social Data Analytics. His office is at 302 Pond Laboratory, University Park, PA 16802. His profile describes his research as applying statistical methods to social data; the publications listed below center on Bayesian spatial models, computer model calibration, and uncertainty quantification for environmental and public-health applications. He maintains a professional webpage and a Google Scholar profile with further information about his scholarly work.

Research topics

Sociology
Econometrics
Psychology
Demography
Geography
Medicine
Mathematics
Statistics
Virology
Surgery
Biology
Social psychology
Microbiology

Selected publications

Probabilistic Downscaling for Flood Hazard Models
ArXiv.org · 2025-03-26
preprint · Open access · Senior author
Riverine flooding poses significant risks. Developing strategies to manage flood risks requires flood projections with decision-relevant scales and well-characterized uncertainties, often at high spatial resolutions. However, calibrating high-resolution flood models can be computationally prohibitive. To address this challenge, we propose a probabilistic downscaling approach that maps low-resolution model projections onto higher-resolution grids. The existing literature presents two distinct types of downscaling approaches: (1) probabilistic methods, which are versatile and applicable across various physics-based models, and (2) deterministic downscaling methods, specifically tailored for flood hazard models. Both types of downscaling approaches come with their own set of mutually exclusive advantages. Here we introduce a new approach, PDFlood, that combines the advantages of existing probabilistic and flood model-specific downscaling approaches, mainly (1) spatial flooding probabilities and (2) improved accuracy from approximating physical processes. Compared to the state of the art deterministic downscaling approach for flood hazard models, PDFlood allows users to consider previously neglected uncertainties while providing comparable accuracy, thereby better informing the design of risk management strategies. While we develop PDFlood for flood models, the general concepts translate to other applications such as wildfire models.
A class of models for large zero-inflated spatial data
Journal of Agricultural Biological and Environmental Statistics · 2024-04-29 · 3 citations
article · Open access · Senior author
Spatially correlated data with an excess of zeros, usually referred to as zero-inflated spatial data, arise in many disciplines. Examples include count data, for instance, abundance (or lack thereof) of animal species and disease counts, as well as semi-continuous data like observed precipitation. Spatial two-part models are a flexible class of models for such data. Fitting two-part models can be computationally expensive for large data due to high-dimensional dependent latent variables, costly matrix operations, and slow mixing Markov chains. We describe a flexible, computationally efficient approach for modeling large zero-inflated spatial data using the projection-based intrinsic conditional autoregression (PICAR) framework. We study our approach, which we call PICAR-Z, through extensive simulation studies and two environmental data sets. Our results suggest that PICAR-Z provides accurate predictions while remaining computationally efficient. An important goal of our work is to allow researchers who are not experts in computation to easily build computationally efficient extensions to zero-inflated spatial models; this also allows for a more thorough exploration of modeling choices in two-part models than was previously possible. We show that PICAR-Z is easy to implement and extend in popular probabilistic programming languages such as nimble and stan.
Computer Model Calibration Based on Image Warping Metrics: An Application for Sea Ice Deformation
UNC Libraries · 2024-09-06
article · Open access
Fast Bayesian Inference for Spatial Mean-Parameterized Conway–Maxwell–Poisson Models
Journal of Computational and Graphical Statistics · 2024-08-21 · 3 citations
article · Open access · Senior author
Count data with complex features arise in many disciplines, including ecology, agriculture, criminology, medicine, and public health. Zero inflation, spatial dependence, and non-equidispersion are common features in count data. There are currently two classes of models that allow for these features: the mode-parameterized Conway-Maxwell-Poisson (COMP) distribution and the generalized Poisson model. However, both require the use of either constraints on the parameter space or a parameterization that leads to challenges in interpretability. We propose spatial mean-parameterized COMP models that retain the flexibility of these models while resolving the above issues. We use a Bayesian spatial filtering approach in order to efficiently handle high-dimensional spatial data and we use reversible-jump MCMC to automatically choose the basis vectors for spatial filtering. The COMP distribution poses two additional computational challenges: an intractable normalizing function in the likelihood and no closed-form expression for the mean. We propose a fast computational approach that addresses these challenges by, respectively, introducing an efficient auxiliary variable algorithm and pre-computing key approximations for fast likelihood evaluation. We illustrate the application of our methodology to simulated and real datasets, including Texas HPV-cancer data and US vaccine refusal data. Supplementary materials for this article are available online.
Spatial distribution and determinants of childhood vaccination refusal in the United States
Vaccine · 2023-04-15 · 4 citations
article · Open access · Senior author
Fast Bayesian inference for spatial mean-parameterized Conway-Maxwell-Poisson models
arXiv (Cornell University) · 2023-01-27
preprint · Open access · Senior author
Count data with complex features arise in many disciplines, including ecology, agriculture, criminology, medicine, and public health. Zero inflation, spatial dependence, and non-equidispersion are common features in count data. There are two classes of models that allow for these features: the mode-parameterized Conway-Maxwell-Poisson (COMP) distribution and the generalized Poisson model. However, both require the use of either constraints on the parameter space or a parameterization that leads to challenges in interpretability. We propose a spatial mean-parameterized COMP model that retains the flexibility of these models while resolving the above issues. We use a Bayesian spatial filtering approach in order to efficiently handle high-dimensional spatial data and we use reversible-jump MCMC to automatically choose the basis vectors for spatial filtering. The COMP distribution poses two additional computational challenges: an intractable normalizing function in the likelihood and no closed-form expression for the mean. We propose a fast computational approach that addresses these challenges by, respectively, introducing an efficient auxiliary variable algorithm and pre-computing key approximations for fast likelihood evaluation. We illustrate the application of our methodology to simulated and real datasets, including Texas HPV-cancer data and US vaccine refusal data.
Bayesian Spatial Models for Projecting Corn Yields
Remote Sensing · 2023-12-23 · 2 citations
article · Open access · Senior author · Corresponding author
Climate change is predicted to impact corn yields. Previous studies analyzing these impacts differ in data and modeling approaches and, consequently, corn yield projections. We analyze the impacts of climate change on corn yields using two statistical models with different approaches for dealing with county-level effects. The first model, which is novel to modeling corn yields, uses a computationally efficient spatial basis function approach. We use a Bayesian framework to incorporate both parametric and climate model structural uncertainty. We find that the statistical models have similar predictive abilities, but the spatial basis function model is faster and hence potentially a useful tool for crop yield projections. We also explore how different gridded temperature datasets affect the statistical model fit and performance. Compared to the dataset with only weather station data, we find that the dataset composed of satellite and weather station data results in a model with a magnified relationship between temperature and corn yields. For all statistical models, we observe a relationship between temperature and corn yields that is broadly similar to previous studies. We use downscaled and bias-corrected CMIP5 climate model projections to obtain detrended corn yield projections for 2020–2049 and 2069–2098. In both periods, we project a decrease in the mean corn yield production, reinforcing the findings of other studies. However, the magnitude of the decrease and the associated uncertainties we obtain differ from previous studies.
A Class of Models for Large Zero-inflated Spatial Data
arXiv (Cornell University) · 2023-04-05
preprint · Open access · Senior author
Spatially correlated data with an excess of zeros, usually referred to as zero-inflated spatial data, arise in many disciplines. Examples include count data, for instance, abundance (or lack thereof) of animal species and disease counts, as well as semi-continuous data like observed precipitation. Spatial two-part models are a flexible class of models for such data. Fitting two-part models can be computationally expensive for large data due to high-dimensional dependent latent variables, costly matrix operations, and slow mixing Markov chains. We describe a flexible, computationally efficient approach for modeling large zero-inflated spatial data using the projection-based intrinsic conditional autoregression (PICAR) framework. We study our approach, which we call PICAR-Z, through extensive simulation studies and two environmental data sets. Our results suggest that PICAR-Z provides accurate predictions while remaining computationally efficient. An important goal of our work is to allow researchers who are not experts in computation to easily build computationally efficient extensions to zero-inflated spatial models; this also allows for a more thorough exploration of modeling choices in two-part models than was previously possible. We show that PICAR-Z is easy to implement and extend in popular probabilistic programming languages such as nimble and stan.
A Shared Component Point Process Model for Urban Policing
arXiv (Cornell University) · 2023-03-01
preprint · Open access · Senior author
Newly available point-level datasets allow us to relate police use of force to other events describing police behavior. Current methods for relating two point processes typically rely on the spatial aggregation of one of the two point processes. We investigate new methods that build upon shared component models and case-control methods to retain the point-level nature of both point processes while characterizing the relationship between them. We find that the shared component approach is particularly useful in flexibly relating two point processes, and we illustrate this flexibility in simulated examples and an application to Chicago policing data.
Neglecting Model Parametric Uncertainty Can Drastically Underestimate Flood Risks
Earth's Future · 2022-12-05 · 7 citations
article · Open access
Floods drive dynamic and deeply uncertain risks for people and infrastructures. Uncertainty characterization is a crucial step in improving the predictive understanding of multi‐sector dynamics and the design of risk‐management strategies. Current approaches to estimate flood hazards often sample only a relatively small subset of the known unknowns, for example, the uncertainties surrounding the model parameters. This approach neglects the impacts of key uncertainties on hazards and system dynamics. Here we mainstream a recently developed method for Bayesian inference to calibrate a computationally expensive distributed hydrologic model. We compare three different calibration approaches: (a) stepwise line search, (b) precalibration or screening, and (c) the Fast Model Calibrations (FaMoS) approach. FaMoS deploys a particle‐based approach that takes advantage of the massive parallelization afforded by modern high‐performance computing systems. We quantify how neglecting parametric uncertainty and data discrepancy can drastically underestimate extreme flood events and risks. Precalibration improves prediction skill score over a stepwise line search. The Bayesian calibration improves the uncertainty characterization of model parameters and flood risk projections.

Recent grants

Statistical Methods for Ice Sheet Projections using Large Non-Gaussian Space-Time Data Sets and Complex Computer Models
NSF · $501k · 2014–2018

Frequent coauthors

Klaus Keller
Dartmouth College
37 shared
Won Chang
Seoul National University
26 shared
Ben Seiyon Lee
George Mason University
25 shared
Galin L. Jones
20 shared
David Pollard
Pennsylvania State University
17 shared
Sanjib Sharma
Howard University
16 shared
Iman Hosseini‐Shakib
Pennsylvania State University
15 shared
Matthew J. Ferrari
Pennsylvania State University
14 shared
