Xiao Hui Tai
Assistant Professor · University of California, Davis · Statistics
Active 2017–2025
About
Xiao Hui Tai, Ph.D., is an Assistant Professor in the Department of Statistics at UC Davis. She earned her Ph.D. in Statistics from Carnegie Mellon University in 2019. Her research focuses on the use of non-traditional sources of data to study problems in data-scarce settings, with particular emphasis on conflict, crime, and issues in the developing world. Her work explores how various data sources can inform social science research and address real-world challenges, including the impacts of armed conflict on education, environmental health effects such as pollution exposure, mental health and emotional wellbeing through social media discourse, and the effects of violence on internal displacement. Dr. Tai has contributed to understanding complex social phenomena through innovative statistical approaches and has published extensively on these topics.
Research topics
- Computer Science
- Political Science
- Medicine
- Computer Security
- Artificial Intelligence
- Psychology
- Criminology
- Economics
- Data Science
- Econometrics
- Environmental Health
- Telecommunications
- Engineering
- Geography
- Law
Selected publications
Significance · 2025-02-04
article · 1st author · Corresponding
The devastating consequences of violent conflict are hard to establish due to insufficient data. But new data sources, such as that from mobile phones, can help the international community build a clearer picture, as Xiao Hui Tai explains.
The American Statistician · 2025-04-09
article · 1st author · Corresponding
Nearby armed conflict affects girls’ education in Africa
PLoS ONE · 2025-01-15 · 1 citation
article · Open access · 1st author · Corresponding
Female education is a crucial input to women's agency and empowerment, and has wide-ranging impacts, from improved labor market outcomes to reducing child mortality. Existing gender-specific evidence on the effect of armed conflict on education is conflict-specific and mixed. We link granular data on conflict events to georeferenced survey data on educational attainment from 28 countries in Africa, and use a regression-based approach to estimate the local effect of conflict exposure on female years of schooling. We find that conflict events occurring within 25 kilometers during a female child's primary school years reduce years of schooling by 0.4 years by adolescence. We do not find the same effect for males. Exposure to only low intensity conflict events with at most two casualties has persistent negative and significant effects. Consecutive years of conflict, however, can have positive effects in later years, which offset earlier negative effects, suggesting a habituation to violence. In the past two decades, we estimate excess child mortality in Africa associated with the indirect channel of women's education to be similar in magnitude to the number of direct child casualties due to conflict.
Mapping Opium Poppy Cultivation: Socioeconomic Insights from Satellite Imagery
ACM Journal on Computing and Sustainable Societies · 2024-02-16 · 5 citations
article · Open access · Senior author
Over 30 million people globally consume illicit opiates. In recent decades, Afghanistan has accounted for 70–90% of the world’s illicit supply of opium. This production provides livelihoods to millions of Afghans, while also funneling hundreds of millions of dollars to insurgent groups every year, exacerbating corruption and insecurity, and impeding development. Remote sensing and field surveys are currently used in official estimates of total poppy cultivation area. These aggregate estimates are not suited to study the local socioeconomic conditions surrounding cultivation. Few avenues exist to generate comprehensive, fine-grained data under poor security conditions, without the use of costly surveys or data collection efforts. Here, we develop and test a new unsupervised approach to mapping cultivation using only freely available satellite imagery. For districts accounting for over 90% of total cultivation, our aggregate estimates track official statistics closely (correlation coefficient of 0.76 to 0.81). We combine these predictions with other grid-level data sources, finding that areas with poppy cultivation have poorer outcomes such as infant mortality and education, compared to areas with exclusively other agriculture. Surprisingly, poppy-growing areas have better healthcare accessibility. We discuss these findings, the limitations of mapping opium poppy cultivation, and associated ethical concerns.
Frontiers in Public Health · 2024-04-12 · 11 citations
article · Open access
Introduction: The rise in global temperatures due to climate change has escalated the frequency and intensity of wildfires worldwide. Beyond their direct impact on physical health, these wildfires can significantly impact mental health. Conventional mental health studies predominantly rely on surveys, often constrained by limited sample sizes, high costs, and time constraints. As a result, there is an increasing interest in accessing social media data to study the effects of wildfires on mental health.
Methods: In this study, we focused on Twitter users affected by the California Tubbs Fire in 2017 to extract data signals related to emotional well-being and mental health. Our analysis aimed to investigate tweets posted during the Tubbs Fire disaster to gain deeper insights into their impact on individuals. Data were collected from October 8 to October 31, 2017, encompassing the peak activity period. Various analytical methods were employed to explore word usage, sentiment, temporal patterns of word occurrence, and emerging topics associated with the unfolding crisis.
Results: The findings show increased user engagement on wildfire-related Tweets, particularly during nighttime and early morning, especially at the onset of wildfire incidents. Subsequent exploration of emotional categories using Linguistic Inquiry and Word Count (LIWC) revealed a substantial presence of negative emotions at 43.0%, juxtaposed with simultaneous positivity in 23.1% of tweets. This dual emotional expression suggests a nuanced and complex landscape, unveiling concerns and community support within conversations. Stress concerns were notably expressed in 36.3% of the tweets. The main discussion topics were air quality, emotional exhaustion, and criticism of the president's response to the wildfire emergency.
Discussion: Social media data, particularly the data collected from Twitter during wildfires, provides an opportunity to evaluate the psychological impact on affected communities immediately. This data can be used by public health authorities to launch targeted media campaigns in areas and hours where users are more active. Such campaigns can raise awareness about mental health during disasters and connect individuals with relevant resources. The effectiveness of these campaigns can be enhanced by tailoring outreach efforts based on prevalent issues highlighted by users. This ensures that individuals receive prompt support and mitigates the psychological impacts of wildfire disasters.
Short-term exposure to fine particulate pollution and elderly mortality in Chile
Communications Earth & Environment · 2024-08-28 · 4 citations
article · Open access · Senior author
Exposure to fine particulate matter (PM2.5) is known to cause adverse health outcomes. Most of the evidence has been derived from developed countries, with lower pollution levels and different demographics and comorbidities from the rest of the world. Here we leverage new satellite-based measurements of PM2.5, combined with comprehensive public records in Chile, to study the effect of PM2.5 pollution on elderly mortality. We find that a 10 μg/m³ monthly increase in PM2.5 exposure is associated with a 1.7% increase (95% C.I.: 1.1–2.4%) in all-cause mortality for individuals aged 75+. Satellite-based measurements allow us to comprehensively investigate heterogeneous effects. We find remarkably similar effect sizes across baseline exposure, rural and urban areas, income, and over time, demonstrating consistency in the evidence on mortality effects of PM2.5 exposure. The most notable source of heterogeneity is geographical, with effects closer to 5% in the center-south and in the metropolitan area.
Mobile phone data reveal the effects of violence on internal displacement in Afghanistan
Nature Human Behaviour · 2022 · 34 citations
1st author · Corresponding
- Political Science
- Computer Security
- Computer Science
Nearly 50 million people globally have been internally displaced due to conflict, persecution and human rights violations. However, the study of internally displaced persons (and the design of policies to assist them) is complicated by the fact that these people are often underrepresented in surveys and official statistics. We develop an approach to measure the impact of violence on internal displacement using anonymized high-frequency mobile phone data. We use this approach to quantify the short- and long-term impacts of violence on internal displacement in Afghanistan, a country that has experienced decades of conflict. Our results highlight how displacement depends on the nature of violence. High-casualty events, and violence involving the Islamic State, cause the most displacement. Provincial capitals act as magnets for people fleeing violence in outlying areas. Our work illustrates the potential for non-traditional data sources to facilitate research and policymaking in conflict settings.
Public mobility data enables COVID-19 forecasting and management at local and global scales
Scientific Reports · 2021 · 133 citations
- Computer Science
- Artificial Intelligence
Policymakers everywhere are working to determine the set of restrictions that will effectively contain the spread of COVID-19 without excessively stifling economic activity. We show that publicly available data on human mobility (collected by Google, Facebook, and other providers) can be used to evaluate the effectiveness of non-pharmaceutical interventions (NPIs) and forecast the spread of COVID-19. This approach uses simple and transparent statistical models to estimate the effect of NPIs on mobility, and basic machine learning methods to generate 10-day forecasts of COVID-19 cases. An advantage of the approach is that it involves minimal assumptions about disease dynamics, and requires only publicly available data. We evaluate this approach using local and regional data from China, France, Italy, South Korea, and the United States, as well as national data from 80 countries around the world. We find that NPIs are associated with significant reductions in human mobility, and that changes in mobility can be used to forecast COVID-19 infections.
Poster - Mapping Opium Poppy Cultivation in Afghanistan Using Satellite Imagery
2021-06-28
article · 1st author · Corresponding
Afghanistan is the world’s largest supplier of illicit opium, accounting for an estimated 70-80% of supply. In 2019, this generated an estimated income of $1.2-$2.1 billion domestically, or around 10% of Afghanistan’s gross domestic product. The illicit drug economy has provided livelihoods to millions of Afghans, but has also had numerous negative effects, including funding insurgent groups, exacerbating corruption and insecurity, and contributing to high domestic levels of drug addiction. From 2002 to 2017, the U.S. government spent over $8 billion on counter-narcotics efforts in Afghanistan, achieving little long-term success. The lack of reliable data has contributed to this failure; the robustness and interpretation of top-line estimates of area under cultivation have been questioned and criticized. Counter-narcotics efforts have focused on reducing total cultivation area, rather than trying to understand local socioeconomic or political conditions. The lack of granularity in official cultivation statistics has also impeded efforts by aid agencies to evaluate the impact of various interventions aimed at transitioning farmers away from poppy.
Benchmarking Minimax Linkage in Hierarchical Clustering
Studies in Classification, Data Analysis, and Knowledge Organization · 2021-01-01 · 1 citation
book-chapter · 1st author · Corresponding
Frequent coauthors
- Solomon Hsiang · 6 shared
- Shikhar Mehra · University of California, Berkeley · 5 shared
- Joshua Blumenstock · 4 shared
- Cornelia Ilin · University of California, Berkeley · 3 shared
- Heike Hofmann · Center for Statistics and Applications in Forensic Evidence · 3 shared
- Sébastien Annan-Phan · University of California, Berkeley · 3 shared
- William F. Eddy · Carnegie Mellon University · 3 shared
- Susan VanderPlas · 2 shared