
Alexis Palmer
Associate Professor · University of Colorado Boulder · Linguistics
Active 2004–2025
About
Dr. Alexis Palmer is an Associate Professor in the Department of Linguistics at the University of Colorado Boulder. She is an expert in computational discourse and semantics, with a focus on computational linguistics for low-resource languages and language documentation. Her research encompasses discourse structure and coherence, modes of discourse, and social analytics, including the automated detection of offensive language in social media. She received her PhD from the University of Texas at Austin in 2009 and has held prestigious postdoctoral and research positions in Germany, including at the Institute for Computational Linguistics in Heidelberg and the Institut für Deutsche Sprache in Mannheim. Prior to her current position at CU, she was an assistant professor at the University of North Texas, Denton, until 2021. Dr. Palmer has secured a National Science Foundation CAREER grant, working on cross-linguistic methods to improve language processing tools for low-resource languages through her project FOLTA (From One Language to Another). Recently, she has also become interested in making linguistic documentation more accessible and useful, particularly for developing pedagogical materials for languages.
Research topics
- Computer Science
- Natural Language Processing
- Linguistics
- Artificial Intelligence
- Humanities
- Philosophy
- Programming language
- Engineering
- History
- Botany
- Physics
- Cognitive science
- Biology
- Psychology
- Classics
- Art
- Ecology
- Library science
Selected publications
2025-01-01
Article · Open access · Senior author
Computational morphology has the potential to support language documentation through tasks like morphological segmentation and the generation of Interlinear Glossed Text (IGT). However, our research outputs have seen limited use in real-world language documentation settings. This position paper situates the disconnect between computational morphology and language documentation within a broader misalignment between research and practice in NLP and argues that the field risks becoming decontextualized and ineffectual without systematic integration of User-Centered Design (UCD). To demonstrate how principles from UCD can reshape the research agenda, we present a case study of GlossLM, a state-of-the-art multilingual IGT generation model. Through a small-scale user study with three documentary linguists, we find that, despite strong metric-based performance, the system fails to meet core usability needs in real documentation contexts. These insights raise new research questions around model constraints, label standardization, segmentation, and personalization. We argue that centering users not only produces more effective tools, but surfaces richer, more relevant research directions.
Bootstrapping UMRs from Universal Dependencies for Scalable Multilingual Annotation
2025-01-01
Article · Open access
Uniform Meaning Representation (UMR) is a semantic annotation framework designed to be applicable across typologically diverse languages. However, UMR annotation is a labor-intensive task, requiring significant effort and time, especially when no prior annotations are available. In this paper, we present a method for bootstrapping UMR graphs by leveraging Universal Dependencies (UD), one of the most comprehensive multilingual resources, encompassing languages across a wide range of language families. Given UMR's strong typological and cross-linguistic orientation, UD serves as a particularly suitable starting point for the conversion. We describe and evaluate an approach that automatically derives partial UMR graphs from UD trees, providing annotators with an initial representation to build upon. While UD is not a semantic resource, our method extracts useful structural information that aligns with the UMR formalism, thereby facilitating the annotation process. By leveraging UD's broad typological coverage, this approach offers a scalable way to support UMR annotation across different languages.
Dehumanization of LGBTQ+ Groups in Sexual Interactions with ChatGPT
2025-01-01 · 1 citation
Article · Open access
Given the widespread use of LLM-powered conversational agents such as ChatGPT, analyzing the ways people interact with them could provide valuable insights into human behavior. Prior work has shown that these agents are sometimes used in sexual contexts, such as to obtain advice, to role-play as sexual companions, or to generate erotica. While LGBTQ+ acceptance has increased in recent years, dehumanizing practices against minorities continue to prevail. In this paper, we hone in on this and perform an analysis of dehumanizing tendencies toward LGBTQ+ individuals by human users in their sexual interactions with ChatGPT. Through a series of experiments that model various concept vectors associated with distinct shades of dehumanization, we find evidence of the reproduction of harmful stereotypes. However, many user prompts lack indications of dehumanization, suggesting that the use of these agents is a complex and nuanced issue which warrants further investigation.
Understanding the Gap: an Analysis of Research Collaborations in NLP and Language Documentation
2025-01-01
Article · Open access
Despite over 20 years of NLP work explicitly intended for application in language documentation (LD), practical use of this work remains vanishingly scarce. This issue has been noted and discussed over the past 10 years, but without the benefit of data to inform the discourse. To address this lack in the literature, we present a survey- and interview-based analysis of the lack of adoption of NLP in LD, focusing on the matter of collaborations between documentary linguists and NLP researchers. Our data show support for ideas from previous work but also reveal the importance of little-discussed factors such as misaligned professional incentives, technical knowledge burdens, and LD software.
LLM Dependency Parsing with In-Context Rules
2025-01-01 · 1 citation
Article · Open access · Senior author
We study whether incorporating symbolic rules can aid large language models in dependency parsing. We consider a paradigm in which LLMs first produce symbolic rules given fully labeled examples, and the rules are then provided in a subsequent call that performs the actual parsing. In addition, we experiment with providing human-created annotation guidelines in-context to the LLMs. We find that while both methods for rule incorporation improve zero-shot performance, the benefit disappears with a few labeled in-context examples.
ArXiv.org · 2025-09-12
Preprint · Open access · Senior author
Computational morphology has the potential to support language documentation through tasks like morphological segmentation and the generation of Interlinear Glossed Text (IGT). However, our research outputs have seen limited use in real-world language documentation settings. This position paper situates the disconnect between computational morphology and language documentation within a broader misalignment between research and practice in NLP and argues that the field risks becoming decontextualized and ineffectual without systematic integration of User-Centered Design (UCD). To demonstrate how principles from UCD can reshape the research agenda, we present a case study of GlossLM, a state-of-the-art multilingual IGT generation model. Through a small-scale user study with three documentary linguists, we find that despite strong metric-based performance, the system fails to meet core usability needs in real documentation contexts. These insights raise new research questions around model constraints, label standardization, segmentation, and personalization. We argue that centering users not only produces more effective tools, but surfaces richer, more relevant research directions.
2025-01-01
Article · Open access · Senior author
Cross-lingual transfer learning is an invaluable tool for overcoming data scarcity, yet selecting a suitable transfer language remains a challenge. The precise roles of linguistic typology, training data, and model architecture in transfer language choice are not fully understood. We take a holistic approach, examining how both dataset-specific and fine-grained typological features influence transfer language selection for part-of-speech tagging, considering two different sources for morphosyntactic features. While previous work examines these dynamics in the context of bilingual biLSTMs, we extend our analysis to a more modern transfer learning pipeline: zero-shot prediction with pretrained multilingual models. We train a series of transfer language ranking systems and examine how different feature inputs influence ranker performance across architectures. Word overlap, type-token ratio, and genealogical distance emerge as top features across all architectures. Our findings reveal that a combination of typological and dataset-dependent features leads to the best rankings, and that good performance can be obtained with either feature group on its own.
Decomposing Fusional Morphemes with Vector Embeddings
2024-01-01
Article · Open access · Senior author
BELT: Building Endangered Language Technology
2024-01-01
Article · Open access · Senior author
The development of language technology (LT) for an endangered language is often identified as a goal in language revitalization efforts, but developing such technologies is typically subject to additional methodological challenges as well as social and ethical concerns. In particular, LT development has too often taken on colonialist qualities, extracting language data, relying on outside experts, and denying the speakers of a language sovereignty over the technologies produced. We seek to avoid such an approach through the development of the Building Endangered Language Technology (BELT) website, an educational resource designed for speakers and community members with limited technological experience to develop LTs for their own language. Specifically, BELT provides interactive lessons on basic Python programming, coupled with projects to develop specific language technologies, such as spellcheckers or word games. In this paper, we describe BELT's design, the motivation underlying many key decisions, and preliminary responses from learners.
From Priest to Doctor: Domain Adaptation for Low-Resource Neural Machine Translation
arXiv (Cornell University) · 2024-12-01 · 1 citation
Preprint · Open access
Many of the world's languages have insufficient data to train high-performing general neural machine translation (NMT) models, let alone domain-specific models, and often the only available parallel data are small amounts of religious texts. Hence, domain adaptation (DA) is a crucial issue faced by contemporary NMT and has, so far, been underexplored for low-resource languages. In this paper, we evaluate a set of methods from both low-resource NMT and DA in a realistic setting, in which we aim to translate between a high-resource and a low-resource language with access to only: a) parallel Bible data, b) a bilingual dictionary, and c) a monolingual target-domain corpus in the high-resource language. Our results show that the effectiveness of the tested methods varies, with the simplest one, DALI, being most effective. We follow up with a small human evaluation of DALI, which shows that there is still a need for more careful investigation of how to accomplish DA for low-resource NMT.
Frequent coauthors
- John E. Ortega · 47 shared
- Abteen Ebrahimi · 47 shared
- Manuel Mager · 47 shared
- Rolando Coto‐Solano · 47 shared
- Arturo Oncevay · 47 shared
- Katharina Kann · University of Colorado Boulder · 47 shared
- Enora Rice · 31 shared
- Shruti Rijhwani · 27 shared
Education
- 2009 · Ph.D. · UT Austin
Awards & honors
- NSF CAREER Award