🤖 AI Summary
Computational morphology remains underutilized in linguistic fieldwork, revealing a critical disconnect between NLP research and real-world language documentation practice. Method: Guided by user-centered design (UCD), this work systematically diagnoses the roots of this gap and proposes a new research agenda integrating practical utility with scholarly rigor. We enhance the multilingual interlinear glossed text (IGT) generation model GlossLM and conduct a small-scale field user study with language documenters. Contribution/Results: We identify key usability bottlenecks, including insufficient gloss tag standardization, tokenization ambiguity, and a lack of personalization support. Results show that while current models achieve strong automatic evaluation scores, they fail to meet documenters’ core needs for interpretability, controllability, and seamless workflow integration. This is the first study to systematically apply UCD principles to computational morphology, establishing a methodological framework and empirical pathway for developing NLP tools purpose-built for language archiving.
📝 Abstract
Computational morphology has the potential to support language documentation through tasks like morphological segmentation and the generation of Interlinear Glossed Text (IGT). However, our research outputs have seen limited use in real-world language documentation settings. This position paper situates the disconnect between computational morphology and language documentation within a broader misalignment between research and practice in NLP, and argues that the field risks becoming decontextualized and ineffectual without systematic integration of User-Centered Design (UCD). To demonstrate how principles from UCD can reshape the research agenda, we present a case study of GlossLM, a state-of-the-art multilingual IGT generation model. Through a small-scale user study with three documentary linguists, we find that despite strong metric-based performance, the system fails to meet core usability needs in real documentation contexts. These insights raise new research questions around model constraints, label standardization, segmentation, and personalization. We argue that centering users not only produces more effective tools, but also surfaces richer, more relevant research directions.