A symbolic Perl algorithm for the unification of Nahuatl word spellings

📅 2025-11-24
🤖 AI Summary
This study addresses the highly variable, nonstandardized orthography of Nahuatl, a low-resource indigenous language, which limits performance in NLP tasks. The authors propose a linguistically grounded, symbolic spelling normalization method: interpretable orthographic rules written as regular expressions and implemented in a lightweight Perl rule engine, built on the π-yalli corpus, which contains texts in several spelling conventions. They also introduce a manual, sentence-level semantic evaluation protocol for judging the quality of the normalized output. Evaluators reported encouraging results for most of the desired features of the unified sentences, and the rules remain fully interpretable. The framework offers a reproducible, symbolic approach to orthographic standardization for low-resource indigenous languages.

📝 Abstract
In this paper, we describe a symbolic model for the automatic orthographic unification of Nawatl text documents. Our model is based on algorithms that we have previously used to analyze sentences in Nawatl, and on the $π$-yalli corpus, which consists of texts in several Nawatl orthographies. Our automatic unification algorithm implements linguistic rules as symbolic regular expressions. We also present a manual evaluation protocol that we designed and implemented to assess the quality of the unified sentences generated by our algorithm through a sentence-level semantic task. The evaluators reported encouraging results for most of the desired features of our artificially unified sentences.
Problem

Research questions and friction points this paper is trying to address.

Automatically unify orthographic variations in Nahuatl texts
Implement linguistic rules through symbolic regular expressions
Evaluate unification quality via manual semantic assessment protocol
Innovation

Methods, ideas, or system contributions that make the work stand out.

Symbolic Perl algorithm for orthographic unification
Linguistic rules implemented via symbolic regular expressions
Manual evaluation protocol for semantic quality assessment
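
To make the idea of "linguistic rules implemented via symbolic regular expressions" concrete, here is a minimal sketch of an ordered rewrite-rule pass. It is written in Python rather than the authors' Perl, and it is not the paper's rule set: the `RULES` table and the `unify` function are hypothetical, chosen only to illustrate how a few classical-style spellings (hu, qu, c, z, tz) might be mapped onto a w/k/s-style orthography.

```python
import re

# Hypothetical rewrite rules (illustrative only, NOT the rules from the paper).
# Each entry is (compiled pattern, replacement). Order matters: later rules
# assume the earlier ones have already been applied.
RULES = [
    (re.compile(r"qu(?=[ei])"), "k"),    # que, qui -> ke, ki
    (re.compile(r"hu(?=[aeio])"), "w"),  # hua, hue, hui -> wa, we, wi
    (re.compile(r"c(?=[ei])"), "s"),     # ce, ci -> se, si
    (re.compile(r"tz"), "ts"),           # tz -> ts (must run before the z rule)
    (re.compile(r"z"), "s"),             # any remaining z -> s
    (re.compile(r"c(?!h)"), "k"),        # any remaining c -> k, keeping the digraph ch
]


def unify(text: str) -> str:
    """Lowercase the text, then apply every rule once, in order."""
    result = text.lower()
    for pattern, replacement in RULES:
        result = pattern.sub(replacement, result)
    return result


if __name__ == "__main__":
    for word in ["nahuatl", "quetzalli", "cihuatl", "calli"]:
        print(word, "->", unify(word))
```

Keeping the rules in an ordered table is one way to preserve the interpretability the summary emphasizes: each substitution can be read, reordered, or disabled on its own. A real rule set would also need to handle contexts this sketch ignores, such as capitalization, vowel length marks, and syllable-final digraphs.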
Juan-José Guzmán-Landa
LIA, Avignon Université, Avignon, France
Jesús Vázquez-Osorio
GIL, Universidad Nacional Autónoma de México, Coyoacán, CDMX, Mexico
Juan-Manuel Torres-Moreno
Université d'Avignon / Polytechnique Montréal
Traitement Automatique des Langues, Nahuatl, Language Engineering, Summarization
Ligia Quintana Torres
Fac. de Matemáticas, Universidad Veracruzana, Xalapa, Mexico
Miguel Figueroa-Saavedra
Inst. de Investigaciones en Educación, Universidad Veracruzana, Xalapa, Mexico
Martha-Lorena Avendaño-Garrido
Fac. de Matemáticas, Universidad Veracruzana, Xalapa, Mexico
Graham Ranger
ICTT, Avignon Université, Avignon, France
Patricia Velázquez-Morales
Avignon, France
Gerardo Eugenio Sierra Martínez
GIL, Universidad Nacional Autónoma de México, Coyoacán, CDMX, Mexico