Concordance Comparison as a Means of Assembling Local Grammars

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses person name recognition in Portuguese texts by proposing a strategy for assembling local grammars (LGs) based on concordance comparison. A tool compares the concordances produced by two LGs and highlights their differences. Analyzing the relationships of inclusion, intersection, and disjunction within each pair of LGs guides the selection and combination of the grammars that yield the best results. Applied to the Gold Collection of the Second HAREM, the enhanced grammar achieves an F-measure of 76.86, a gain of 6 points over the state of the art for Portuguese.

📝 Abstract

Named Entity Recognition for person names is an important but non-trivial task in information extraction. This article uses a tool that compares the concordances obtained from two local grammars (LG) and highlights the differences. We used the results as an aid to select the best of a set of LGs. By analyzing the comparisons, we observed relationships of inclusion, intersection and disjunction within each pair of LGs, which helped us to assemble those that yielded the best results. This approach was used in a case study on extraction of person names from texts written in Portuguese. We applied the enhanced grammar to the Gold Collection of the Second HAREM. The F-Measure obtained was 76.86, representing a gain of 6 points in relation to the state-of-the-art for Portuguese.
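
The pairwise comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the tool represents concordances, so everything here is an assumption made for illustration: each concordance hit is modeled as a hypothetical `(document id, start offset, end offset)` tuple, and the function names `compare_concordances` and `differences` are invented for this example. The "identical" case is an extra label added for completeness and is not one of the three relationships the authors name.

```python
from typing import Dict, Set, Tuple

# Assumption: a concordance hit is identified by the document it occurs in
# and the character offsets of the matched span. The real tool may differ.
Hit = Tuple[str, int, int]


def compare_concordances(a: Set[Hit], b: Set[Hit]) -> str:
    """Classify the relationship between the hits of two local grammars."""
    if a == b:
        return "identical"      # same hits (not a category from the abstract)
    if a <= b or b <= a:
        return "inclusion"      # one grammar's hits contain the other's
    if a & b:
        return "intersection"   # some shared hits, some unique to each
    return "disjunction"        # no hits in common


def differences(a: Set[Hit], b: Set[Hit]) -> Dict[str, Set[Hit]]:
    """Split the hits into those found by only one grammar or by both."""
    return {"only_a": a - b, "only_b": b - a, "both": a & b}


# Toy data, invented for illustration only.
lg1 = {("doc1", 0, 11), ("doc1", 40, 52)}
lg2 = {("doc1", 0, 11), ("doc1", 40, 52), ("doc2", 5, 17)}
lg3 = {("doc2", 5, 17), ("doc3", 3, 9)}

print(compare_concordances(lg1, lg2))  # inclusion
print(compare_concordances(lg2, lg3))  # intersection
print(compare_concordances(lg1, lg3))  # disjunction
```

Under these assumptions, an inclusion suggests that the narrower grammar adds no new hits, while an intersection or disjunction marks a pair whose unique hits (`only_a`, `only_b`) are worth inspecting before the grammars are combined.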

Problem

Research questions and friction points this paper is trying to address.

Named Entity Recognition

Local Grammars

Concordance Comparison

Person Names

Information Extraction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Concordance Comparison

Local Grammars

Named Entity Recognition