Development of the user-friendly decision aid Rule-based Evaluation and Support Tool (REST) for optimizing the resources of an information extraction task

📅 2025-06-16

📈 Citations: 0

✨ Influential: 0

career value

144K/year

🤖 AI Summary

To address the lack of systematic decision support for choosing between rule-based and machine learning (ML) approaches in information extraction (IE), this paper introduces REST, a rule-first decision-assistance tool. REST’s core contributions are: (1) a novel visual framework for assessing rule feasibility and predicting rule performance; (2) a hybrid paradigm prioritizing rules as the default choice and ML as an on-demand fallback; and (3) rapid rule evaluation and deployment enabled by a single expert session. REST integrates rule engineering, lightweight ML evaluation, expert knowledge encoding, and multi-dimensional performance modeling—including F1 score, development effort, and maintainability—via an interactive interface. Evaluated across 12 real-world entity extraction tasks, REST demonstrates that rule-based solutions cover 83% of entity types, achieve an average F1 of 0.89, reduce annotation requirements by 67%, and shorten rule development cycles by 52%. The approach significantly enhances sustainability, interpretability, and cross-task transferability.

Technology Category

Application Category

📝 Abstract

Rules could be an information extraction (IE) default option, compared to ML and LLMs in terms of sustainability, transferability, interpretability, and development burden. We suggest a sustainable and combined use of rules and ML as an IE method. Our approach starts with an exhaustive expert manual highlighting in a single working session of a representative subset of the data corpus. We developed and validated the feasibility and the performance metrics of the REST decision tool to help the annotator choose between rules as a by default option and ML for each entity of an IE task. REST makes the annotator visualize the characteristics of each entity formalization in the free texts and the expected rule development feasibility and IE performance metrics. ML is considered as a backup IE option and manual annotation for training is therefore minimized. The external validity of REST on a 12-entity use case showed good reproducibility.

Problem

Research questions and friction points this paper is trying to address.

Develop REST tool for optimizing information extraction resources

Compare rule-based, ML, and LLM methods for IE sustainability

Validate REST's feasibility and performance in entity selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Rule-based tool REST optimizes information extraction tasks

Combines rules and ML for sustainable IE solutions

Visualizes entity metrics to guide annotator decisions

🔎 Similar Papers

No similar papers found.