WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification

πŸ“… 2026-03-19
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the limited faithfulness of existing medical image explanation methods, which rely on strong supervision and often produce rationales misaligned with the model's actual reasoning. To overcome this, the authors propose WeNLEX, a weakly supervised explanation-generation framework that requires only five human-annotated explanations per class. Faithfulness is enforced by aligning images reconstructed from the explanations with the original inputs in the black-box model's feature space, while linguistic plausibility is maintained via distribution alignment with a small database of clinician-annotated explanations. WeNLEX supports both post-hoc and joint training paradigms and can be tailored to different audiences. Experiments across multiple metrics show strong performance, with joint training yielding a 2.21% improvement in classification AUC, confirming that faithful explanations can improve primary-task performance.

πŸ“ Abstract
Natural language explanations provide an inherently human-understandable way to explain black-box models, closely reflecting how radiologists convey their diagnoses in textual reports. Most works explicitly supervise the explanation generation process using datasets annotated with explanations. Thus, though plausible, the generated explanations are not faithful to the model's reasoning. In this work, we propose WeNLEX, a weakly supervised model for the generation of natural language explanations for multilabel chest X-ray classification. Faithfulness is ensured by matching images generated from their corresponding natural language explanations with the original images, in the black-box model's feature space. Plausibility is maintained via distribution alignment with a small database of clinician-annotated explanations. We empirically demonstrate, through extensive validation on multiple metrics assessing faithfulness, simulatability, diversity, and plausibility, that WeNLEX produces faithful and plausible explanations using as few as 5 ground-truth explanations per diagnosis. Furthermore, WeNLEX can operate in both post-hoc and in-model settings. In the latter, i.e., when the multilabel classifier is trained together with the rest of the network, WeNLEX improves the classification AUC of the standalone classifier by 2.21%, showing that adding interpretability to the training process can actually increase downstream task performance. Additionally, simply by changing the database, WeNLEX explanations are adaptable to any target audience, and we showcase this flexibility by training a layman version of WeNLEX, where explanations are simplified for non-medical users.
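The faithfulness mechanism described in the abstract — comparing the original image with an image reconstructed from its explanation, inside the black-box model's feature space — can be sketched in a toy form. This is a minimal illustration, not the paper's implementation: the linear `feature_extractor` and the function names (`faithfulness_loss`, `reconstruct`) are assumptions standing in for the frozen classifier's feature map and WeNLEX's explanation-to-image pathway.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the frozen black-box classifier's feature map
# (a real system would use the penultimate layer of the CNN classifier).
W = rng.normal(size=(64, 16))

def feature_extractor(image):
    """Map a (flattened) image to the model's feature space."""
    return image @ W

def faithfulness_loss(image, reconstructed):
    """Squared L2 distance between the original image and the image
    reconstructed from its explanation, measured in feature space."""
    d = feature_extractor(image) - feature_extractor(reconstructed)
    return float(np.sum(d ** 2))

# A faithful explanation should reconstruct something close to the input;
# an unfaithful one yields an unrelated reconstruction and a larger loss.
image = rng.normal(size=64)
faithful = image + 0.01 * rng.normal(size=64)   # near-perfect reconstruction
unfaithful = rng.normal(size=64)                # unrelated reconstruction
assert faithfulness_loss(image, faithful) < faithfulness_loss(image, unfaithful)
```

Minimizing such a loss ties the explanation generator to what the classifier actually encodes, rather than to annotator-provided rationales; plausibility is handled separately, via alignment with the small database of clinician-written explanations.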
Problem

Research questions and friction points this paper is trying to address.

natural language explanations
multilabel chest X-ray classification
faithfulness
weak supervision
interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

weakly supervised explanation
faithful natural language explanation
multilabel chest X-ray classification
feature-space alignment
audience-adaptive interpretability
Isabel Rio-Torto
INESC TEC, FCUP, FEUP
Vision-Language · Natural Language Explanations · Explainable AI · Computer-Aided Diagnosis
Jaime S. Cardoso
INESC TEC, Universidade do Porto
LuΓ­s F. Teixeira
INESC TEC, Universidade do Porto