CAMBench-QR: A Structure-Aware Benchmark for Post-Hoc Explanations with QR Understanding

📅 2025-09-20

🤖 AI Summary

Visual explanation methods such as CAM often lack structural faithfulness: they fail to localize the essential substructures precisely and to suppress background interference. To address this, we introduce the first benchmark grounded in QR code geometric standards, using QR geometric priors for quantitative interpretability assessment within a structure-aware evaluation framework that combines causal masking with fidelity analysis. The benchmark systematically evaluates mainstream CAM variants, including LayerCAM and EigenGrad-CAM, on structural sensitivity, robustness to controlled deformations, and inference latency, using synthetically generated QR/non-QR data, pixel-accurate ground-truth masks, and structure-aware distance metrics. It supports both zero-shot and fine-tuned evaluation and provides a fully reproducible implementation with training protocols. Empirical results support its use as a rigorous "litmus test" for structural interpretability and indicate that it generalizes across models and tasks.
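As a rough illustration of the fidelity analysis mentioned above, the following sketch computes a deletion curve: pixels are removed in order of decreasing saliency and a model score is recorded after each step. It is not the benchmark's implementation; `deletion_curve`, `score_fn`, `steps`, and `fill` are hypothetical names, and a single-channel image with a saliency map of the same shape is assumed.

```python
import numpy as np

def deletion_curve(image, saliency, score_fn, steps=10, fill=0.0):
    """Return (scores, auc) for a deletion test on a single-channel image.

    score_fn is an assumed callable mapping an image array to a scalar
    (for example, the classifier's probability for the QR class).
    """
    img = np.array(image, dtype=float, order="C")  # private, C-contiguous copy
    flat = img.reshape(-1)                         # view onto img
    sal = np.asarray(saliency, dtype=float).ravel()
    order = np.argsort(-sal, kind="stable")        # most salient pixels first
    scores = [float(score_fn(img))]
    for idx in np.array_split(order, steps):
        flat[idx] = fill                           # delete the next batch of pixels
        scores.append(float(score_fn(img)))
    scores = np.array(scores)
    # Trapezoidal area over a deletion fraction normalised to [0, 1]
    # (exact when the batches are equal in size, approximate otherwise).
    auc = float(((scores[:-1] + scores[1:]) / 2.0).mean())
    return scores, auc
```

Under this convention a lower deletion AUC suggests the saliency map ranks genuinely influential pixels first; an insertion curve would follow the same pattern starting from a blank image.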

📝 Abstract

Visual explanations are often plausible but not structurally faithful. We introduce CAMBench-QR, a structure-aware benchmark that leverages the canonical geometry of QR codes (finder patterns, timing lines, module grid) to test whether CAM methods place saliency on requisite substructures while avoiding background. CAMBench-QR synthesizes QR/non-QR data with exact masks and controlled distortions, and reports structure-aware metrics (Finder/Timing Mass Ratios, Background Leakage, coverage AUCs, Distance-to-Structure) alongside causal occlusion, insertion/deletion faithfulness, robustness, and latency. We benchmark representative, efficient CAMs (LayerCAM, EigenGrad-CAM, XGrad-CAM) under two practical regimes: zero-shot and last-block fine-tuning. The benchmark, metrics, and training recipes provide a simple, reproducible yardstick for structure-aware evaluation of visual explanations. We therefore propose CAMBench-QR as a litmus test of whether visual explanations are truly structure-aware.
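The abstract names mass-ratio and leakage metrics without defining them here. One plausible reading, offered only as an assumption, is that a mass ratio is the share of non-negative saliency falling inside a structure mask, and that background leakage is the share falling outside all structure. The function names below are hypothetical.

```python
import numpy as np

def mass_ratio(saliency, mask):
    """Share of non-negative saliency mass inside `mask` (assumed definition)."""
    s = np.clip(np.asarray(saliency, dtype=float), 0.0, None)
    total = s.sum()
    if total == 0.0:
        return 0.0  # an empty map places no mass anywhere
    return float(s[np.asarray(mask, dtype=bool)].sum() / total)

def background_leakage(saliency, structure_mask):
    """Share of saliency mass outside every structural region (assumed definition)."""
    s = np.clip(np.asarray(saliency, dtype=float), 0.0, None)
    total = s.sum()
    if total == 0.0:
        return 0.0
    return float(s[~np.asarray(structure_mask, dtype=bool)].sum() / total)

# Toy 4x4 example: half the mass sits on a "finder" region, half on background.
saliency = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 2, 2]], dtype=float)
finder = np.zeros((4, 4), dtype=bool)
finder[:2, :2] = True
finder_ratio = mass_ratio(saliency, finder)      # 0.5
leakage = background_leakage(saliency, finder)   # 0.5
```

With these definitions the ratio for a region and the leakage for the same region sum to one, which makes a high finder ratio and a low leakage two views of the same desirable behavior.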

Problem

Research questions and friction points this paper is trying to address.

Evaluating whether visual explanation methods identify correct structural components

Testing whether saliency methods focus on essential substructures while ignoring background

Providing standardized metrics to assess structure-aware faithfulness of visual explanations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses QR code geometry as structural ground truth

Synthesizes controlled QR/non-QR data with exact masks

Introduces structure-aware metrics and benchmarking protocols
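To make the idea of QR geometry as ground truth concrete, the sketch below builds finder and timing masks from the standard QR layout: a version-v symbol has 17 + 4v modules per side, three 7x7 finder patterns in the top-left, top-right, and bottom-left corners, and timing patterns along row 6 and column 6 between the separators. The function name and the `scale` parameter are hypothetical, and the quiet zone, alignment patterns, and distortions are omitted, so this is not the paper's generator.

```python
import numpy as np

def qr_structure_masks(version=1, scale=4):
    """Return (finder_mask, timing_mask) as boolean pixel arrays.

    `scale` is the assumed number of pixels per module side.
    """
    n = 17 + 4 * version                  # modules per side
    finder = np.zeros((n, n), dtype=bool)
    timing = np.zeros((n, n), dtype=bool)
    finder[:7, :7] = True                 # top-left finder pattern
    finder[:7, n - 7:] = True             # top-right finder pattern
    finder[n - 7:, :7] = True             # bottom-left finder pattern
    timing[6, 8:n - 8] = True             # horizontal timing pattern
    timing[8:n - 8, 6] = True             # vertical timing pattern

    def upscale(m):
        # Expand each module into a scale x scale block of pixels.
        return np.repeat(np.repeat(m, scale, axis=0), scale, axis=1)

    return upscale(finder), upscale(timing)

finder_mask, timing_mask = qr_structure_masks(version=1, scale=4)
```

Because the masks come from the symbol layout rather than from annotation, they are exact at the module level, which is the property the highlights above rely on.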
