🤖 AI Summary
Selective rationalization's select-then-predict architecture suffers from interlocking, a mutual dependency between the generator and predictor during joint training that leads to suboptimal equilibria. Existing approaches only mitigate this issue heuristically, via sampling or ad hoc regularization, without eliminating its root cause. This paper introduces GenSPP, the first interlocking-free selective rationalization framework: it decouples the generator and predictor and trains them disjointly via a genetic global search, thereby circumventing equilibrium traps entirely. The framework incurs no additional learning overhead. Experiments on a synthetic and a real-world benchmark demonstrate that the method significantly outperforms multiple state-of-the-art baselines in both explanation quality and prediction accuracy, validating the effectiveness of the interlocking-free design.
📝 Abstract
A popular end-to-end architecture for selective rationalization is the select-then-predict pipeline, in which a generator extracts highlights that are fed to a predictor. Such a cooperative system suffers from suboptimal equilibria due to the dominance of one of the two modules, a phenomenon known as interlocking. While several contributions have aimed to address interlocking, they only mitigate its effect, often by introducing feature-based heuristics, sampling, and ad hoc regularizations. We present GenSPP, the first interlocking-free architecture for selective rationalization, which requires none of the learning overhead mentioned above. GenSPP avoids interlocking by training the generator and predictor disjointly via a genetic global search. Experiments on a synthetic and a real-world benchmark show that our model outperforms several state-of-the-art competitors.
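To make the select-then-predict idea and the genetic global search concrete, here is a minimal toy sketch. It is not GenSPP's actual implementation: the dataset, the trivial one-feature predictor, the fitness function, and all hyperparameters (population size, mutation rate, sparsity penalty) are illustrative assumptions. The "generator" is reduced to a binary selection mask evolved by a genetic algorithm, while the "predictor" is refit from scratch on whatever the mask selects, so the two modules never interlock through shared gradients.

```python
import random

random.seed(0)

D = 8    # features per example (assumption: feature 0 is the true rationale)
N = 200  # dataset size

# Toy data: the label equals feature 0; features 1..7 are pure noise.
X = [[random.randint(0, 1) for _ in range(D)] for _ in range(N)]
y = [x[0] for x in X]

def fitness(mask):
    """Fit a trivial predictor on the selected features and score the mask.
    The predictor simply picks the selected feature that agrees with the
    labels most often; a small sparsity penalty rewards short rationales."""
    sel = [i for i, m in enumerate(mask) if m]
    if not sel:
        return 0.0
    best_acc = max(
        sum(1 for x, t in zip(X, y) if x[i] == t) / N for i in sel
    )
    return best_acc - 0.01 * len(sel)

def mutate(mask, rate=0.1):
    # Flip each selection bit independently with probability `rate`.
    return [1 - b if random.random() < rate else b for b in mask]

def crossover(a, b):
    # Single-point crossover of two parent masks.
    cut = random.randrange(1, D)
    return a[:cut] + b[cut:]

# Genetic global search over generator masks: elitism, crossover, mutation.
pop = [[random.randint(0, 1) for _ in range(D)] for _ in range(30)]
for gen in range(40):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]
    children = [mutate(crossover(*random.sample(elite, 2))) for _ in range(20)]
    pop = elite + children

best = max(pop, key=fitness)
print(best)  # typically keeps feature 0 and drops most noise features
```

Because the predictor is retrained per candidate mask rather than co-adapted by gradient descent, no degenerate equilibrium between the two modules can form; the population-based search explores masks globally instead of following a single local trajectory.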