🤖 AI Summary
This work addresses a weakness of conventional one-hot encoding: it tends to yield overconfident predictions under adversarial attack, which undermines entropy-based attack detection. Hadamard-coded outputs have improved adversarial robustness in image classification, but earlier attempts in semantic segmentation fall far behind state-of-the-art models in mean intersection-over-union, object detection has not been investigated at all, and no prior work has exploited codeword redundancy or addressed codeword inconsistencies. To bridge this gap, the authors propose HadamardNet, a framework that uses Hadamard codes as output representations for semantic segmentation and object detection. A new decoding procedure, solved via projection onto the probability simplex, yields optimal class-wise probabilities together with a measure of prediction inconsistency. This inconsistency is then used to detect adversarial attacks and other disturbances in a single detection pass. The authors report state-of-the-art perturbation detection performance on both tasks, with clean-data performance equivalent or close to the reference.
📝 Abstract
Conventional one-hot encodings often yield poorly calibrated models that are overconfident under attack, causing entropy-based detection algorithms to fail. Previous work on image classification has demonstrated that Hadamard-coded output representations can improve adversarial robustness. However, attempts to integrate Hadamard codes into semantic segmentation fall far behind state-of-the-art models in mean intersection-over-union performance. For object detection, such output encodings have not yet been investigated at all. Further, no prior work has addressed intrinsic codeword inconsistencies or actually exploited intrinsic codeword redundancy. Accordingly, we first derive a novel decoding procedure for Hadamard codewords that yields optimal class-wise probabilities, solving the underlying optimization problem via projection onto the probability simplex. Second, our optimization delivers a measure of prediction inconsistency. Third, we are the first to show how to exploit these inconsistencies for adversarial attack and disturbance detection. Fourth, we introduce HadamardNet, a framework employing Hadamard codes as output representations for semantic segmentation and object detection models. We conduct a comprehensive evaluation on both disturbances and adversarial attacks, achieving state-of-the-art perturbation detection performance for both tasks in only a single detection pass, while delivering performance on clean data that is equivalent or close to the reference.
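To make the decoding idea more concrete, the following is a minimal sketch and not the authors' procedure, which the abstract does not specify. It assumes a least-squares objective: find the probability vector p on the simplex whose mixture of codewords best matches the network output z. When the codewords are mutually orthogonal rows of a Hadamard matrix, this reduces to a Euclidean projection of the correlation vector C z / n onto the simplex, and the remaining squared error can serve as one possible inconsistency score. The names `sylvester_hadamard`, `project_simplex`, `decode`, the codebook choice, and the toy sizes are all illustrative assumptions.

```python
import numpy as np


def sylvester_hadamard(n):
    """Return an n x n Hadamard matrix via Sylvester's construction."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H


def project_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1} (sort-based)."""
    u = np.sort(v)[::-1]                  # entries in descending order
    css = np.cumsum(u) - 1.0              # cumulative sums minus the target sum
    k = np.arange(1, v.size + 1)
    rho = k[u - css / k > 0][-1]          # size of the support of the projection
    tau = css[rho - 1] / rho              # shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)


def decode(z, C):
    """Assumed objective: minimise ||C.T @ p - z||^2 over the simplex.

    C holds one +/-1 codeword per row, with mutually orthogonal rows of
    length n, so the minimiser is the projection of C @ z / n.
    """
    n = C.shape[1]
    p = project_simplex(C @ z / n)
    inconsistency = float(np.sum((C.T @ p - z) ** 2) / n)
    return p, inconsistency


# Toy setup (assumption): 4 classes, length-8 codewords, all-ones row skipped.
C = sylvester_hadamard(8)[1:5]

clean = C[2].copy()                       # ideal output for class index 2
perturbed = clean.copy()
perturbed[:2] *= -1                       # flip two code bits as a disturbance

p_clean, r_clean = decode(clean, C)
p_pert, r_pert = decode(perturbed, C)
```

In this sketch a clean codeword decodes to a one-hot probability vector with zero residual, while a disturbed output spreads probability over several classes and leaves a positive residual. Thresholding that residual is one simple way such a score could flag perturbed inputs without an extra forward pass; the paper's actual detection rule may differ.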