AI Summary
Fabric defect detection faces two key bottlenecks: (1) conventional non-maximum suppression (NMS) is non-differentiable, obstructing end-to-end gradient flow; and (2) pixel-level annotations are prohibitively expensive, hindering industrial deployment. To address these, we propose a differentiable alternative to NMS that formulates bounding-box matching as an entropy-regularized, differentiable bipartite matching problem. We further introduce an uncertainty-aware mask refinement module to achieve high-fidelity localization under weak supervision. Our method jointly models feature similarity and spatial relationships under entropy regularization and employs the Sinkhorn-Knopp algorithm for efficient, fully differentiable optimization. Evaluated on the Tianchi Fabric Defect Dataset, our approach significantly outperforms state-of-the-art methods in localization accuracy while maintaining real-time inference speed and compatibility with mainstream detection architectures.
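For orientation, entropy-regularized bipartite matching is commonly written in the following standard form. The summary does not state the paper's exact objective, so the symbols here (cost matrix $C$, marginals $a$ and $b$, regularization weight $\varepsilon$) are assumptions used only for illustration:

```latex
\min_{P \in U(a,b)} \; \langle P, C \rangle - \varepsilon H(P),
\qquad
H(P) = -\sum_{i,j} P_{ij} \log P_{ij},
\qquad
U(a,b) = \{ P \ge 0 : P\mathbf{1} = a,\; P^{\top}\mathbf{1} = b \}.

% The minimizer has the scaling form
P^{\star} = \operatorname{diag}(u)\, K \,\operatorname{diag}(v),
\qquad
K_{ij} = \exp(-C_{ij} / \varepsilon),

% and Sinkhorn-Knopp alternates the two updates
u \leftarrow a \oslash (K v),
\qquad
v \leftarrow b \oslash (K^{\top} u).
```

Every step is a smooth elementwise operation or a matrix-vector product, which is why gradients can pass through the resulting soft assignment $P^{\star}$, unlike the hard keep-or-discard decisions of conventional NMS.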
Abstract
Fabric defect detection confronts two fundamental challenges. First, conventional non-maximum suppression (NMS) disrupts gradient flow, which hinders genuine end-to-end learning. Second, acquiring pixel-level annotations at industrial scale is prohibitively costly. To address these limitations, we propose a differentiable NMS framework for fabric defect detection that achieves superior localization precision through end-to-end optimization. We reformulate NMS as a differentiable bipartite matching problem solved with the Sinkhorn-Knopp algorithm, maintaining uninterrupted gradient flow throughout the network. This approach specifically targets the irregular morphologies and ambiguous boundaries of fabric defects by integrating proposal quality, feature similarity, and spatial relationships. Our entropy-constrained mask refinement mechanism further enhances localization precision through principled uncertainty modeling. Extensive experiments on the Tianchi fabric defect dataset demonstrate significant performance improvements over existing methods while maintaining real-time speeds suitable for industrial deployment. The framework adapts to different detection architectures and also generalizes to generic object detection tasks.