MHAFF: Multi-Head Attention Feature Fusion of CNN and Transformer for Cattle Identification

📅 2025-01-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Convolutional Neural Networks (CNNs) struggle to model long-range dependencies in cattle muzzle-print recognition, and conventional feature fusion strategies fall short: addition discards discriminative information, concatenation inflates dimensionality, and neither models the interactions between local texture and global structure. Method: The paper proposes Multi-Head Attention Feature Fusion (MHAFF), applied to cattle identification for the first time. Within a CNN-Transformer dual-stream architecture, MHAFF uses multi-head attention to model correlations between CNN-extracted local textures and Transformer-captured global structural patterns while preserving the original features. Results: On two public cattle muzzle datasets, MHAFF reaches 99.88% and 99.52% identification accuracy, respectively, outperforming addition, concatenation, and existing cattle identification methods, and it converges quickly.

📝 Abstract

Convolutional Neural Networks (CNNs) have drawn researchers' attention for identifying cattle from muzzle images. However, CNNs often fail to capture long-range dependencies within the complex patterns of the muzzle, a limitation that transformers address. This inspired us to fuse the strengths of CNNs and transformers in muzzle-based cattle identification. Addition and concatenation have been the most commonly used techniques for feature fusion. However, addition fails to preserve discriminative information, while concatenation increases dimensionality. Both are simple operations that cannot discover the relationships or interactions between the features being fused. To overcome these issues, this research introduces a novel approach called Multi-Head Attention Feature Fusion (MHAFF), used here for the first time in cattle identification. MHAFF captures relations between the different types of features being fused while preserving their originality. The experiments show that MHAFF outperforms addition, concatenation, and the existing cattle identification methods in accuracy on two publicly available cattle datasets. MHAFF converges quickly and reaches accuracies of 99.88% and 99.52% on the two datasets, respectively.
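
To make the contrast between the three fusion strategies concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes both branches emit token features of the same width, that queries come from the CNN branch and keys and values from the transformer branch, and that random matrices can stand in for learned projection weights. None of these details are stated in the abstract, and the function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_add(cnn_feat, vit_feat):
    """Element-wise addition: width is unchanged, but the two sources are mixed."""
    return cnn_feat + vit_feat


def fuse_concat(cnn_feat, vit_feat):
    """Concatenation: both sources are kept, but the feature width doubles."""
    return np.concatenate([cnn_feat, vit_feat], axis=-1)


def fuse_multi_head_attention(query_feat, context_feat, num_heads, rng):
    """Attention-based fusion (illustrative assumption, see lead-in).

    query_feat:   (n_q, d) tokens from one branch, used as queries
    context_feat: (n_k, d) tokens from the other branch, used as keys/values
    Returns the fused features (n_q, d) and attention weights (heads, n_q, n_k).
    """
    n_q, d = query_feat.shape
    n_k, _ = context_feat.shape
    assert d % num_heads == 0, "feature width must be divisible by num_heads"
    d_head = d // num_heads

    # Random matrices stand in for learned projection weights (assumption).
    w_q, w_k, w_v, w_o = (
        rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4)
    )

    def split_heads(x, n):
        # (n, d) -> (heads, n, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(query_feat @ w_q, n_q)
    k = split_heads(context_feat @ w_k, n_k)
    v = split_heads(context_feat @ w_v, n_k)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n_q, n_k)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, n_q, d_head)

    # Merge the heads back to width d and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(n_q, d)
    return merged @ w_o, weights


rng = np.random.default_rng(0)
cnn_feat = rng.standard_normal((4, 8))  # 4 tokens of width 8 (toy sizes)
vit_feat = rng.standard_normal((4, 8))

added = fuse_add(cnn_feat, vit_feat)
concatenated = fuse_concat(cnn_feat, vit_feat)
fused, attn = fuse_multi_head_attention(cnn_feat, vit_feat, num_heads=2, rng=rng)

print(added.shape, concatenated.shape, fused.shape)  # (4, 8) (4, 16) (4, 8)
```

In this sketch the attention-based output keeps the input width, unlike concatenation, and each output token is a data-dependent mixture of the other branch's tokens, unlike addition, which combines features with fixed, position-wise weights.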

Problem

Research questions and friction points this paper is trying to address.

Cattle Recognition

CNN-Transformer Integration

Information Fusion

Innovation

Methods, ideas, or system contributions that make the work stand out.

MHAFF

Multi-Head Attention

Feature Fusion

🔎 Similar Papers

Muzzle-Based Cattle Identification System Using Artificial Intelligence (AI)