Evaluating Deep Learning Models for African Wildlife Image Classification: From DenseNet to Vision Transformers

📅 2025-07-28
🤖 AI Summary
This study addresses automatic classification of African wildlife images to support biodiversity monitoring. We systematically evaluate four deep learning architectures (DenseNet-201, ResNet-152, EfficientNet-B4, and ViT-H/14) via transfer learning with frozen feature extractors on a public dataset of four African species: buffalo, elephant, rhinoceros, and zebra. To our knowledge, this is the first comparative study of CNNs and vision transformers in an African wildlife context. ViT-H/14 achieves the highest accuracy (99%) but at a substantially higher computational cost, while DenseNet-201 attains 67% accuracy, the best among the convolutional networks. Based on this trade-off analysis, we propose a model selection strategy that balances accuracy against deployment feasibility in resource-constrained field settings. We also integrate the best-performing CNN (DenseNet-201) into a Hugging Face Gradio web application for real-time wildlife identification in the field.

📝 Abstract
Wildlife populations in Africa face severe threats, with vertebrate numbers declining by over 65% in the past five decades. In response, image classification using deep learning has emerged as a promising tool for biodiversity monitoring and conservation. This paper presents a comparative study of deep learning models for automatically classifying African wildlife images, focusing on transfer learning with frozen feature extractors. Using a public dataset of four species (buffalo, elephant, rhinoceros, and zebra), we evaluate the performance of DenseNet-201, ResNet-152, EfficientNet-B4, and Vision Transformer ViT-H/14. DenseNet-201 achieved the best performance among the convolutional networks (67% accuracy), while ViT-H/14 achieved the highest overall accuracy (99%) but at a significantly higher computational cost, which raises deployment concerns. Our experiments highlight the trade-offs between accuracy, resource requirements, and deployability. The best-performing CNN (DenseNet-201) was integrated into a Hugging Face Gradio Space for real-time field use, demonstrating the feasibility of deploying lightweight models in conservation settings. This work contributes to African-grounded AI research by offering practical insights into model selection, dataset preparation, and responsible deployment of deep learning tools for wildlife conservation.
Problem

Research questions and friction points this paper is trying to address.

Evaluate deep learning models for African wildlife image classification
Compare performance and computational costs of various models
Assess deployability of models in conservation settings
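One way to make the accuracy-versus-cost comparison concrete is to pick the most accurate model that fits a deployment budget. The sketch below is only an illustration of that idea and is not the selection strategy from the paper. The accuracies are the two figures reported in the abstract; the parameter counts are approximate public figures for the torchvision implementations, used here as a stand-in for computational cost, and the budget values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float         # accuracy reported in the abstract
    params_millions: float  # approximate model size; an assumption, not a paper figure


def select_model(
    candidates: Sequence[Candidate], max_params_millions: float
) -> Optional[Candidate]:
    """Return the most accurate candidate within the size budget, or None."""
    feasible = [c for c in candidates if c.params_millions <= max_params_millions]
    return max(feasible, key=lambda c: c.accuracy, default=None)


CANDIDATES = [
    Candidate("DenseNet-201", accuracy=0.67, params_millions=20.0),
    Candidate("ViT-H/14", accuracy=0.99, params_millions=630.0),
]

if __name__ == "__main__":
    # Hypothetical budgets: a small field device versus a well-provisioned server.
    for budget in (100.0, 1000.0):
        choice = select_model(CANDIDATES, budget)
        print(budget, choice.name if choice else "no feasible model")
```

In practice the budget could equally be expressed as inference latency or memory use; parameter count is used here only because it is simple to state.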
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer learning with frozen feature extractors
Comparative study of DenseNet, ResNet, EfficientNet, and ViT
Hugging Face Gradio for real-time field deployment
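A Gradio front end for a trained classifier might look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' Space: `to_label_scores` and `build_demo` are hypothetical helper names, the title string is made up, and `model` and `preprocess` are assumed to be a trained PyTorch classifier in eval mode and its matching image transform.

```python
from typing import Callable, Dict, Sequence

# The four species named in the abstract, in the model's output order (assumed).
CLASS_NAMES = ["buffalo", "elephant", "rhinoceros", "zebra"]


def to_label_scores(probabilities: Sequence[float]) -> Dict[str, float]:
    """Map class probabilities to the {label: confidence} dict that gr.Label displays."""
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError("expected one probability per class")
    return {name: float(p) for name, p in zip(CLASS_NAMES, probabilities)}


def build_demo(model, preprocess: Callable):
    """Wrap a trained classifier in a Gradio interface.

    gradio and torch are imported lazily so the helper above can be used
    and tested without either library installed.
    """
    import gradio as gr
    import torch

    def classify(image):
        batch = preprocess(image).unsqueeze(0)  # add a batch dimension
        with torch.no_grad():
            probs = torch.softmax(model(batch), dim=1)[0].tolist()
        return to_label_scores(probs)

    return gr.Interface(
        fn=classify,
        inputs=gr.Image(type="pil"),
        outputs=gr.Label(num_top_classes=len(CLASS_NAMES)),
        title="African wildlife classifier (illustrative)",
    )
```

Calling `build_demo(model, preprocess).launch()` would start the app locally; on a Hugging Face Space the same code would typically live in `app.py`.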
Lukman Jibril Aliyu
Zipline, Nigeria
Natural Language Processing, Biomedical Informatics, Health Systems Improvement
Umar Sani Muhammad
Azman University, Kano, Nigeria
Bilqisu Ismail
Arewa Data Science Academy, Kano, Nigeria
Nasiru Muhammad
Arewa Data Science Academy, Kano, Nigeria
Almustapha A. Wakili
Towson University, Maryland, USA
Seid Muhie Yimam
Universität Hamburg, Hamburg, Germany
Shamsuddeen Hassan Muhammad
Bayero University, Kano, & Google DeepMind Academic Fellow at Imperial College London
Natural Language Processing, Sentiment Analysis, AfricaNLP, Low-resource NLP, Multilinguality
Mustapha Abdullahi
Arewa Data Science Academy, Kano, Nigeria