🤖 AI Summary
Finger-vein recognition has long been hindered by the scarcity of large-scale, diverse public datasets. To address this, we introduce FingerVeinSyn-5M, the first million-scale synthetic finger-vein dataset, comprising 5 million samples from 50,000 unique fingers. Each finger has 100 variations spanning geometric changes (shift, rotation, scale, roll) and realistic imaging degradations (exposure variation, skin-scattering blur, optical blur, motion blur). Our core contribution is FVeinSyn, a high-fidelity synthetic engine that decouples physiological anatomy from imaging physics at a fine-grained level, enabling anatomically guided texture generation and controllable, multi-factor degradation simulation. FingerVeinSyn-5M is a fully annotated, large-scale, multi-degradation benchmark. Models pretrained on it and fine-tuned with minimal real data achieve an average 53.91% performance gain across multiple benchmarks. The dataset is publicly released to advance the practical deployment of deep learning-based finger-vein recognition.
📝 Abstract
A major challenge in finger vein recognition is the lack of large-scale public datasets. Existing datasets contain few identities and limited samples per finger, restricting the advancement of deep learning-based methods. To address this, we introduce FVeinSyn, a synthetic generator capable of producing diverse finger vein patterns with rich intra-class variations. Using FVeinSyn, we created FingerVeinSyn-5M -- the largest available finger vein dataset -- containing 5 million samples from 50,000 unique fingers, each with 100 variations including shift, rotation, scale, roll, varying exposure levels, skin scattering blur, optical blur, and motion blur. FingerVeinSyn-5M is also the first to offer fully annotated finger vein images, supporting deep learning applications in this field. Models pretrained on FingerVeinSyn-5M and fine-tuned with minimal real data achieve an average 53.91% performance gain across multiple benchmarks. The dataset is publicly available at: https://github.com/EvanWang98/FingerVeinSyn-5M.
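The dataset layout described above (50,000 fingers, 100 variations each) can be pictured with a small sketch. This is not the FVeinSyn generator: the `VariationParams` fields, the sampling ranges, and the seeding scheme are all hypothetical, chosen only to mirror the variation types the abstract lists. Only the two counts come from the source.

```python
import random
from dataclasses import dataclass

NUM_FINGERS = 50_000          # from the abstract
VARIATIONS_PER_FINGER = 100   # from the abstract


@dataclass(frozen=True)
class VariationParams:
    """One intra-class variation. Field names and units are illustrative assumptions."""
    shift_px: tuple
    rotation_deg: float
    scale: float
    roll_deg: float
    exposure_gain: float
    scatter_blur_sigma: float
    optical_blur_sigma: float
    motion_blur_len_px: int


def sample_variation(rng):
    """Draw one parameter set. All ranges are placeholders, not values from the paper."""
    return VariationParams(
        shift_px=(rng.randint(-10, 10), rng.randint(-10, 10)),
        rotation_deg=rng.uniform(-5.0, 5.0),
        scale=rng.uniform(0.9, 1.1),
        roll_deg=rng.uniform(-15.0, 15.0),
        exposure_gain=rng.uniform(0.6, 1.4),
        scatter_blur_sigma=rng.uniform(0.0, 3.0),
        optical_blur_sigma=rng.uniform(0.0, 2.0),
        motion_blur_len_px=rng.randint(0, 9),
    )


def variations_for_finger(finger_id, n=VARIATIONS_PER_FINGER, base_seed=0):
    """Return n reproducible parameter sets for a single finger identity."""
    # Seeding per finger keeps each identity's variations reproducible
    # without having to generate the whole dataset in order.
    rng = random.Random(base_seed * NUM_FINGERS + finger_id)
    return [sample_variation(rng) for _ in range(n)]


# Total sample count implied by the abstract's numbers.
total_samples = NUM_FINGERS * VARIATIONS_PER_FINGER
```

In a real pipeline each parameter set would drive a rendering step applied to that finger's vein pattern. Here the parameters are only sampled, which is enough to show how one identity fans out into 100 labeled samples.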