Automatic Intermodal Loading Unit Identification using Computer Vision: A Scoping Review

📅 2025-09-22
🤖 AI Summary
Automatic identification of intermodal loading units (e.g., containers, semi-trailers, swap bodies) remains a bottleneck in high-throughput ports and terminals. Method: This scoping review covers 63 empirical studies on computer vision–based identification published between 1990 and 2025, tracing the evolution from digital image processing and traditional machine learning to deep learning. Contribution/Results: Reported end-to-end accuracy varies widely (5%–96%), which the review attributes mainly to the lack of publicly available benchmark datasets. It also highlights emerging challenges arising from the shift from character-based text recognition to scene-text spotting and from the integration of mobile cameras (e.g., drones, sensor-equipped ground vehicles) for dynamic terminal monitoring. The review calls for standardised terminology, open-access datasets, and shared source code, and outlines future research directions such as contextless text recognition optimised for ISO 6346 codes.

📝 Abstract
The standardisation of Intermodal Loading Units (ILUs), such as containers, semi-trailers and swap bodies, has revolutionised global trade, yet their efficient and robust identification remains a critical bottleneck in high-throughput ports and terminals. This paper reviews 63 empirical studies that propose computer vision (CV) based solutions. It covers the last 35 years (1990-2025), tracing the field's evolution from early digital image processing (DIP) and traditional machine learning (ML) to the current dominance of deep learning (DL) techniques. While CV offers a cost-effective alternative to other identification techniques, its development is hindered by the lack of publicly available benchmarking datasets. This results in high variance in the reported results, with end-to-end accuracy ranging from 5 % to 96 %. Beyond dataset limitations, this review highlights emerging challenges, especially those introduced by the shift from character-based text recognition to scene-text spotting and by the integration of mobile cameras (e.g. drones, sensor-equipped ground vehicles) for dynamic terminal monitoring. To advance the field, the paper calls for standardised terminology, open-access datasets and shared source code, while outlining future research directions such as contextless text recognition optimised for ISO 6346 codes.
Problem

Research questions and friction points this paper is trying to address.

Identifying Intermodal Loading Units efficiently remains a bottleneck in ports
Computer vision development is hindered by lack of public benchmarking datasets
Challenges include scene-text spotting and mobile camera integration for monitoring
Innovation

Methods, ideas, or system contributions that make the work stand out.

Computer vision evolution from DIP to deep learning dominance
Mobile cameras enable dynamic terminal monitoring solutions
Contextless text recognition optimized for ISO 6346 codes
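
To make the ISO 6346 reference concrete, the following is a minimal sketch of the standard's check-digit rule, which is one reason these codes can be verified without surrounding linguistic context. It is not taken from the reviewed paper; the function names `iso6346_check_digit` and `is_valid_iso6346` are hypothetical, and using the check as a filter on recognised text is an assumption about how a pipeline might apply it.

```python
import re
import string


def _letter_values():
    # ISO 6346 assigns A=10 and counts upward, skipping multiples of 11
    # (11, 22, 33), so B=12, L=23, V=34 and Z=38.
    values, v = {}, 10
    for ch in string.ascii_uppercase:
        if v % 11 == 0:
            v += 1
        values[ch] = v
        v += 1
    return values


LETTER_VALUES = _letter_values()


def iso6346_check_digit(code10: str) -> int:
    """Compute the check digit for the first 10 characters of a code."""
    total = 0
    for position, ch in enumerate(code10):
        value = int(ch) if ch.isdigit() else LETTER_VALUES[ch]
        total += value * (2 ** position)  # weights 1, 2, 4, ..., 512
    return total % 11 % 10  # a remainder of 10 maps to 0


def is_valid_iso6346(text: str) -> bool:
    """Return True if the text is a well-formed 11-character code
    (4 letters + 7 digits) whose last digit matches the check digit."""
    code = re.sub(r"\s+", "", text).upper()
    if not re.fullmatch(r"[A-Z]{4}\d{7}", code):
        return False
    return iso6346_check_digit(code[:10]) == int(code[10])


if __name__ == "__main__":
    print(is_valid_iso6346("CSQU 305438 3"))
    print(is_valid_iso6346("CSQU 305438 4"))
```

A recognition pipeline could, under this assumption, discard or re-rank candidate readings that fail `is_valid_iso6346`. A passing check digit does not guarantee a correct reading, since some character substitutions leave the weighted sum unchanged modulo 11.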
Emre Gülsoylu
University of Hamburg, Department of Informatics, Computer Vision Group, Hamburg, Germany
Alhassan Abdelhalim
University of Hamburg, Department of Informatics, Distributed Operating Systems Group, Hamburg, Germany
Derya Kara Boztas
Hamburg University of Technology, Institute of Maritime Logistics, Hamburg, Germany
Ole Grasse
Hamburg University of Technology, Institute of Maritime Logistics, Hamburg, Germany
Carlos Jahn
Hamburg University of Technology, Institute of Maritime Logistics, Hamburg, Germany
Simone Frintrop
University of Hamburg, Department of Informatics, Computer Vision Group, Hamburg, Germany
Janick Edinger
Universität Hamburg
Distributed Computing, Edge Computing, Context-Aware Computing, Assistive Technologies, Computation Offloading