🤖 AI Summary
Problem: Digitizing archaeological catalogs is difficult because the PDF source material is highly heterogeneous, documentation standards are inconsistent, and geometric information is missing. Method: This paper proposes an AI-driven workflow for reconstructing archaeological data that integrates object detection, spatial-semantic parsing, and interactive verification. Specifically, it combines YOLOv8-based detection, adaptive image segmentation, contour extraction, and orientation/scale inference guided by directional markers and scale bars. The workflow automatically identifies line drawings and photographs of typical artifacts (e.g., tombs, ceramics, skeletal remains), infers real-world dimensions, and generates structured geometric contours. Contribution/Results: The approach departs from traditional landmark-based morphometrics by recording whole-outline contours automatically. Evaluated on Early Bronze Age European tomb catalogs (3rd millennium BC), it is reported to achieve >89% detection accuracy and a fivefold improvement in data standardization efficiency. It supports high-precision human-in-the-loop correction and has been validated by professional archaeologists.
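The orientation/scale inference mentioned above can be illustrated with a minimal sketch. It is not taken from the paper or from AutArch: the function names, the vector representation of the north arrow and the grave axis, and the example numbers are all assumptions made for illustration. The idea is that a scale bar of known real-world length gives a metres-per-pixel factor, and the angle between the north arrow and the grave's long axis gives a bearing.

```python
import math


def metres_per_pixel(scale_bar_px: float, scale_bar_m: float) -> float:
    """Conversion factor from a scale bar of known real-world length (hypothetical helper)."""
    if scale_bar_px <= 0:
        raise ValueError("scale bar length in pixels must be positive")
    return scale_bar_m / scale_bar_px


def angle_from_up(dx: float, dy: float) -> float:
    """Clockwise angle in degrees between image 'up' and the vector (dx, dy).

    Image coordinates are assumed: x grows to the right, y grows downwards.
    """
    return math.degrees(math.atan2(dx, -dy)) % 360.0


def grave_bearing(grave_axis: tuple, north_arrow: tuple) -> float:
    """Bearing of the grave's long axis, clockwise from north, folded into [0, 180).

    The axis has no direction, so bearings 180 degrees apart are treated as equal.
    """
    return (angle_from_up(*grave_axis) - angle_from_up(*north_arrow)) % 180.0


# Illustrative values only: a 100 px scale bar labelled 0.5 m,
# a grave drawn 400 px long, and a north arrow pointing to the right.
factor = metres_per_pixel(scale_bar_px=100, scale_bar_m=0.5)
grave_length_m = 400 * factor
bearing = grave_bearing(grave_axis=(1, 0), north_arrow=(1, 0))
print(grave_length_m, bearing)
```

Folding the bearing into [0, 180) is a design choice for an undirected axis. A workflow that also knows where the head of the skeleton lies could keep the full 0 to 360 degree range instead.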
📝 Abstract
The context of this paper is the creation of large uniform archaeological datasets from heterogeneous published resources, such as find catalogues, with the help of AI and Big Data. The paper addresses the challenge of assembling consistent archaeological data. We cannot simply combine existing records, as they differ in quality and recording standards. Thus, records have to be recreated from published archaeological illustrations, which is only viable with the help of automation. The contribution of this paper is a new workflow for collecting data from archaeological find catalogues available as legacy resources, such as archaeological drawings and photographs in large unsorted PDF files. The workflow relies on custom software (AutArch) that supports image processing, object detection, and interactive validation and adjustment of automatically retrieved data. We integrate artificial intelligence (AI), in the form of neural networks for object detection and classification, into the workflow, thereby speeding up, automating, and standardising data collection. Objects commonly found in archaeological catalogues, such as graves, skeletons, ceramics, ornaments, stone tools and maps, are detected. These objects are then spatially related to one another and analysed to extract real-life attributes, such as the size and orientation of graves based on the north arrow and the scale. We also automate the recording of geometric whole-outlines through contour detection, as an alternative to landmark-based geometric morphometrics. Detected objects, contours, and other automatically retrieved data can be manually validated and adjusted. We use third millennium BC Europe (encompassing cultures such as 'Corded Ware' and 'Bell Beaker', and their burial practices) as a 'testing ground' and for evaluation purposes; this includes a user study for the workflow and the AutArch software.
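To make the idea of whole-outline recording concrete, the sketch below extracts the outline pixels of a segmented object from a binary mask and converts its area to real-world units. It uses a naive 4-neighbour boundary test chosen for brevity, not the contour-detection routine used by AutArch, which the abstract does not specify. The mask, the function names, and the metres-per-pixel value are hypothetical.

```python
def outline_pixels(mask):
    """Return (row, col) of foreground pixels that touch background or the image edge.

    `mask` is a list of equally long rows holding 0 (background) or 1 (object).
    A naive 4-neighbour test is used here purely for illustration.
    """
    rows, cols = len(mask), len(mask[0])
    outline = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            on_boundary = any(
                not (0 <= nr < rows and 0 <= nc < cols) or not mask[nr][nc]
                for nr, nc in neighbours
            )
            if on_boundary:
                outline.append((r, c))
    return outline


def area_m2(mask, metres_per_pixel):
    """Object area in square metres, given a pixel-to-metre factor (assumed known)."""
    pixel_count = sum(1 for row in mask for value in row if value)
    return pixel_count * metres_per_pixel ** 2


# Hypothetical 5x5 mask containing a 3x3 object.
example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
example_outline = outline_pixels(example_mask)
example_area = area_m2(example_mask, metres_per_pixel=0.1)
print(len(example_outline), example_area)
```

Unlike a landmark-based description, which records a handful of manually placed points, an outline of this kind covers the whole boundary of the object, which is what makes it suitable for automatic extraction followed by manual correction.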