DoorDet: Semi-Automated Multi-Class Door Detection Dataset via Object Detection and Large Language Models

📅 2025-08-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

High-quality, publicly available datasets for fine-grained, multi-class door detection in architectural floor plans are scarce, which hinders progress in building compliance verification and indoor scene understanding. Method: This paper proposes a semi-automated data construction framework that combines object detection with large language models (LLMs). First, a state-of-the-art object detector localizes door instances as a single unified category. Second, a multimodal prompting strategy guides an LLM to classify each detected instance at a fine-grained level, using both visual features and contextual semantics. Third, a lightweight human-in-the-loop verification stage checks label accuracy and consistency. Contribution/Results: The framework reduces manual annotation effort by ~60% while improving class consistency and structural annotation precision. The authors release DoorDet, the first open-source, large-scale, fine-grained door detection dataset featuring eight functional door categories, which establishes a new benchmark for regulatory compliance checking, indoor scene parsing, and training downstream neural networks.

📝 Abstract

Accurate detection and classification of diverse door types in floor plan drawings is critical for multiple applications, such as building compliance checking and indoor scene understanding. Despite this importance, publicly available datasets specifically designed for fine-grained multi-class door detection remain scarce. In this work, we present a semi-automated pipeline that leverages a state-of-the-art object detector and a large language model (LLM) to construct a multi-class door detection dataset with minimal manual effort. Doors are first detected as a unified category using a deep object detection model. Next, an LLM classifies each detected instance based on its visual and contextual features. Finally, a human-in-the-loop stage ensures high-quality labels and bounding boxes. Our method significantly reduces annotation cost while producing a dataset suitable for benchmarking neural models in floor plan analysis. This work demonstrates the potential of combining deep learning and multimodal reasoning for efficient dataset construction in complex real-world domains.
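
The first two stages described in the abstract (detect every door as one category, then assign a fine-grained class per instance) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `detect` and `classify` callables, the `min_score` cutoff, and the class names `"single"` and `"double"` are all hypothetical stand-ins, since the abstract does not name the detector, the LLM, or the door categories.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in pixels; the coordinate convention is an assumption.
Box = Tuple[float, float, float, float]


@dataclass
class DoorAnnotation:
    box: Box
    detector_score: float
    door_type: Optional[str] = None  # filled in by the classification stage


def build_annotations(
    image: object,
    detect: Callable[[object], List[Tuple[Box, float]]],
    classify: Callable[[object, Box], str],
    min_score: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[DoorAnnotation]:
    """Stage 1: class-agnostic door detection. Stage 2: per-instance classification."""
    annotations = []
    for box, score in detect(image):
        if score < min_score:
            continue  # drop weak detections before spending a classifier call
        annotations.append(DoorAnnotation(box, score, classify(image, box)))
    return annotations


# Stub stages so the sketch runs without any model; a real pipeline would
# wrap an object detector and a multimodal LLM call here.
def fake_detect(image: object) -> List[Tuple[Box, float]]:
    return [((10, 10, 40, 20), 0.92), ((100, 50, 130, 60), 0.31)]


def fake_classify(image: object, box: Box) -> str:
    x_min, _, x_max, _ = box
    return "double" if (x_max - x_min) > 50 else "single"  # hypothetical classes


result = build_annotations(None, fake_detect, fake_classify)
```

Passing the two stages in as callables keeps the sketch independent of any particular detector or LLM API, which the abstract leaves unspecified.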

Problem

Research questions and friction points this paper is trying to address.

Detect and classify diverse door types in floor plans

Address scarcity of multi-class door detection datasets

Reduce annotation cost via semi-automated pipeline

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages an object detector to detect all doors as a single unified category

Uses an LLM to assign each detected door a fine-grained class

Incorporates a human-in-the-loop stage for quality assurance of labels and bounding boxes
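
One way a human-in-the-loop stage can keep manual effort low is to route only uncertain automatic labels to an annotator. The sketch below assumes that each automatic label carries a `confidence` value and that a fixed threshold decides what gets reviewed. Both are assumptions for illustration, because the summary does not say how instances are selected for human verification.

```python
from typing import Dict, List, Tuple


def split_for_review(
    items: List[Dict], threshold: float = 0.8  # hypothetical threshold
) -> Tuple[List[Dict], List[Dict]]:
    """Split automatic labels into auto-accepted items and a human review queue."""
    accepted: List[Dict] = []
    needs_review: List[Dict] = []
    for item in items:
        if item["confidence"] >= threshold:
            accepted.append(item)
        else:
            needs_review.append(item)
    # Show the least certain labels to the annotator first.
    needs_review.sort(key=lambda it: it["confidence"])
    return accepted, needs_review


labels = [
    {"id": 1, "door_type": "single", "confidence": 0.95},
    {"id": 2, "door_type": "sliding", "confidence": 0.55},
    {"id": 3, "door_type": "double", "confidence": 0.70},
]
accepted, queue = split_for_review(labels)
```

With these illustrative inputs, item 1 is accepted automatically and items 2 and 3 are queued for review, least confident first.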
