🤖 AI Summary
Existing cross-domain transfer methods for document layout analysis (DLA) require access to source-domain data and target-domain labels, which makes them unsuitable for privacy-sensitive domains such as finance and healthcare. To address this, the paper proposes DLAdapter, the first fully source-free unsupervised domain adaptation framework for DLA. Methodologically, it introduces a feature disentanglement mechanism that jointly incorporates geometric priors and semantic context, coupled with pseudo-label refinement and layout-structure consistency constraints, to align content and structure together. The paper also establishes the first Source-Free DLA (SFDLA) benchmark, covering three major DLA datasets, and states that the benchmark, models, and code will be released publicly. On the PubLayNet→DocLayNet transfer task, DLAdapter improves on the source-only model by 4.21% and outperforms prior source-free methods by 2.26%. The motivation is the sharp performance drop, averaging −32.64%, that the authors observe when source-fine-tuned models are applied directly to target domains; DLAdapter is designed to mitigate this collapse with no source data and no target labels.
📝 Abstract
Document Layout Analysis (DLA) is a fundamental task in document understanding. However, existing DLA and adaptation methods often require access to large-scale source data and target labels. These requirements severely limit their real-world applicability, particularly in privacy-sensitive and resource-constrained domains such as financial statements, medical records, and proprietary business documents. In our observations, directly applying models fine-tuned on a source domain to target domains often results in a significant performance drop (avg. −32.64%). In this work, we introduce Source-Free Document Layout Analysis (SFDLA), which aims to adapt a pre-trained source DLA model to an unlabeled target domain without access to any source data. To address this challenge, we establish the first SFDLA benchmark, covering three major DLA datasets for geometric- and content-aware adaptation. Furthermore, we propose Document Layout Analysis Adapter (DLAdapter), a novel framework designed to improve source-free adaptation across document domains. From PubLayNet to DocLayNet, our method achieves a +4.21% improvement over the source-only baseline and a +2.26% gain over existing source-free methods. We believe this work will inspire the DLA community to further investigate source-free document understanding. To support future research, the benchmark, models, and code will be publicly available at https://github.com/s3setewe/sfdla-DLAdapter.
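In the source-free setting described above, adaptation can rely only on the pre-trained model's own predictions on unlabeled target pages. One generic ingredient of such self-training is to keep only high-confidence predictions as pseudo-labels. The sketch below illustrates that idea in plain Python; it is not DLAdapter's actual procedure, and the `Detection` type, the `filter_pseudo_labels` helper, the example boxes, and the 0.8 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One predicted layout region on a target-domain page (hypothetical type)."""
    box: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels
    label: str                              # e.g. "title", "text", "table"
    score: float                            # model confidence in [0, 1]


def filter_pseudo_labels(detections: List[Detection],
                         threshold: float = 0.8) -> List[Detection]:
    """Keep only predictions confident enough to serve as pseudo-labels.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [d for d in detections if d.score >= threshold]


# Made-up predictions of a source-trained model on one unlabeled target page.
predictions = [
    Detection((50, 40, 550, 90), "title", 0.95),
    Detection((50, 120, 550, 400), "text", 0.62),
    Detection((60, 420, 540, 700), "table", 0.88),
]

pseudo_labels = filter_pseudo_labels(predictions, threshold=0.8)
print([d.label for d in pseudo_labels])  # prints ['title', 'table']
```

The low-confidence "text" region is dropped, so only the two confident regions would be used as training targets in a subsequent adaptation step.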