🤖 AI Summary
AI-generated text detection suffers from poor generalization in cross-domain settings, struggling to adapt to unseen generative models and textual domains. Method: This paper proposes a dynamic weighted ensemble framework comprising (i) a parallel multi-expert detector architecture, (ii) a soft-weight ensemble guided by a domain classifier, (iii) contrastive-learning-based domain representation alignment, and (iv) a lightweight, differentiable domain gating module, enabling zero-shot domain adaptation without target-domain labels for the first time. Contribution/Results: Compared to static ensemble methods, the approach significantly improves cross-domain robustness and parameter efficiency: it achieves state-of-the-art in-domain detection performance across multiple benchmarks and, on cross-domain evaluation, outperforms baselines with twice its parameter count. The code and models are publicly available.
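The core soft-weighting idea can be illustrated with a minimal sketch: a domain classifier's softmax probabilities weight the per-domain expert detectors' scores. This is a simplified illustration, not the authors' released code; the toy expert and classifier functions below are hypothetical stand-ins for trained models.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def dogen_score(features, experts, domain_classifier):
    """Soft-weighted ensemble: weight each domain expert's
    machine-text probability by the domain classifier's softmax
    probability for that expert's domain."""
    weights = softmax(domain_classifier(features))            # shape: (n_domains,)
    scores = np.array([expert(features) for expert in experts])  # shape: (n_domains,)
    return float(weights @ scores)

# Hypothetical stand-ins: three domain experts and a classifier
# that is confident the input belongs to domain 0.
experts = [lambda f: 0.9, lambda f: 0.2, lambda f: 0.5]
domain_classifier = lambda f: np.array([2.0, 0.1, 0.1])

score = dogen_score(None, experts, domain_classifier)
```

Because the classifier assigns most probability mass to domain 0, the ensemble score is pulled toward that expert's output (0.9) while still blending in the others, which is what allows soft adaptation to inputs that straddle domains.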
📄 Abstract
As state-of-the-art language models continue to improve, the need for robust detection of machine-generated text becomes increasingly critical. However, current state-of-the-art machine-text detectors struggle to adapt to new, unseen domains and generative models. In this paper, we present DoGEN (Domain Gating Ensemble Networks), a technique that allows detectors to adapt to unseen domains by ensembling a set of domain-expert detector models using weights from a domain classifier. We test DoGEN on a wide variety of domains from leading benchmarks and find that it achieves state-of-the-art performance on in-domain detection while outperforming models twice its size on out-of-domain detection. We release our code and trained models to assist future research in domain-adaptive AI detection.