🤖 AI Summary
This work addresses lifelong text provenance as new large language models keep emerging: an attribution model must continue to identify previously seen generators while incorporating new ones. The authors propose RidgeFT, a framework that trains a task-aware encoder on the initial generator set, freezes it, and stores class-level sufficient statistics so that the classifier can be updated without exemplar replay via closed-form ridge regression. Covariance calibration suppresses generator-irrelevant variation, and fixed random features increase representation capacity. In multi-topic experiments, RidgeFT achieves the best macro-F1 among the compared baselines across domains, backbone architectures, and incremental protocols, improving both retention of old generators and adaptation to new ones.
📝 Abstract
Machine-generated text (MGT) attribution aims to identify the specific generator responsible for a given text, thereby providing fine-grained evidence for model accountability and misuse investigation. As new large language models continue to emerge, attribution models must continuously incorporate new generators while preserving their ability to recognize previously seen ones. Prior work has shown that this lifelong MGT attribution setting is challenging, and existing methods often struggle to achieve a stable balance between adapting to new classes and retaining old ones. To address this issue, we propose RidgeFT, a lightweight analytic update framework that does not rely on exemplar replay. RidgeFT trains a task-aware encoder on the initial generator set, stores compact class-wise sufficient statistics when each generator class is first observed, and freezes the encoder for replay-free closed-form updates. It further suppresses generator-irrelevant variation through covariance calibration, improves representation capacity with fixed random features, and updates new classes through closed-form ridge regression based on class-level sufficient statistics. Across multi-topic evaluations with varying initial generator setups, RidgeFT consistently outperforms baselines. It achieves the best macro-F1 across domains, backbones, and incremental protocols, while also improving both old-class retention and new-class adaptation. These results suggest that feature-stable analytic updates provide a simple yet effective approach to lifelong MGT attribution.
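The replay-free update described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The class name `IncrementalRidge`, the ReLU random projection, the one-hot regression targets, and the parameters `rf_dim` and `lam` are all assumptions made for illustration, and the covariance calibration step is omitted because the abstract does not specify it. Inputs are assumed to be features already produced by a frozen encoder. For each generator class, only the feature sum and the second-moment matrix are stored, which is enough to recompute the ridge solution exactly without revisiting old texts.

```python
import numpy as np


class IncrementalRidge:
    """Hypothetical replay-free ridge classifier over frozen encoder features."""

    def __init__(self, in_dim, rf_dim=256, lam=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection: drawn once, never trained (assumed form).
        self.R = rng.standard_normal((in_dim, rf_dim)) / np.sqrt(in_dim)
        self.lam = lam
        self.stats = {}   # label -> (feature sum, second-moment matrix)
        self.labels = []  # column order of the weight matrix
        self.W = None

    def expand(self, feats):
        # ReLU random features on top of the frozen encoder output.
        return np.maximum(np.asarray(feats, dtype=float) @ self.R, 0.0)

    def add_class(self, label, feats):
        # Store class-level sufficient statistics, then re-solve in closed form.
        phi = self.expand(feats)
        self.stats[label] = (phi.sum(axis=0), phi.T @ phi)
        self._solve()

    def _solve(self):
        self.labels = list(self.stats)
        # Gram matrix over all classes seen so far: sum of per-class moments.
        gram = sum(g for _, g in self.stats.values())
        # With one-hot targets, column c of Phi^T Y is the feature sum of class c.
        cross = np.stack([self.stats[c][0] for c in self.labels], axis=1)
        reg = self.lam * np.eye(gram.shape[0])
        # Ridge solution W = (Phi^T Phi + lam * I)^{-1} Phi^T Y.
        self.W = np.linalg.solve(gram + reg, cross)

    def predict(self, feats):
        scores = self.expand(feats) @ self.W
        return [self.labels[i] for i in scores.argmax(axis=1)]
```

Calling `add_class` once per newly observed generator yields the same weights as fitting ridge regression on all data at once, since the Gram matrix and the cross term are both sums over classes. Because the raw features are never stored, this equivalence holds only while the encoder and the random projection stay fixed.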