🤖 AI Summary
Problem: Gradient Boosting Decision Trees (GBDTs) lack robust watermarking techniques for intellectual property protection. Method: This paper proposes the first in-situ robust watermarking framework designed specifically for GBDTs. It embeds watermarks through tree-structure-aware node-split path encoding and imperceptible weight perturbations, without altering the model architecture, and introduces four embedding strategies that balance fidelity and robustness. Contribution/Results: The method keeps accuracy degradation below 1% while improving resilience against post-deployment fine-tuning and other adversarial attacks. Experiments on multiple benchmark datasets show watermark extraction rates above 98% and strong robustness under diverse perturbations, addressing a gap in GBDT copyright protection and advancing practical IP protection for non-neural ML systems.
📝 Abstract
Gradient Boosting Decision Trees (GBDTs) are widely used in industry and academia for their high accuracy and efficiency, particularly on structured data. However, watermarking GBDT models remains underexplored compared to neural networks. In this work, we present the first robust watermarking framework tailored to GBDT models, utilizing in-place fine-tuning to embed imperceptible and resilient watermarks. We propose four embedding strategies, each designed to minimize impact on model accuracy while ensuring watermark robustness. Through experiments across diverse datasets, we demonstrate that our methods achieve high watermark embedding rates, low accuracy degradation, and strong resistance to post-deployment fine-tuning.
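The two quantities the abstract reports, watermark embedding rate and accuracy degradation, can be illustrated with a minimal sketch. It assumes the watermark is checked on a set of designated samples, each with an intended label, and that the embedding rate is the fraction of those samples on which the model outputs the intended label. The abstract does not give the paper's exact definitions, so the function names and these definitions are hypothetical, not the authors' method.

```python
from typing import Sequence


def embedding_rate(predictions: Sequence[int], watermark_labels: Sequence[int]) -> float:
    """Fraction of watermark samples on which the model outputs the intended label.

    Assumed definition for illustration; the paper may measure this differently.
    """
    if len(predictions) != len(watermark_labels):
        raise ValueError("predictions and watermark_labels must have the same length")
    if len(predictions) == 0:
        raise ValueError("need at least one watermark sample")
    matches = sum(1 for p, w in zip(predictions, watermark_labels) if p == w)
    return matches / len(predictions)


def accuracy_degradation(clean_accuracy: float, watermarked_accuracy: float) -> float:
    """Test accuracy lost by embedding the watermark (positive means a loss)."""
    return clean_accuracy - watermarked_accuracy


# Hypothetical numbers: 3 of 4 watermark samples carry the intended label.
rate = embedding_rate([1, 0, 1, 1], [1, 0, 1, 0])  # 0.75
drop = accuracy_degradation(0.920, 0.915)          # about 0.005, i.e. 0.5 points
```

Under these assumed definitions, a robustness check after fine-tuning would recompute `embedding_rate` on the fine-tuned model's predictions for the same watermark samples and compare it with the value measured before fine-tuning.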