🤖 AI Summary
This work addresses the challenge of Byzantine resilience in decentralized optimization, where malicious nodes can send arbitrary messages that compromise algorithmic convergence. The authors propose GT-PD, the first method to achieve Byzantine robustness in a fully decentralized setting while preserving the doubly stochastic mixing structure. GT-PD combines self-centered projection clipping with a probabilistic edge-dropping mechanism driven by dual-metric trust scores; its variant GT-PD-L adds a leaky integrator to suppress the accumulation of tracking errors under partial isolation. Experiments on MNIST demonstrate that the proposed methods consistently outperform existing approaches under various attacks; notably, GT-PD-L improves accuracy by up to 4.3 percentage points over coordinate-wise trimmed mean, and GT-PD achieves linear convergence when Byzantine nodes are completely isolated.
📝 Abstract
We study distributed optimization over networks with Byzantine agents that may send arbitrary adversarial messages. We propose \emph{Gradient Tracking with Probabilistic Edge Dropout} (GT-PD), a stochastic gradient tracking method that preserves the convergence properties of gradient tracking under adversarial communication. GT-PD combines two complementary defense layers: a universal self-centered projection that clips each incoming message to a ball of radius $\tau$ around the receiving agent, and a fully decentralized probabilistic dropout rule driven by a dual-metric trust score in the decision and tracking channels. This design bounds adversarial perturbations while preserving the doubly stochastic mixing structure, a property often lost under robust aggregation in decentralized settings. Under complete Byzantine isolation ($p_b=0$), GT-PD converges linearly to a neighborhood determined solely by stochastic gradient variance. For partial isolation ($p_b>0$), we introduce \emph{Gradient Tracking with Probabilistic Edge Dropout and Leaky Integration} (GT-PD-L), which uses a leaky integrator to control the accumulation of tracking errors caused by persistent perturbations and achieves linear convergence to a bounded neighborhood determined by the stochastic variance and the clipping-to-leak ratio. We further show that under two-tier dropout with $p_h=1$, isolating Byzantine agents introduces no additional variance into the honest consensus dynamics. Experiments on MNIST under Sign Flip, ALIE, and Inner Product Manipulation attacks show that GT-PD-L outperforms coordinate-wise trimmed mean by up to 4.3 percentage points under stealth attacks.
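The first defense layer, self-centered projection clipping, admits a compact sketch. The abstract only specifies that each incoming message is projected onto a ball of radius $\tau$ centered at the receiving agent's own iterate; the function name and NumPy-based formulation below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def self_centered_clip(x_own: np.ndarray, x_in: np.ndarray, tau: float) -> np.ndarray:
    """Project an incoming message x_in onto the Euclidean ball of radius tau
    centered at the receiving agent's own state x_own.

    Honest messages close to the receiver pass through unchanged; an adversarial
    message can perturb the receiver by at most tau in norm.
    (Illustrative sketch; not the authors' reference implementation.)
    """
    diff = x_in - x_own
    norm = np.linalg.norm(diff)
    if norm <= tau:
        return x_in  # already inside the ball: no change
    # Radially project onto the ball's boundary at distance tau from x_own.
    return x_own + (tau / norm) * diff
```

Because the clip is applied by each receiver to raw messages before mixing, the mixing weights themselves are untouched, which is how the doubly stochastic structure is preserved.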