Abstract
Federated learning (FL) enables collaborative model training without exposing local data, offering privacy benefits. However, its distributed nature makes it vulnerable to backdoor attacks, in which adversaries manipulate training data or model updates to trigger attacker-chosen outputs. Existing defenses often fail under high proportions of malicious clients and struggle to balance robustness against model utility. This article proposes FedRAB, a collaborative FL defense framework that uses dynamic smoothing to mitigate backdoor threats. FedRAB remains effective even when over 50% of clients are malicious. In this framework, clients are categorized into three types: fully trusted clients, malicious-but-trusted clients, and malicious-and-untrusted clients. Clients of the first two types inject controlled perturbation noise into their local datasets, suppressing poisoning attacks while preserving accuracy. To address diverse and severe backdoor behaviors, the server applies dimensionality reduction followed by clustering to identify and filter out the most harmful malicious updates, improving both the accuracy and efficiency of malicious-update detection. The server then clips and perturbs the remaining model updates, further strengthening the defense against backdoors while preserving data diversity and generalization. We evaluate the effectiveness of FedRAB on various datasets. For example, on the MNIST dataset with 65% malicious clients, FedRAB reduces the backdoor accuracy from 94.6% to 1.5% while decreasing the model's accuracy on benign samples by only 1.2%.
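The server-side pipeline described above (dimensionality reduction, clustering to filter suspicious updates, then clipping and perturbing the survivors) can be sketched roughly as follows. This is an illustrative simplification, not the paper's actual algorithm: the function name, the use of PCA and k-means, all parameter values, and the assumption that the larger cluster is benign are our own choices for exposition (the paper's mechanism, unlike this sketch, is designed to handle majority-malicious settings).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def aggregate_with_defense(updates, clip_norm=1.0, noise_std=0.01, seed=0):
    """Illustrative robust aggregation: cluster in a low-dimensional
    space to drop outlier updates, then clip and perturb the rest."""
    rng = np.random.default_rng(seed)
    X = np.stack(updates)  # shape: (n_clients, n_params)

    # Reduce dimensionality before clustering (cheaper and more stable
    # than clustering raw high-dimensional updates).
    n_comp = min(2, X.shape[0], X.shape[1])
    z = PCA(n_components=n_comp).fit_transform(X)

    # Split updates into two clusters; this sketch keeps the larger one
    # and discards the smaller as presumed-malicious outliers.
    labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(z)
    keep = labels == np.argmax(np.bincount(labels))
    kept = X[keep]

    # Clip each surviving update to a bounded L2 norm, limiting the
    # influence any single client can exert on the aggregate.
    norms = np.linalg.norm(kept, axis=1, keepdims=True)
    kept = kept * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Add small Gaussian perturbation (the smoothing step), then average.
    kept = kept + rng.normal(0.0, noise_std, kept.shape)
    return kept.mean(axis=0)
```

Clipping before noising mirrors standard differentially-private aggregation practice: bounding each update's norm makes a fixed noise scale sufficient to mask individual contributions.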
