arXiv:2211.05631

Backdoor Defense via Suppressing Model Shortcuts

Published on Nov 2, 2022

AI-generated summary

Suppressing key skip connections in deep neural networks sharply reduces backdoor attack success rates, and fine-tuning the suppressed layers recovers benign accuracy.

Abstract

Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to backdoor attacks during the training process. Specifically, adversaries embed hidden backdoors in DNNs so that malicious model predictions can be activated through pre-defined trigger patterns. In this paper, we explore the backdoor mechanism from the angle of the model structure. We focus on skip connections, inspired by the understanding that they help the model learn 'shortcuts', through which backdoor triggers are usually easier to learn. Specifically, we demonstrate that the attack success rate (ASR) decreases significantly when the outputs of some key skip connections are reduced. Based on this observation, we design a simple yet effective backdoor removal method that suppresses the skip connections in critical layers selected by our method. We also fine-tune these layers to recover high benign accuracy and to further reduce the ASR. Extensive experiments on benchmark datasets verify the effectiveness of our method.
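The core operation is easy to picture: a residual block computes conv_path(x) + skip(x), and suppression scales the skip term by a factor gamma < 1. Below is a minimal PyTorch sketch of that idea, assuming a torchvision ResNet-18. The factor GAMMA, the choice of layer3[0] as the "critical" block, and the fine-tuning setup are illustrative assumptions, not the authors' released code; in particular, the paper selects critical layers with its own criterion, which is not reproduced here.

```python
import torch
import torchvision
import torch.nn as nn

# Hypothetical suppression factor: gamma = 1 leaves the skip connection
# untouched; gamma < 1 reduces its output, which the paper observes
# lowers the attack success rate of backdoored models.
GAMMA = 0.5

class SuppressedBasicBlock(nn.Module):
    """Wraps a torchvision BasicBlock and scales its skip branch by gamma."""

    def __init__(self, block: torchvision.models.resnet.BasicBlock,
                 gamma: float = GAMMA):
        super().__init__()
        self.block = block
        self.gamma = gamma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        identity = x
        out = self.block.conv1(x)
        out = self.block.bn1(out)
        out = self.block.relu(out)
        out = self.block.conv2(out)
        out = self.block.bn2(out)
        if self.block.downsample is not None:
            identity = self.block.downsample(x)
        # The only change from a standard BasicBlock: the skip connection's
        # contribution is multiplied by gamma before the residual addition.
        out = out + self.gamma * identity
        return self.block.relu(out)

# A (possibly backdoored) ResNet-18; in practice, trained weights would
# be loaded here.
model = torchvision.models.resnet18(num_classes=10)

# Suppress one hypothetically "critical" residual block.
model.layer3[0] = SuppressedBasicBlock(model.layer3[0], gamma=GAMMA)

# Fine-tune only the suppressed block on clean data, keeping the rest
# of the network frozen.
for p in model.parameters():
    p.requires_grad = False
for p in model.layer3[0].parameters():
    p.requires_grad = True

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3, momentum=0.9
)
```

After suppression, only the modified block is trained on clean data, which matches the abstract's recipe of fine-tuning the selected layers to recover benign accuracy and further reduce the ASR.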
