Structured Adversarial Self-Supervised Learning for Robust Object Detection in Remote Sensing Images

Cong Zhang; Kin‐Man Lam; Tianshan Liu; Yui‐Lam Chan; Qi Wang

doi:10.1109/tgrs.2024.3375398

JOURNAL ARTICLE

Year: 2024
Journal: IEEE Transactions on Geoscience and Remote Sensing
Vol: 62
Pages: 1-20
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Object detection plays a crucial role in scene understanding and has extensive practical applications. In remote sensing object detection, both detection accuracy and robustness are of significant concern. Existing methods rely heavily on sophisticated adversarial training strategies that tend to improve robustness at the expense of accuracy. However, improved detection robustness does not necessarily indicate improved accuracy. Therefore, in this paper, we investigate how to enhance robustness while preserving high accuracy, or even improve both simultaneously, using simple vanilla adversarial training or none at all. In pursuit of a solution, we first conduct an exploratory investigation by shifting our attention from adversarial training, referred to as adversarial fine-tuning, to adversarial pretraining. Specifically, we propose a novel pretraining paradigm, namely structured adversarial self-supervised (SASS) pretraining, to strengthen both clean accuracy and adversarial robustness for object detection in remote sensing images. At a high level, SASS pretraining aims to unify adversarial learning and self-supervised learning in pretraining and to encode structured knowledge into pretrained representations for powerful transferability to downstream detection. Moreover, to fully explore the inherent robustness of vision Transformers and improve their pretraining efficiency, we adopt the recent masked image modeling (MIM) as the pretext task and instantiate SASS pretraining as a concise end-to-end framework, named structured adversarial MIM (SA-MIM). SA-MIM consists of two pivotal components: structured adversarial attack and structured MIM (S-MIM). The former establishes structured adversaries for the context of adversarial pretraining, while the latter introduces a structured local-sampling global-masking strategy to adapt to hierarchical encoder architectures. Comprehensive experiments on three different datasets demonstrate the significant superiority of the proposed pretraining paradigm over previous counterparts for remote sensing object detection. More importantly, with or without adversarial fine-tuning, it enables simultaneous improvements in detection accuracy and robustness as expected, promisingly alleviating the dependence on complicated adversarial fine-tuning.
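
The abstract names a masking strategy for hierarchical encoders but gives no implementation details. The sketch below is not the authors' S-MIM. It only illustrates, under stated assumptions, one simple way a MIM mask can be sampled window by window and written into a full-image patch mask, so that every local window keeps the same number of visible patches. The function name `windowed_mask` and all of its parameters are hypothetical.

```python
import random


def windowed_mask(grid_h, grid_w, window, mask_ratio, seed=None):
    """Return a grid_h x grid_w mask as nested lists (True = masked patch).

    Hypothetical helper for illustration only: it masks the same number
    of patches inside every window x window block of the patch grid.
    """
    if grid_h % window or grid_w % window:
        raise ValueError("grid size must be divisible by the window size")
    rng = random.Random(seed)
    per_window = window * window
    n_masked = round(per_window * mask_ratio)
    mask = [[False] * grid_w for _ in range(grid_h)]
    for top in range(0, grid_h, window):
        for left in range(0, grid_w, window):
            # Sample masked positions inside this window only.
            for idx in rng.sample(range(per_window), n_masked):
                r, c = divmod(idx, window)
                # Write the choice into the full-image mask.
                mask[top + r][left + c] = True
    return mask


if __name__ == "__main__":
    demo = windowed_mask(grid_h=8, grid_w=8, window=4, mask_ratio=0.5, seed=0)
    for row in demo:
        print("".join("#" if m else "." for m in row))
```

Sampling per window rather than over the whole grid is an assumption made here because window-based encoders process fixed-size groups of patches; the paper itself should be consulted for the actual strategy.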

Keywords:

Computer science; Object detection; Artificial intelligence; Adversarial system; Remote sensing; Computer vision; Object (grammar); Pattern recognition (psychology); Machine learning; Geology

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 28.29

Refs: 115

Citation Normalized Percentile: 0.99

Is in top 1%

Is in top 10%

Topics

Remote-Sensing Image Classification

Physical Sciences → Engineering → Media Technology

Infrared Target Detection Methodologies

Physical Sciences → Engineering → Aerospace Engineering

Advanced Image Fusion Techniques

Physical Sciences → Engineering → Media Technology

Related Documents

SAENet: Self-Supervised Adversarial and Equivariant Network for Weakly Supervised Object Detection in Remote Sensing Images

Robust Remote Sensing Scene Classification by Adversarial Self-Supervised Learning

Semi-Supervised Object Detection in Remote Sensing Images Using Generative Adversarial Networks

More Accurate Constraints for Self-Supervised Learning in Remote Sensing Images-Based Object Detection

Object Detection for Optical Remote Sensing Images with Self-supervised Feature Representation