Growing demand for maritime search and rescue has raised the performance requirements for object detection in high-definition images captured by drones. The task is challenging because the targets are very small. Most detectors rely on a feature pyramid network (FPN), which enriches shallow features with deep contextual information. However, small targets are assigned to the low levels of the FPN, where semantic information is scarce and texture features are weak. To address this problem, this paper proposes a texture-enhanced feature fusion network (TEFF). The model consists of three main parts: a channel attention module (CAM), texture enhancement (TE), and feature fusion (FF). CAM serves two purposes: it generates feature heat maps that are used to enhance features, and it generates adaptive parameters for hierarchical fusion. TE and FF perform feature enhancement and adaptive fusion between levels, respectively. Specifically, the whole feature level is enhanced and feature fusion is carried out through a self-attention mechanism. The network achieves good results with multiple backbones.
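The two roles attributed to CAM above (reweighting features and supplying adaptive fusion parameters) can be illustrated with a minimal NumPy sketch. This is not the TEFF implementation: it substitutes a generic squeeze-and-excitation style channel attention for the paper's module, and the function names (`channel_attention`, `fuse_levels`), the reduction ratio `r`, the random weights, and the choice of a single scalar fusion weight `alpha` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style attention (an assumed stand-in for CAM).

    feat: feature map of shape (C, H, W).
    Returns per-channel weights in (0, 1), shape (C,).
    """
    squeezed = feat.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # bottleneck with ReLU -> (C // r,)
    return sigmoid(w2 @ hidden)              # channel weights -> (C,)

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse_levels(shallow, deep, w1, w2):
    """Reweight the shallow level, then blend it with the upsampled deep level."""
    attn = channel_attention(shallow, w1, w2)
    enhanced = shallow * attn[:, None, None]  # channel-wise feature enhancement
    alpha = attn.mean()                       # scalar fusion weight (assumption)
    return alpha * enhanced + (1.0 - alpha) * upsample2x(deep)

# Toy inputs: two adjacent pyramid levels with hypothetical sizes.
rng = np.random.default_rng(0)
C, r = 8, 2
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
shallow = rng.standard_normal((C, 16, 16))
deep = rng.standard_normal((C, 8, 8))

attn = channel_attention(shallow, w1, w2)
fused = fuse_levels(shallow, deep, w1, w2)
print(fused.shape)
```

In a trained network the weights `w1` and `w2` would be learned, and the fusion parameters could be per-channel or per-level rather than one scalar; the sketch only shows how one attention output can drive both enhancement and fusion.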
Hao Liu, Chengming Luo, Zhiqiang Huang, Zicheng Dou
Xin Wang, Zhang Hong-yan, Qianhe Liu, Wei Gong
Haotian Li, Kezheng Lin, Jing‐Xuan Bai, Ao Li, Jiali Yu
Qi Zhang, Hongying Zhang, Xiuwen Lu
Yanshan Li, Fuxing Liu, Yusong Qin, Linhui Dai, Li Zhang, Weixin Xie