Semi-Supervised Crowd Counting via Swin Transformer with Adaptive Soft Threshold and Contrastive Learning

Mingwei Yao; Kehua Guo; Lingyan Zhang; Xuyang Tan; Xiaokang Zhou

doi:10.1145/3744747

JOURNAL ARTICLE

Year: 2025; Journal: ACM Transactions on Intelligent Systems and Technology; Publisher: Association for Computing Machinery

Abstract

Manual annotation for crowd counting remains labor-intensive and costly. Although existing semi-supervised methods partially alleviate this burden, they still face significant challenges regarding the quality of generated pseudo-labels and the utilization of unlabeled data. To address these issues, we propose a novel semi-supervised crowd counting framework, called Point-Adaptive Teacher (PAT). This framework integrates Adaptive Soft Threshold (AST) and contrastive learning to enhance pseudo-label quality and effectively leverage unlabeled data. Specifically, we employ the Swin Transformer as the backbone and develop Swin-P2PNet, which captures global contextual information through hierarchical window attention, improving the accuracy of pseudo-labels. Additionally, we design the AST that dynamically adjusts the sample loss weight by combining confidence and uncertainty predictions, thereby alleviating the effect of noise in pseudo-labels. Finally, we introduce a contrastive learning strategy requiring no extra parameters. This strategy enhances the model's ability to learn latent representations from unlabeled data. Extensive experiments have been conducted on three public datasets, namely ShanghaiTech, JHU-Crowd++, and UCF-QNRF. The results demonstrate that our method achieves performance comparable to state-of-the-art methods.
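
The abstract does not give the AST formula, so the following is only a minimal sketch of how a soft threshold could down-weight noisy pseudo-labels instead of discarding them. Everything in it is an assumption made for illustration: the score `confidence * (1 - uncertainty)`, the truncated-Gaussian weighting, the running mean and variance that make the threshold adaptive, and all function names. It is not the authors' implementation.

```python
import math


def soft_weights(confidences, uncertainties, mean, var):
    """Map each pseudo-label to a loss weight in (0, 1].

    Assumed scoring rule (hypothetical): score = confidence * (1 - uncertainty).
    Scores at or above the running mean get full weight; lower scores decay
    smoothly along a Gaussian instead of being cut off by a hard threshold.
    """
    weights = []
    for c, u in zip(confidences, uncertainties):
        score = c * (1.0 - u)
        if score >= mean:
            weights.append(1.0)
        else:
            # Guard against a zero variance early in training.
            weights.append(math.exp(-((score - mean) ** 2) / (2.0 * max(var, 1e-8))))
    return weights


def update_stats(scores, mean, var, momentum=0.9):
    """Exponential moving average of the batch score mean and variance.

    This is what makes the threshold adaptive in this sketch: as the model
    grows more confident, the mean rises and low scores are penalized more.
    """
    batch_mean = sum(scores) / len(scores)
    batch_var = sum((s - batch_mean) ** 2 for s in scores) / len(scores)
    new_mean = momentum * mean + (1.0 - momentum) * batch_mean
    new_var = momentum * var + (1.0 - momentum) * batch_var
    return new_mean, new_var


def weighted_pseudo_loss(losses, weights):
    """Average per-sample unsupervised losses after applying the soft weights."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

In a training loop, such weights would multiply the per-sample loss on unlabeled images, so that low-confidence or high-uncertainty pseudo-labels still contribute, but less than reliable ones.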

Keywords: Computer science; Transformer; Artificial intelligence; Machine learning; Natural language processing; Electrical engineering

Topics

Video Surveillance and Tracking Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Anomaly Detection Techniques and Applications

Physical Sciences → Computer Science → Artificial Intelligence

Fire Detection and Safety Systems

Physical Sciences → Engineering → Safety, Risk, Reliability and Quality

Related Documents

Weakly supervised crowd counting based on Swin Transformer

CCST: crowd counting with swin transformer

Multi-level Convolutional Transformer with Adaptive Ranking for Semi-supervised Crowd Counting

Semi-Supervised Crowd Counting via Multiple Representation Learning

Boosting Semi-supervised Crowd Counting with Scale-based Active Learning