DISSERTATION

Cross Target Generalization for Stance Detection

Conforti, Costanza

Year: 2022
University: Apollo (University of Cambridge)
Publisher: University of Cambridge

Abstract

Stance detection is a popular NLP task that consists in automatically inferring the opinion expressed in a text with respect to a given target. Cross-target generalization is a known problem in stance detection: systems tend to perform poorly when exposed to targets unseen during training. Given that data annotation is expensive and time-consuming, finding ways to leverage other sources of knowledge to improve cross-target stance detection can offer great benefits. In this thesis, I propose to improve the robustness of cross-target stance detection in three settings. First, I explore weak supervision through synthetically annotated samples as a means to provide knowledge about unseen targets to a stance detection system. To this end, I design a simple and inexpensive framework and show experimentally that integrating synthetic data is helpful for cross-target generalization. Secondly, I investigate cross-genre stance detection, where knowledge from annotated tweets is leveraged to improve news stance detection on targets unseen during training. Because tweets and news articles have distinct stylistic characteristics, transferring knowledge between samples belonging to different genres is non-trivial. To allow the model to capture the useful stance-specific features, I propose to treat the task adversarially. Thirdly, I study multi-modality as a means to enhance cross-target generalization. Specifically, I design a robust multi-task BERT-based architecture that combines textual input with high-frequency intra-day time series from stock market prices. I show experimentally and through detailed result analysis that the proposed system benefits from financial information and achieves state-of-the-art results. This demonstrates that the combination of multiple input signals is effective for cross-target stance detection, and opens interesting research directions for future work. In addition, I created the first multi-task, multi-genre and multi-modal resource for stance detection. It provides two aligned textual signals, composed of carefully selected and expert-annotated tweets and news articles; moreover, it contains an aligned financial signal in the form of fine-grained intra-day stock market price variations. This large and integrated resource provides a comprehensive framework for robust training and fair model evaluation of the above-mentioned algorithms. I released the entire resource for future research.

Keywords:
Leverage (statistics); Robustness (evolution); Generalization; Annotation; Task (project management); Domain knowledge; Labeled data

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0

Topics

Sentiment Analysis and Opinion Mining (Physical Sciences → Computer Science → Artificial Intelligence)
Data-Driven Disease Surveillance (Health Sciences → Medicine → Epidemiology)
Text and Document Classification Technologies (Physical Sciences → Computer Science → Artificial Intelligence)