M-Stance: A Multi-Target, Multilingual and Multi-Cultural Stance Detection Dataset towards EU Refugee Crisis

Dataset

M-STANCE is a multilingual, multi-target and multi-cultural stance detection (SD) dataset. It contains social media posts from 2014 and 2019 related to the migration crisis in the EU, in four languages: English, German, Italian and Polish. The dataset defines a list of broad targets, each of which is subdivided into fine-grained targets that capture specific aspects of the broad target. For instance, "refugees" is a fine-grained target of the broad target "migrants". The targets are annotated by an LLM, and the stances are annotated by humans.

We also conducted cross-cultural annotation between Polish and English-German in both directions. In cross-cultural annotation, posts in the source language are translated into the target language and annotated by speakers of the target language. The resulting cross-culturally annotated dataset can be found in the directory x-culture.
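
As a rough illustration of how the cross-cultural portion might be located after download, the following sketch lists the files that sit under a directory named x-culture inside the zip archive. Only the directory name comes from the description above; the internal layout of the archive, the file names, and the helper name list_members_under are assumptions made for illustration.

```python
import zipfile
from typing import List


def list_members_under(archive, directory: str = "x-culture") -> List[str]:
    """Return the file entries that have `directory` as one of their parent folders.

    `archive` may be a path or a file-like object. The archive layout is an
    assumption: the directory is matched at any nesting depth.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name
            for name in zf.namelist()
            # Skip pure directory entries and keep files below `directory`.
            if not name.endswith("/") and directory in name.split("/")[:-1]
        )
```

A call such as list_members_under("M-Stance.zip") would then return the matching paths, where the archive file name is hypothetical and should be replaced with the name of the downloaded file.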

Identifier
Source	https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4480
Metadata Access	https://tudatalib.ulb.tu-darmstadt.de/oai/openairedata?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:tudatalib.ulb.tu-darmstadt.de:tudatalib/4480
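
The Metadata Access link is an OAI-PMH GetRecord request. A minimal sketch of how that URL is composed from its parts is shown below; the base URL, verb, metadata prefix and identifier are taken from the link above, while the function name get_record_url is a hypothetical helper and not part of any repository API.

```python
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access link in this record.
BASE_URL = "https://tudatalib.ulb.tu-darmstadt.de/oai/openairedata"


def get_record_url(identifier: str, metadata_prefix: str = "oai_datacite") -> str:
    """Compose an OAI-PMH GetRecord URL for one record identifier."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the result matches the link as published.
    return f"{BASE_URL}?{urlencode(params, safe=':/')}"


url = get_record_url("oai:tudatalib.ulb.tu-darmstadt.de:tudatalib/4480")
```

The sketch only builds the request URL; fetching and parsing the returned XML is left out.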

Provenance
Creator	Zhifan, Sun; Andreas, Waldis; Yongxin, Huang; Anamaria, Segesten; Iryna, Gurevych
Publisher	TU Darmstadt
Contributor	TU Darmstadt
Publication Year	2025
Rights	Creative Commons Attribution 4.0; info:eu-repo/semantics/openAccess
OpenAccess	true
Contact	https://tudatalib.ulb.tu-darmstadt.de/page/contact

Representation
Language	English
Resource Type	Dataset
Format	application/zip
Version	v1.0.0
Discipline	Other