EmoTwi50 [research data]

Dataset

The dataset is a tab-separated values (TSV) file with five columns. The first two columns contain the codes of the two emojis in the evaluated pair, the third column their gold standard similarity, the fourth column their gold standard relatedness, and the fifth column the average of the similarity and relatedness values. Each row holds the gold standard evaluation results for one pair of emojis. To retrieve the vector embedding of an emoji from our models, add the token "eoji" before the emoji code.
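
The column layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset distribution: the field names (code_a, code_b, similarity, relatedness, average) and the helper names are hypothetical, the numeric columns are assumed to use a dot as the decimal separator, and the token "eoji" is assumed to be joined directly to the emoji code with no separator. Because the record does not say whether the file has a header row, rows whose numeric columns do not parse are skipped.

```python
import csv
import io
from typing import NamedTuple

# Prefix quoted in the dataset description; joining it to the code
# without a separator is an assumption.
EMBEDDING_PREFIX = "eoji"


class EmojiPair(NamedTuple):
    """One gold standard row (field names are hypothetical)."""
    code_a: str
    code_b: str
    similarity: float
    relatedness: float
    average: float


def parse_pairs(tsv_text: str) -> list[EmojiPair]:
    """Parse five-column TSV text into EmojiPair records."""
    pairs = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        try:
            sim, rel, avg = (float(value) for value in row[2:])
        except ValueError:
            continue  # skip a possible header row
        pairs.append(EmojiPair(row[0], row[1], sim, rel, avg))
    return pairs


def embedding_token(emoji_code: str) -> str:
    """Build the vocabulary key used to look up an emoji embedding."""
    return EMBEDDING_PREFIX + emoji_code
```

To use it, read the downloaded file as UTF-8 text and pass its contents to parse_pairs, then call embedding_token on either code of a pair to obtain the key for the embedding lookup.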

Identifier
DOI	https://doi.org/10.34810/data483
Metadata Access	https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data483

Provenance
Creator	Barbieri, Francesco; Ronzano, Francesco; Saggion, Horacio
Publisher	CORA.Repositori de Dades de Recerca
Publication Year	2023
Funding Reference	European Commission 611383; Ministerio de Economía y Competitividad MDM-2015-0502
Rights	Custom Dataset Terms; info:eu-repo/semantics/openAccess; https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data483
OpenAccess	true

Representation
Resource Type	Aggregate data; Dataset
Format	text/tab-separated-values; text/plain
Size	1576; 1214
Version	1.0
Discipline	Other