A Resource for Evaluating Graded Word Similarity in Context: CoSimLex

The dataset contains human similarity ratings for pairs of words in context. Annotators were shown contexts containing both words of a pair, and each pair appears in two different contexts. The words were sourced from the English, Croatian, Finnish and Slovenian versions of the original SimLex dataset.

Identifier
PID http://hdl.handle.net/11356/1308
Related Identifier https://arxiv.org/abs/1912.05320
Related Identifier https://www.aclweb.org/anthology/2020.lrec-1.720/
Related Identifier http://embeddia.eu/
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1308
Provenance
Creator Armendariz, Carlos; Purver, Matthew; Ulčar, Matej; Pollak, Senja; Ljubešić, Nikola; Robnik-Šikonja, Marko; Granroth-Wilding, Mark; Vaik, Kristiina
Publisher Queen Mary University of London
Publication Year 2020
Funding Reference info:eu-repo/grantAgreement/EC/H2020/825153
Rights GNU General Public Licence, version 3; https://opensource.org/licenses/GPL-3.0; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language English; Croatian; Finnish; Slovenian
Resource Type lexicalConceptualResource
Format text/csv; application/octet-stream; text/plain; charset=utf-8; downloadable_files_count: 5
Discipline Linguistics
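
The Metadata Access entry above is an OAI-PMH GetRecord request that returns this record as Dublin Core (oai_dc) XML. The following is a minimal sketch of how such a request could be assembled and its response read. The endpoint and identifier are copied from the record; the helper names build_get_record_url and extract_dc_fields are hypothetical and introduced only for illustration, and the exact set of Dublin Core elements the repository returns is an assumption.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint and identifier as they appear in the Metadata Access URL of the record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"
RECORD_ID = "oai:www.clarin.si:11356/1308"

# Standard XML namespace for Dublin Core elements.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_get_record_url(endpoint, identifier, metadata_prefix="oai_dc"):
    """Assemble an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",  # keep ':' and '/' unescaped, as in the record's URL
    )
    return f"{endpoint}?{query}"


def extract_dc_fields(xml_text):
    """Collect Dublin Core elements from a response into a dict of lists.

    Assumes the response embeds elements in the Dublin Core namespace;
    elements in other namespaces and empty elements are skipped.
    """
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(DC_NS) and element.text:
            name = element.tag[len(DC_NS):]
            fields.setdefault(name, []).append(element.text.strip())
    return fields
```

To read the live record, the URL from build_get_record_url(OAI_ENDPOINT, RECORD_ID) could be fetched with urllib.request.urlopen and the decoded body passed to extract_dc_fields. Values are kept as lists because fields such as creator and language repeat within a record.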