- DutchParliament dataset annotated with coreference links.
  A dataset of 74 documents containing records of parliamentary proceedings from the Dutch Tweede Kamer between 2015 and 2020. The data has been manually annotated with...
- HyperCoref Corpus Seed Pages
  Archive containing the seed URLs for recreating the "HyperCoref" corpus, an automatically extracted corpus of cross-document event coreference links in online news. Further...
- Training corpus SUK 1.0
  The SUK training corpus contains about 1 million tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, and lemmatisation, with...
- PyTorch model for Slovenian Coreference Resolution
  Slovenian model for coreference resolution: a neural network based on a customized transformer architecture, usable with the code published on...
- Slovene coreference resolution corpus coref149
  This corpus contains a subset of the ssj500k v1.4 corpus, http://hdl.handle.net/11356/1052. Each of the 149 documents contains a paragraph from ssj500k that contains at least 100...
- Slovene corpus for aspect-based sentiment analysis - SentiCoref 1.0
  The SentiCoref 1.0 corpus consists of 837 documents selected from the SentiNews 1.0 corpus (http://hdl.handle.net/11356/1110). The documents were selected based on the number of...
- DiscoMT 2015 Shared Task on Pronoun Translation
  The data set includes training, development and test data from the shared tasks on pronoun-focused machine translation and cross-lingual pronoun prediction from the EMNLP 2015...
- Evaluation of neural coreference annotation of simplified German
  This poster presents our evaluation of a neural coreference resolver (Schröder et al. 2021) on simplified German texts as well as the results of an annotation study that we...