EXCEPTIUS Corpus

EXCEPTIUS Corpus v1.0, containing the following data:
- raw documents for 21 countries at national level
- pre-processed data with spacy-udpipe v1.0
- automatically annotated documents for the identification of exceptional measures at sentence level

Country list (ISO 3166-1 alpha-2): AT, BE, HR, CY, CZ, DK, FR, DE, HU, IE, IT, LV, LT, NL, NO, PL, SI, SE, CH, UK

Folder structure: each country has a dedicated folder. Inside each folder you will find the following subfolders:
- raw_text: the raw text data (.txt format)
- processed: the output of spacy-udpipe v1.0. Each line is a sentence, containing the following info: tokens, lemma, POS, UD dependency relations
- model: the predictions of the trained model (XML pre@36 as reported in Table 4 of the paper). Each line is a sentence, separated by 9 tabs, one for each exceptional measure class. A value of 1 signals the presence of a class.
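The line format of the model subfolder can be read with a few lines of Python. The sketch below is illustrative only and rests on assumptions not stated in the description: it treats the last 9 tab-separated fields of a line as the class flags and anything before them as the sentence text. The function names and the example sentence are hypothetical, and classes are referred to by 0-based position because the description does not give the class names.

```python
from typing import List, Tuple

NUM_CLASSES = 9  # from the description: one field per exceptional measure class


def parse_prediction_line(line: str) -> Tuple[str, List[int]]:
    """Split one prediction line into (leading text, list of 9 binary flags).

    Assumption: the last NUM_CLASSES tab-separated fields are the class
    flags; anything before them (e.g. the sentence) is returned as text.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < NUM_CLASSES:
        raise ValueError(
            f"expected at least {NUM_CLASSES} fields, got {len(fields)}"
        )
    text = "\t".join(fields[:-NUM_CLASSES])
    # A field equal to "1" signals presence; everything else counts as absent.
    flags = [1 if field.strip() == "1" else 0 for field in fields[-NUM_CLASSES:]]
    return text, flags


def active_classes(flags: List[int]) -> List[int]:
    """Return the 0-based positions of the classes marked as present."""
    return [index for index, flag in enumerate(flags) if flag == 1]


# Hypothetical example line, not taken from the corpus.
example = "A hypothetical sentence.\t0\t1\t0\t0\t0\t0\t0\t0\t1"
text, flags = parse_prediction_line(example)
print(text, active_classes(flags))
```

If the actual files hold only the 9 flags per line, the same function returns an empty string as the text, so it may be worth inspecting a few lines before relying on either layout.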

The Italy and Norway folders do not include the model predictions.
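Because some folders lack predictions, it can help to check which subfolders are present before processing. The following is a minimal sketch, assuming the archive has been unpacked so that the country folders sit directly under one root directory and that the subfolders are named exactly raw_text, processed and model. The function name and the root path are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Subfolder names as listed in the folder structure description.
EXPECTED_SUBFOLDERS = ("raw_text", "processed", "model")


def missing_subfolders(corpus_root: Path) -> Dict[str, List[str]]:
    """Map each country folder name to the expected subfolders it lacks."""
    report: Dict[str, List[str]] = {}
    country_dirs = sorted(p for p in corpus_root.iterdir() if p.is_dir())
    for country_dir in country_dirs:
        missing = [
            name for name in EXPECTED_SUBFOLDERS
            if not (country_dir / name).is_dir()
        ]
        if missing:  # only report folders that lack something
            report[country_dir.name] = missing
    return report
```

Calling `missing_subfolders(Path("EXCEPTIUS"))` on a hypothetical unpacked root returns only the country folders that lack at least one subfolder. A folder with a model subfolder that exists but is empty would not be reported by this check.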

Identifier
DOI https://doi.org/10.34894/ZUWAPS
Metadata Access https://dataverse.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34894/ZUWAPS
Provenance
Creator Caselli, Tommaso; Egger, Clara; Tziafas, Georgios; De Saint-Phalle, Eugenie
Publisher DataverseNL
Contributor Caselli, Tommaso
Publication Year 2021
Funding Reference ZonMw, 10430032010026
Rights CC0 Waiver; info:eu-repo/semantics/openAccess; https://creativecommons.org/publicdomain/zero/1.0/
OpenAccess true
Contact Caselli, Tommaso (University of Groningen)
Representation
Resource Type legal texts; Dataset
Format application/vnd.openxmlformats-officedocument.wordprocessingml.document; application/zip; application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size 6820; 7681; 233842395; 9233
Version 1.0
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Humanities; Jurisprudence; Law; Life Sciences; Social Sciences; Social and Behavioural Sciences; Soil Sciences