Replication Data for: The many guises of productivity: a case-study of Spanish inchoative constructions

The dataset contains the quantitative data used as input for the Principal Components Analysis conducted in the article "The many guises of productivity: a case-study of Spanish inchoative constructions". The data originate from the Spanish Web Corpus (esTenTen18), accessed via Sketch Engine (Kilgarriff & Renau 2013). Only the European Spanish subcorpus was selected. After downloading, the samples were manually cleaned. A maximum of 500 tokens per auxiliary was retained in the dataset. The data were annotated for 'Subject', 'AUX', 'Filler', 'Person', 'Tense', 'LexicalTypeInf', 'SyntaxInf', 'Intercalation', 'Intentionality', and 'Abruptness', as well as other criteria that are not considered in this study. This analysis takes only two variables into account: the auxiliary, abbreviated as 'AUX', and the infinitive, abbreviated as 'INF'. See the data-specific sections below for more information about the variables.
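
As an illustration of how the two variables could be prepared for such an analysis, the following minimal Python sketch counts infinitives per auxiliary while keeping at most 500 tokens per auxiliary. It rests on several assumptions that are not confirmed by this record: that the CSV file has columns named 'AUX' and 'INF', that it is comma-delimited, and that the cap is applied by keeping the first tokens encountered. The function name and the toy rows are hypothetical and are not taken from the dataset or from the authors' R script.

```python
import csv
import io
from collections import Counter, defaultdict

MAX_TOKENS_PER_AUX = 500  # cap stated in the dataset description

# Toy rows for illustration only; they are NOT taken from the dataset.
TOY_CSV = """AUX,INF
ponerse,trabajar
ponerse,llorar
echarse,llorar
echarse,reír
romper,llorar
ponerse,trabajar
"""


def aux_by_inf_counts(csv_file, max_per_aux=MAX_TOKENS_PER_AUX):
    """Count infinitives per auxiliary, keeping at most max_per_aux tokens each.

    Assumption: the cap keeps the first tokens in file order; the record
    does not say how the retained tokens were chosen.
    """
    kept = Counter()               # tokens kept so far, per auxiliary
    table = defaultdict(Counter)   # table[aux][inf] -> frequency
    for row in csv.DictReader(csv_file):
        aux, inf = row["AUX"], row["INF"]
        if kept[aux] >= max_per_aux:
            continue  # this auxiliary has already reached the cap
        kept[aux] += 1
        table[aux][inf] += 1
    return table


table = aux_by_inf_counts(io.StringIO(TOY_CSV))
for aux in sorted(table):
    print(aux, dict(table[aux]))
```

The resulting auxiliary-by-infinitive frequency table is one plausible starting point for computing per-auxiliary measures; which measures actually entered the Principal Components Analysis is described in the article, not here.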

Identifier
DOI https://doi.org/10.18710/5E8I0T
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/5E8I0T
Provenance
Creator Van Hulle, Sven
Publisher DataverseNO
Contributor Van Hulle, Sven; Ghent University; Enghels, Renata; Lauwers, Peter; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year 2024
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Van Hulle, Sven (Ghent University)
Representation
Resource Type annotated corpus data; Dataset
Format text/plain; text/comma-separated-values; type/x-r-syntax
Size 5473; 21761; 1877
Version 1.0
Discipline Humanities; Linguistics