Dataset abstract:
The dataset contains an annotated corpus sample of N = 2000 French sentences with se mettre à or commencer à (1000 observations of each verb). The sample was drawn from the literary corpus Frantext and the journalistic corpus Le Monde (1000 observations from each corpus). The sample is balanced for verb as well as corpus, giving 500 observations for each Verb-Corpus combination. The data are annotated for three variables: Source (corpus), Verb, and collexeme.
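The balanced design described above can be checked with a simple cross-tabulation. The following Python sketch is illustrative only: the variable names Source, Verb and collexeme come from the description, but the value codings ("Frantext", "Le Monde", "se mettre à", "commencer à"), the placeholder collexeme, and the helper names design_counts and is_balanced are assumptions, and the rows are synthetic stand-ins rather than the actual data.

```python
from collections import Counter

def design_counts(rows):
    """Count observations per (Source, Verb) cell."""
    return Counter((row["Source"], row["Verb"]) for row in rows)

def is_balanced(rows, n_cells=4, per_cell=500):
    """True if every Source-Verb cell holds exactly per_cell observations."""
    counts = design_counts(rows)
    return len(counts) == n_cells and all(n == per_cell for n in counts.values())

# Synthetic stand-in rows (NOT the real data): 2 corpora x 2 verbs x 500.
rows = [
    {"Source": source, "Verb": verb, "collexeme": "placeholder"}
    for source in ("Frantext", "Le Monde")
    for verb in ("se mettre à", "commencer à")
    for _ in range(500)
]

print(len(rows), is_balanced(rows))
```

With the real file, the same check would apply once the rows are loaded into a list of dictionaries keyed by the three variable names.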
Article abstract:
This paper examines the semantic value of the infinitive in the ingressive constructions se mettre à (SMA) and commencer à (COMA) using a distinctive collexeme analysis. We find that the collexemes significant for the construction SMA are fairly homogeneous across the two corpora and can be grouped into the general category of expressive collexemes. The collexemes significant for COMA are more heterogeneous and belong to the category of cognitive collexemes and to the semantic fields of sensory and creative acts. The results are compatible with the hypothesis put forward by Verroens and De Cuypere (2023) that the overall meaning of the SMA construction is intrinsically punctual. The punctual value of SMA is not only compatible with expressive collexemes but also emphasizes their unforeseen and unintentional meaning. Conversely, the incremental value of COMA is consistent with the gradual onset of cognitive and sensory collexemes.
Verroens, F., & De Cuypere, L. (2023). French ingressives and (phasal) aspect: A frame-semantic corpus-based analysis. Canadian Journal of Linguistics/Revue Canadienne de Linguistique, 68(3), 435-461. doi:10.1017/cnj.2023.19
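The distinctive collexeme analysis named in the article abstract is commonly computed by testing, for each collexeme, a 2x2 table of its frequency in each construction against all other collexemes, typically with a Fisher exact test. The Python sketch below illustrates that general technique under stated assumptions; it is not the PerlClx implementation, the function names are hypothetical, and the lemma labels and counts are invented toy values, not results from the dataset.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def distinctive_collexemes(counts, total_sma, total_coma):
    """Return (lemma, preferred construction, p-value), sorted by p-value."""
    results = []
    for lemma, (f_sma, f_coma) in counts.items():
        p = fisher_exact_two_sided(f_sma, total_sma - f_sma,
                                   f_coma, total_coma - f_coma)
        expected_sma = (f_sma + f_coma) * total_sma / (total_sma + total_coma)
        if f_sma > expected_sma:
            preferred = "SMA"
        elif f_sma < expected_sma:
            preferred = "COMA"
        else:
            preferred = "neither"
        results.append((lemma, preferred, p))
    return sorted(results, key=lambda r: r[2])

# Invented toy counts (frequency with SMA, frequency with COMA).
toy_counts = {"lemma_a": (12, 2), "lemma_b": (3, 15), "lemma_c": (6, 6)}
results = distinctive_collexemes(toy_counts, total_sma=1000, total_coma=1000)
for lemma, preferred, p in results:
    print(f"{lemma}\t{preferred}\tp = {p:.4f}")
```

The totals of 1000 per construction mirror the sample sizes given in the dataset abstract; running the analysis per corpus would use 500 per construction instead.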
PerlClx, 1.0b
MS Excel, Microsoft Office Professional Plus 2016