Frequency list of words from the Trendi corpus 2020

Dataset

This frequency list was prepared by extracting words (i.e. lemmas with their lexical features) from the Trendi Monitor Corpus of Slovene (http://hdl.handle.net/11356/1590), covering the period from 1 January 2020 to 31 December 2020, using the LIST corpus extraction tool (http://hdl.handle.net/11356/1227). The Trendi frequency list was then compared to the frequency list of words from the Gigafida 2.0 Corpus of Slovene (http://hdl.handle.net/11356/1320), which covers the period between 1991 and 2018, and to the frequency list of words from Trendi for 2019. The words were compared using the simple maths formula implemented by Sketch Engine (see https://www.sketchengine.eu/documentation/simple-maths/).

The final list contains lemmas, their lexical features, their absolute and relative frequencies in the first period (1991–2019) and the second period (2020), and the simple maths value, which indicates whether the word is more frequent in 2020 (simple maths > 1.00) or in 1991–2019 (simple maths < 1.00).
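
As an illustration of how the simple maths value can be read, the following is a minimal Python sketch of the formula as documented by Sketch Engine: the relative frequency in the focus period plus a smoothing constant, divided by the relative frequency in the reference period plus the same constant. It rests on several assumptions that this entry does not confirm: that relative frequencies are expressed per million tokens, and that the smoothing constant is 1. The function names, counts, and corpus sizes are hypothetical and are not taken from the dataset.

```python
def relative_frequency(absolute_count: int, corpus_size: int,
                       per: int = 1_000_000) -> float:
    """Return occurrences per `per` tokens (per million by default).

    Assumption: the dataset's relative frequencies are per-million values.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return absolute_count * per / corpus_size


def simple_maths(focus_rel: float, reference_rel: float,
                 smoothing: float = 1.0) -> float:
    """Simple maths score: (focus + n) / (reference + n).

    `smoothing` (n) avoids division by zero for words absent from the
    reference period. The value 1.0 is an assumption, not a documented
    setting of this dataset.
    """
    if smoothing <= 0:
        raise ValueError("smoothing must be positive")
    return (focus_rel + smoothing) / (reference_rel + smoothing)


# Hypothetical word: 500 hits in a 10-million-token focus corpus (2020)
# and 100 hits in a 100-million-token reference corpus (1991-2019).
focus = relative_frequency(500, 10_000_000)        # 50.0 per million
reference = relative_frequency(100, 100_000_000)   # 1.0 per million
score = simple_maths(focus, reference)             # (50 + 1) / (1 + 1) = 25.5

# A score above 1.00 marks the word as more frequent in the focus period.
print(f"simple maths = {score:.2f}")
```

With equal relative frequencies in both periods the score is exactly 1.00, which is why 1.00 serves as the dividing line between words typical of 2020 and words typical of 1991–2019.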

For frequency lists of words that are typical of previous years according to the simple maths measure (e.g. 2019 vs. 1991–2018), please refer to earlier versions of this entry.

Identifier
PID	http://hdl.handle.net/11356/1705
Related Identifier	http://hdl.handle.net/11356/1701
Related Identifier	http://hdl.handle.net/11356/1712
Related Identifier	https://sled.ijs.si/
Metadata Access	http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1705

Provenance
Creator	Čibej, Jaka; Kosem, Iztok
Publisher	Jožef Stefan Institute
Publication Year	2022
Rights	Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0); https://creativecommons.org/licenses/by-sa/4.0/; PUB
OpenAccess	true
Contact	info(at)clarin.si

Representation
Language	Slovenian; Slovene
Resource Type	lexicalConceptualResource
Format	text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline	Linguistics