Frequency lists of word-level n-grams (or word sets) were extracted from the Trendi Monitor Corpus of Slovene (version 2022-05: http://hdl.handle.net/11356/1590) using the LIST corpus extraction tool (http://hdl.handle.net/11356/1227). The lists contain all word-level 2-, 3-, 4- and 5-grams that occur in texts published in 2021 with a minimum relative frequency of 2 per million, along with their absolute frequencies, relative frequencies and percentages.
The n-grams were extracted from lower-cased word forms, together with the corresponding lemmas and morphosyntactic tags.
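For readers who want to see how a frequency threshold of this kind works in practice, the following is a minimal Python sketch. It is not the LIST tool's implementation: the function name `ngram_frequencies`, its parameters, and the choice to normalise by the total number of tokens are assumptions made for illustration, and lemmas, morphosyntactic tags and percentages are left out.

```python
from collections import Counter


def ngram_frequencies(sentences, n, min_per_million=2.0):
    """Count word-level n-grams over lower-cased word forms.

    Hypothetical helper, not part of LIST. `sentences` is an iterable of
    token lists. Relative frequency is assumed to be the absolute count
    divided by the total number of tokens, scaled to one million.
    """
    counts = Counter()
    total_tokens = 0
    for tokens in sentences:
        lowered = [token.lower() for token in tokens]
        total_tokens += len(lowered)
        # Slide a window of size n over the sentence; n-grams do not
        # cross sentence boundaries in this sketch.
        for i in range(len(lowered) - n + 1):
            counts[tuple(lowered[i:i + n])] += 1

    result = {}
    for gram, absolute in counts.items():
        per_million = absolute / total_tokens * 1_000_000
        # Keep only n-grams at or above the relative-frequency threshold.
        if per_million >= min_per_million:
            result[gram] = (absolute, per_million)
    return result
```

Calling `ngram_frequencies(sentences, n=2)` returns a dictionary that maps each retained 2-gram (as a tuple of word forms) to its absolute count and its relative frequency per million tokens.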
For frequency lists of n-grams extracted from texts from previous years (e.g. 2019 and 2020), please refer to earlier versions of this entry.