Sagedussõnastik Estonian Frequency Dictionary

Frequency lists compiled from a 0.5-million-word corpus of fiction (texts from 1992-1998) and a 0.5-million-word corpus of newspaper texts (1995-1999). The resource contains three frequency lists, giving words and their frequencies in the two sub-corpora and in the combined corpus: 10 000 lemmas (with parts of speech); the 1000 most frequent word forms; and 100 words characteristic of only one sub-corpus, that is, words that are frequent in one sub-corpus but absent from the other.
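As a rough illustration of how lists of this kind can be derived, the following minimal Python sketch counts word forms in two sub-corpora, combines the counts, and picks out words that occur in one sub-corpus but not the other. It is not the procedure used to build this resource: the function names and the toy token lists are assumptions made for the example, and no lemmatization or part-of-speech tagging is performed.

```python
from collections import Counter


def frequency_list(counts, top_n=None):
    """Return (word, count) pairs sorted by descending count, then alphabetically."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked if top_n is None else ranked[:top_n]


def exclusive_words(counts_a, counts_b, top_n=100):
    """Return the most frequent words of corpus A that never occur in corpus B."""
    only_a = Counter({w: c for w, c in counts_a.items() if w not in counts_b})
    return frequency_list(only_a, top_n)


# Toy token lists (invented for illustration, not taken from the corpus).
fiction_tokens = "ta tuli koju ja ta läks ära".split()
news_tokens = "valitsus tuli kokku ja valitsus otsustas".split()

fiction_counts = Counter(fiction_tokens)
news_counts = Counter(news_tokens)
combined_counts = fiction_counts + news_counts  # frequencies in the whole corpus

top_forms = frequency_list(combined_counts, top_n=3)
fiction_only = exclusive_words(fiction_counts, news_counts, top_n=2)
news_only = exclusive_words(news_counts, fiction_counts, top_n=2)
```

With real data, the token lists would come from the tokenized sub-corpora, and a lemma list would additionally require a morphological analyzer to map word forms to lemmas.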

Identifier
PID http://hdl.handle.net/11297/1-00-0000-0000-0000-0002-C
Metadata Access https://metashare.ut.ee/oai_pmh/?verb=GetRecord&metadataPrefix=olac&identifier=a8bd02c25b2711e2a6e4005056b40024cb25f09c5d5442eca755d617c46060c8
Provenance
Publisher CLARIN
Contributor Kadri Muischnek, korpus.info[at]ut.ee
Publication Year 2022
Rights CC-BY Restrictions of Use: attribution
OpenAccess true
Contact info(at)keeleressursid.ee
Representation
Language Estonian
Resource Type Text
Size 10000 words
Discipline Linguistics