-
Lithuanian Parliament Corpus for Authorship Attribution
The 23.9-million-word Lithuanian Parliament corpus is specially designed for the authorship attribution task. The corpus consists of 111 thousand samples of speech transcripts by 147... -
Lithuanian Treebank ALKSNIS
ALKSNIS v2.1 consists of 2,355 syntactically annotated sentences in the PML (Prague Mark-up Language) format. The format allows researchers to visualise and edit... -
Lithuanian 4-gram dataset
Dataset of 4-grams with frequencies extracted from the Delfi.lt corpus (~70 million words, period: March 2014 - November 2016). First, the corpus was split into sentences, then symbol... -
Lithuanian Spelling Checker V.1.0.45 for LibreOffice and OpenOffice
Lithuanian spelling checker for LibreOffice / OpenOffice, version 1.0.45 (2020-04-09) -
English-Lithuanian Comparable Cybersecurity Corpus - DVITAS
The English-Lithuanian comparable corpus (DVITAS COMPARABLE) is morphologically annotated. It includes English and Lithuanian original texts on cybersecurity from the time... -
English-French-Lithuanian Parallel Corpus of EU Financial Documents
The corpus comprises 154 EU legislative documents (English documents and their translations into French and Lithuanian) related to various financial issues and enacted in... -
LitLat BERT
Trilingual BERT-like (Bidirectional Encoder Representations from Transformers) model, trained on Lithuanian, Latvian, and English data. State-of-the-art tool representing... -
Lithuanian morphologically annotated corpus - MATAS v3.0
MATAS corpus (version 3.0). Description: updated, manually checked, morphologically annotated corpus MATAS. Language: Lithuanian. Previous versions: 1. MATAS v0.2... -
Wordlist of the Contemporary Corpus of Lithuanian Language in the Face of War...
We present the comparative wordlist based on the Corpus of the Contemporary Lithuanian Language (CCLL2 version 2, pre-2020), supplemented by the media (courtesy of the news... -
Eesti Keele Instituudi reeglipõhise morfoloogia tööriistad Tools of the IEL ...
The rule-based morphology toolkit of the Institute of the Estonian Language contains separately usable modules for syllabification, type detection, morphological analysis and synthesis... -
Eesti-inglise paralleelkorpus Estonian-English parallel corpus
Corpus. More info at http://www.cl.ut.ee/korpused/paralleel/index.php?lang=en Annotated and sentence-aligned parallel text corpus; contains: 1. Estonian laws and their... -
Eesti keele segakorpus: Seadused Corpus of Estonian law texts
Corpus of Estonian and European law texts. TEI P5 XML markup, UTF-8 encoding. More info at http://www.cl.ut.ee/korpused/segakorpus/seadused/ Corpus of law texts in Estonian,... -
Pindsüntaktiliselt analüüsitud korpus Estonian corpus with shallow syntactic...
A monolingual corpus with Constraint Grammar-style shallow syntactic annotations. -
Morfoloogiliselt ühestatud korpus Corpus of morphologically disambiguated Es...
Manually morphologically disambiguated corpus. More info at http://www.cl.ut.ee/korpused/morfkorpus/index.php?lang=en Manually annotated corpus. Available for download and via Korp... -
Estonian WordNet (kb65a-4)
Compiled manually, following the EuroWordNet project. More info at http://www.cl.ut.ee/ressursid/teksaurus -
Eesti keele ühendkorpus 2023 (annoteerimata) Estonian National Corpus 2023 (...
Description: Estonian corpus of written texts. Consists of the Estonian Reference Corpus (1990s–2008), contemporary and old literature, Estonian Web (2013, 2017, 2019, 2021, 2023),... -
Estonian Wordnet (kb69a)
The atom of a wordnet-type thesaurus is a synonym set (also called a synset), which is a set containing all the synonymous words or multi-word units that express the same... -
Sagedussõnastik Estonian Frequency Dictionary
Frequency lists compiled from a 0.5-million-word fiction corpus (1992-1998) and a 0.5-million-word newspaper corpus (1995-1999). Three... -
Eesti ajakirjanduse korpus Corpus of Estonian newspaper texts
The corpus contains Estonian newspapers, 182 million words. TEI P5 XML markup, UTF-8 encoding. More info at http://www.cl.ut.ee/korpused/ Corpus of Estonian newspaper texts, 182... -
Eesti murdekorpus Estonian Dialect Corpus
Corpus. More info at http://www.murre.ut.ee/estonian-dialect-corpus/ The dialect corpus consists of: 1) Dialect recordings. The corpus is based on dialect recordings which...