-
Corpus of the Contemporary Lithuanian Language
Corpus of the Contemporary Lithuanian Language, which comprises 208 million words, is a collection of texts designed to represent the current Lithuanian. The corpus has been... -
Lithuanian morphologically annotated corpus - MATAS v1.0
MATAS corpus (version 1.0) DESCRIPTION Manually checked, morphologically annotated corpus MATAS FORMATS 1. CoNLL-U (CONLLU, conllu) 2. SketchEngine - tab delimited word per... -
DELFI.lt corpus
DELFI.lt is corpus made of articles published by news portal DELFI.lt since March 2014 till November 2016. Metadata was collected with articles as well: author, title, date,... -
Lithuanian morphologically annotated corpus - MATAS
MATAS v0.2 - Morphologically Annotated Lithuanian Corpus (manually checked) Contains 4 parts: Documents (21%), Fiction (19%), Periodicals (36%), Scientific texts (24%) Wordform...