ORVELIT v3 (Lith. Originalios ir Vertimų Lietuvių Kalbos Tekstynas) is a comparable monolingual corpus of original and translated Lithuanian. It consists of four sub-corpora of original and translated fiction and popular science literature (approx. 1 million words each). Detailed information on the composition and on the lexical and morphological features of the raw (ORVELIT v1) and morphologically annotated (ORVELIT v2) versions of the corpus can be found in:
Vaičenonienė, Jurgita, Kovalevskaitė, Jolanta, and Ringailienė, Teresė. 2017. Tekstynais paremti vertimų kalbos tyrimai ir šaltiniai. Kalbų studijos / Studies about Languages, Nr. 30, pp. 42-55. https://www.vdu.lt/cris/handle/20.500.12259/56648?mode=simple
Vaičenonienė, Jurgita, and Kovalevskaitė, Jolanta. 2019. Leksinės ir morfologinės vertimų kalbos ypatybės. Darnioji daugiakalbystė / Sustainable Multilingualism, Nr. 14, pp. 208-235. https://www.vdu.lt/cris/handle/20.500.12259/98861
ORVELIT v3 has been modified by deleting the title, content, bibliographical lists, indexes and author(s) of the texts, and by mixing the individual texts at paragraph level. Cases when some other information was deleted were marked as . The corpus encoding is UTF-8. ORVELIT v3 includes raw (ORVELIT v3_raw) and morphologically annotated (ORVELIT v3_annotated) versions. The corpus was morphologically annotated automatically with the Semantika.lt analyser.
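As a rough illustration of working with the raw version, the following Python sketch reads a UTF-8 text file and counts its paragraphs and whitespace-separated tokens. It rests on assumptions that the description above does not confirm: that the raw files are plain text, and that each non-empty line holds one paragraph. The function names `read_paragraphs` and `count_tokens` are hypothetical helpers, not part of the corpus distribution.

```python
from pathlib import Path


def read_paragraphs(path):
    """Return the non-empty lines of a UTF-8 text file.

    Assumption: each non-empty line is one paragraph of the raw corpus.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def count_tokens(paragraphs):
    """Count whitespace-separated tokens, a rough proxy for word count."""
    return sum(len(paragraph.split()) for paragraph in paragraphs)
```

Calling `count_tokens(read_paragraphs(path))` on one sub-corpus file would give a figure that can be compared with the approximate size of 1 million words; whitespace tokenisation counts punctuation attached to words as part of the token, so the result is only indicative.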