-
Survey Data on Preferences of Lithuanian Cybersecurity Terminology
The data is provided in two files: one containing the questionnaire data and the other containing the respondents' data. The questionnaire data is in a TXT file, which includes... -
TED-ELH Parallel Corpus
The corpus contains aligned parallel scripts of TED Talks in English, Lithuanian, and Hebrew. It contains spoken language data. -
Lithuanian Treebank ALKSNIS (2019-10-24)
ALKSNIS v3.0 consists of 3,643 syntactically annotated sentences in the PML (Prague Mark-up Language) format. The format allows researchers to visualise and edit... -
JABLONSKIS tagset v2
JABLONSKIS VERSION 2 is a standard Lithuanian morphological tagset that is based on the abbreviations of parts of speech and other grammatical categories commonly used in... -
ORVELIT v3
ORVELIT v3 (Lith. Originalios ir Vertimų Lietuvių Kalbos Tekstynas) is a comparable monolingual corpus of original and translated Lithuanian consisting of four sub-corpora of... -
Lithuanian morphologically annotated corpus - MATAS
MATAS v0.2: Morphologically Annotated Lithuanian Corpus (manually checked). Contains 4 parts: Documents (21%), Fiction (19%), Periodicals (36%), Scientific texts (24%). Wordform... -
Lithuanian keyboard for macOS users
This keyboard driver gives easy access to the Lithuanian letters via the conventional keyboard layout, a.k.a. „Lithuanian letters instead of numbers“. Essential new feature of this... -
English-Lithuanian Parallel Cybersecurity Corpus - DVITAS v2.0
English-Lithuanian parallel corpus DVITAS v2 includes original English texts on cybersecurity and their Lithuanian translations aligned on the sentence level. Version 1 of the... -
DIGIRES COVID-19 ML Dataset v.1
DIGIRES COVID-19 ML dataset v.1 is a tab-separated (.tsv) file prepared for training machine learning algorithms. The training dataset was compiled from various internet public... -
Lemmatised Wordlist of 1 m. Corpus of Contemporary Lithuanian
The lemmatised wordlist of the 1 m. word Lithuanian corpus. The structure of the tab-delimited text file (dazninis.txt): Headword, Part of Speech, Wordform, Frequency of Occurrence. The... -
English-Lithuanian Parallel Cybersecurity Corpus - DVITAS
English-Lithuanian parallel corpus DVITAS includes original English texts on cybersecurity and their Lithuanian translations aligned on the sentence level. The corpus was... -
Read Speech Corpus (7G)
The corpus of read Lithuanian speech „7G“ was compiled in 2015-2016. The corpus consists of 352 audio recordings with a total duration of over 7 hours. Seven different speakers... -
Corpus of Discourse on Crime
The specialised "Corpus of Discourse on Crime" is synchronic, monolingual, and unannotated, and consists of two subcorpora. Subcorpus 1: all texts on crime, published in criminal columns on... -
Lithuanian Word embeddings
GloVe type word vectors (embeddings) for Lithuanian. Delfi.lt corpus (~70 million words) and StanfordNLP were used for training. The training consisted of several stages: 1)... -
The Database of Lithuanian multiword expressions
The Database of Lithuanian multiword expressions (MWEs) has been freely accessible for online search at https://resursai.pastovu.vdu.lt/paieska/paprastoji since 2019. It contains... -
Assessment Data of the Dictionary of Modern Lithuanian versus Joint Corpora
The resource is the assessment data of The Dictionary of Modern Lithuanian, 6th edition (DML6) [1], from the point of view of its coverage in the Joint Corpus of Lithuanian... -
Lithuanian-English Cybersecurity Termbase v.0.1
The bilingual termbase is a TBX export of the online termbase https://www.terminologue.org/csterms/. The termbase includes terms for 233 cybersecurity concepts. -
EMVAKA
Two Lithuanian language children’s corpora, collected during the EMVAKA project, consist of the Lithuanian language production by children aged 7–13: (1) spoken (73 files, c.... -
English-Lithuanian Comparable Vaccination Corpus
Two news portals were selected for comparable corpora building: the Lithuanian portal DELFI and the English portal The Guardian. The compiled corpora comprise 135 Lithuanian... -
Lithuanian Spelling Checker V.1.0.45 for macOS
Lithuanian spelling checker for macOS 2020-04-10 version 1.0.45