English-Lithuanian Comparable Cybersecurity Corpus - DVITAS

The English-Lithuanian comparable corpus (DVITAS COMPARABLE) is morphologically annotated. It contains original English and Lithuanian texts on cybersecurity from the period 2010-2021. The corpus was compiled for the bilingual terminology extraction project, together with an English-Lithuanian parallel corpus. It comprises 1,708 English files and 2,567 Lithuanian files. The total size of the corpus is 4m words (EN-2m; LT-2m). The texts represent 4 text types: academic (EN-19%; LT-30%), administrative-informative (EN-8%; LT-11%), legal (EN-18%; LT-4%), and media (EN-55%; LT-55%).

Identifier
PID http://hdl.handle.net/20.500.11821/47
Related Identifier https://klc.vdu.lt/dvitas/en
Metadata Access https://clarin.vdu.lt/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin.vdu.lt:20.500.11821/47
Provenance
Creator Utka, Andrius; Rackevičienė, Sigita; Rokas, Aivaras; Bielinskienė, Agnė; Mockienė, Liudmila; Laurinaitis, Marius
Publisher Vytautas Magnus University; Mykolas Romeris University
Publication Year 2022
Rights ACA_CLARIN-LT_End-User-Licence-Agreement_EN-LT; https://clarin.vdu.lt/licenses/eula/ACA_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm; ACA
OpenAccess true
Contact info(at)clarin.vdu.lt
Representation
Language English; Lithuanian
Resource Type corpus
Format text/plain; charset=utf-8; text/plain; application/zip; downloadable_files_count: 4
Discipline Linguistics
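
The Metadata Access link in this record is an OAI-PMH GetRecord request that asks for the record in Dublin Core (oai_dc). The following is a minimal sketch of how such a request URL can be assembled and how the Dublin Core fields of a response could be read. The function names build_get_record_url and dublin_core_fields are hypothetical helpers introduced for illustration, and the exact set of fields the repository returns is not confirmed here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint and identifier are taken from the "Metadata Access" line of this record.
OAI_ENDPOINT = "https://clarin.vdu.lt/oai/request"
# Standard Dublin Core element namespace used by the oai_dc metadata format.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_get_record_url(identifier, metadata_prefix="oai_dc", endpoint=OAI_ENDPOINT):
    """Assemble an OAI-PMH GetRecord URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the URL matches the form shown in the record.
    return endpoint + "?" + urlencode(params, safe=":/")


def dublin_core_fields(xml_text):
    """Collect all non-empty Dublin Core elements from an XML response
    into a dict of lists, keyed by element name (hypothetical helper)."""
    prefix = "{" + DC_NS + "}"
    fields = {}
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag.startswith(prefix) and elem.text and elem.text.strip():
            fields.setdefault(elem.tag[len(prefix):], []).append(elem.text.strip())
    return fields


url = build_get_record_url("oai:clarin.vdu.lt:20.500.11821/47")
print(url)
```

The response body could then be fetched with, for example, urllib.request.urlopen(url) and passed to dublin_core_fields as text. Network access is left out of the sketch so that it runs offline.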