-
Prague Dependency Treebank - Consolidated 2.0 (PDT-C 2.0)
A manually annotated and genre-diversified language resource with rich linguistic information from morphology and syntax to semantics, the Prague Dependency Treebank –... -
Prague Dependency Treebank - Consolidated 1.0 (PDT-C 1.0)
A richly annotated and genre-diversified language resource, the Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, hereafter PDT-C) is a consolidated... -
Wizerunek Andreja Babiša i Mateusza Morawieckiego w kontekście sytuacji kryzy...
A collection of articles from the Czech press about Mateusz Morawiecki (iDnes) and from the Polish press about Andrej Babiš (Rzeczpospolita) -
[MCSQ]: The Multilingual Corpus of Survey Questionnaires
The Multilingual Corpus of Survey Questionnaires (MCSQ) is the very first publicly available multilingual database comprised of international survey texts. Its latest version... -
Multilingual Constructicon (2017-10-16)
A multilingual constructicon. -
ASPAC – Swedish-Czech (2017-10-16)
Part of the Amsterdam Slavic Parallel Aligned Corpus. The material is sentence-scrambled. -
JRC EU DGT Translation Memory Parsebank DGT-UD 1.0
DGT-UD is a 2-billion-word, 23-language, parallel, syntactically parsed corpus, which consists of the JRC DGT translation memory of European law, automatically annotated with... -
Multilingual comparable corpora of parliamentary debates ParlaMint 3.0
ParlaMint 3.0 is a multilingual set of 26 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2022, with the individual corpora... -
Linguistically annotated multilingual comparable corpora of parliamentary deb...
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20... -
MULTEXT-East "1984" document corpus 4.0
The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel and sentence aligned corpus contains the novel in the English original... -
Concreteness and imageability lexicon MEGA.HR-Crossling
The lexicon contains concreteness and imageability predictions of words in 77 languages. The resource is built via supervised machine learning, using average human responses... -
The multilingual sentiment dataset of parliamentary debates ParlaSent 1.0
The dataset consists of mid-length sentences from the parliamentary proceedings of Bosnia and Herzegovina, Croatia, Czechia, Serbia, Slovakia, Slovenia, and the United Kingdom,... -
Linguistically annotated multilingual comparable corpora of parliamentary deb...
ParlaMint is a multilingual set of comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million... -
Multilingual comparable corpora of parliamentary debates ParlaMint 4.0
ParlaMint 4.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and... -
Linguistically annotated multilingual comparable corpora of parliamentary deb...
ParlaMint 3.0 is a multilingual set of 26 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2022, with the individual corpora... -
MULTEXT-East "1984" annotated corpus 4.0
The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel and sentence aligned corpus contains the novel in the English original... -
Parliamentary spoken corpus of Czech ParlaSpeech-CZ 1.0
The ParlaSpeech-CZ dataset is built from the transcripts of parliamentary proceedings available in the Czech part of the ParlaMint corpus, and the parliamentary recordings... -
Linguistically annotated multilingual comparable corpora of parliamentary deb...
ParlaMint 4.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and... -
Linguistically annotated multilingual comparable corpora of parliamentary deb...
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and... -
MULTEXT-East free lexicons 4.0
The MULTEXT-East morphosyntactic lexicons have a simple structure, where each line is a lexical entry with three tab-separated fields: (1) the word-form, the inflected form of...
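The line format described above (one lexical entry per line, three tab-separated fields) can be read with a few lines of Python. This is a minimal sketch, not the resource's official tooling: the names `LexiconEntry` and `parse_lexicon_lines` are hypothetical, and because the description is cut off after the first field, the second and third fields are deliberately left unnamed (`field_2`, `field_3`).

```python
from typing import Iterable, Iterator, NamedTuple


class LexiconEntry(NamedTuple):
    """One lexical entry: three tab-separated fields from a single line."""
    word_form: str  # field 1: the inflected word-form (named in the description)
    field_2: str    # field 2: not named in the truncated description
    field_3: str    # field 3: not named in the truncated description


def parse_lexicon_lines(lines: Iterable[str]) -> Iterator[LexiconEntry]:
    """Yield one LexiconEntry per non-empty line.

    Assumption: lines that do not split into exactly three fields are
    treated as errors rather than silently skipped.
    """
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 tab-separated fields, "
                f"got {len(fields)}"
            )
        yield LexiconEntry(*fields)
```

Since the function accepts any iterable of strings, an open file handle can be passed directly, for example `parse_lexicon_lines(open(path, encoding="utf-8"))`, where `path` points at a downloaded lexicon file; the UTF-8 encoding is also an assumption to verify against the distribution.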