-
Language Technology Research Bibliography for Lithuanian 2016-2020
A language technology bibliography for the Lithuanian language covering the period 2016-2020. The resource is in BibTeX format and contains: 1) 91 references of research... -
ORVELIT v3
ORVELIT v3 (Lith. Originalios ir Vertimų Lietuvių Kalbos Tekstynas) is a comparable monolingual corpus of original and translated Lithuanian consisting of four sub-corpora of... -
Lithuanian Coreference Corpus
The corpus consists of 100 articles from news portals focusing on political news, as such texts are rich in quotations and named entity... -
Lithuanian Corpus of the EU Primary and Secondary Law Acts of the Period 2015...
A 274,460-word corpus comprising selected primary and secondary law acts of the EU from the period 2015-2017. The corpus was compiled from documents containing words with the root... -
ELMo Embeddings for Polish
A model of ELMo embeddings for the Polish language, trained on large textual corpora (KGR10). To retrain the model, please use the checkpoint and vocabulary files available at:... -
Word Embeddings for Polish
Distributional language models for Polish trained on different corpora (KGR10, NKJP, Wikipedia). -
Knowledge base of Polish conventionalized periphrastic nominal expressions
The resource includes free Periphraser export with a knowledge base of Polish conventionalized periphrastic nominal expressions (i.e. phrases headed by a noun) together with... -
Corpus2MWE
A CCL reader (Corpus2) with MWE detection. -
SuperMatrix
SuperMatrix is a system to support automatic extraction of semantic relations, based on the analysis of large text corpora. The system was developed as a tool for expansion of... -
Lexicalisation of Polish and English word combinations: two samples manually ...
We analysed over 350 Polish and English word combinations (multi-word expressions, MWEs). Half of the sample was drawn from traditional dictionaries, while the other half was... -
Toposław 2 (2016-05-31)
Toposław 2 is an editor of multi-word unit inflection lexicons. -
Pan Tadeusz
A poem. -
Integrated Parser
Integrated parser is an application that combines and normalizes outputs of several parsers for Polish. It is based on ENIAM processing stream extended with Polish Dependency... -
WUT Relations Between Sentences Corpus
WUT Relations Between Sentences Corpus contains 2827 pairs of related sentences. Relationships are derived from Cross-document Structure Theory (CST), which enables... -
CEN
Corpus of Economic News (CEN) contains 797 documents from Polish Wikipedia annotated with 65 categories of proper names in CCL format.... -
Polish-Ukrainian Parallel Corpus
Polish-Ukrainian Parallel Corpus -
MultiEmo: Multilingual, Multilevel, Multidomain Sentiment Analysis Corpus of ...
MultiEmo is a new benchmark data set for the multilingual sentiment analysis task, including 11 languages. The collection contains consumer reviews from four domains: medicine,... -
Plumper
An ontology mapper that maps plWordNet onto the SUMO ontology. -
POLFIE-OT: an LFG grammar of Polish with OT marks
POLFIE-OT is a version of POLFIE, an LFG grammar of Polish implemented in the XLE system (Xerox Linguistic Environment), enriched with OT (Optimality Theory) constraints for the... -
WCRFT WebLichtService
WCRFT service for WebLicht