-
Replication Data for: Accusative of Negation in ‘Borderland’ Polish
These are the data for the journal article “Accusative of Negation in ‘Borderland’ Polish”. The abstract of the article is below. The data consist of the annotated list of... -
ChunkRel WS
ChunkRel-WS is a prototype service for recognition of three syntactic relations between chunks. The service may be run against plain text (input format: text), then the... -
Big Data language model - subword - BPE - ARPA
Big data language model over subword units obtained with byte pair encoding (BPE), in ARPA format -
Świgra — a parser of Polish
Świgra is a parser of Polish generating constituency trees using a DCG-style grammar stemming from Marek Świdziński’s grammar “Gramatyka formalna języka polskiego” (1992). The... -
Verb in plWordNet 4.0 (Guidelines)
The PDF document contains the guidelines for describing verbs in the Polish part of plWordNet. -
Big data language model with part of speech tags stemmed in ARPA format
Big data language model with part of speech tags stemmed in ARPA format -
Pred-A-tor
Tool for creating predicate-argument structures from syntactic trees produced by the Świgra parser (http://zil.ipipan.waw.pl/%C5%9Awigra) -
Big Data language model - subword - SYLLABED - ARPA
Big data language model based on syllables, in ARPA format. -
Polish-Bulgarian Parallel Corpus
Polish-Bulgarian Parallel Corpus -
Extended dictionary of named entities NELexicon connected with Linked Open Data
This resource contains Polish named entities connected with terminology from available resources within Linked Open Data (e.g. WordNet, DBPedia, Wikipedia, etc.). -
Big data language model stemmed in ARPA format
Big data language model stemmed in ARPA format. -
Big data language model with part of speech tags stemmed in RAW format
Big data language model with part of speech tags stemmed in RAW format -
Big data language model stemmed with BPE in ARPA format
Big data language model stemmed with BPE in ARPA format -
WiKNN Text Classifier
WiKNN is an online text classifier service for Polish and English texts. It supports hierarchical labelled classification of user-submitted texts with Wikipedia categories.... -
MorphoDiTa-based tagger for Polish language
MorphoDiTa-based tagger for the Polish language. It is a tool for morphosyntactic unification of Polish text according to the NKJP tagset. -
KGR10 FastText Polish word embeddings
Distributional language model (both textual and binary) for Polish (word embeddings) trained on the KGR10 corpus (over 4 billion words) using fastText with the following variants... -
POLFIE Bank, an LFG structure bank of Polish: pol-nkjp1m-pargram-dev
The pol-nkjp1m-pargram-dev structure bank was created using POLFIE: an LFG grammar of Polish. This structure bank contains sentences from the NKJP1M subcorpus of NKJP which were... -
PolEmo 2.0 Sentiment Analysis Dataset for CoNLL
PolEmo 2.0: Corpus of Multi-Domain Consumer Reviews, evaluation data for the article presented at CoNLL. Citation: @inproceedings{kocon-etal-2019-multi, title = "Multi-Level... -
Polish-Ukrainian Parallel Corpus
Polish-Ukrainian Parallel Corpus -
Big Data language model - STEMMED - RAW data
Big data language model stemmed in RAW format