Word Embeddings for Polish
Distributional language models for Polish trained on different corpora (KGR10, NKJP, Wikipedia).
Tests for Word Embeddings
Evaluation tools (WBST, HWBST, EWBST) for word embedding models, used to assess and compare the usefulness of different word embeddings.
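Tests in the WBST family generally pose multiple-choice synonymy questions: given a query word, the model must pick the true synonym from a set of distractors by vector similarity, and accuracy over all questions is the score. A minimal sketch of that general scheme (the words, vectors, and question are illustrative toy data, not the actual test sets or implementation of these tools):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Toy embedding table standing in for a trained model.
vectors = {
    "auto":     [0.9, 0.1],
    "samochód": [0.85, 0.15],  # true synonym of "auto"
    "jabłko":   [0.1, 0.9],    # distractor ("apple")
    "rzeka":    [0.2, 0.8],    # distractor ("river")
}

# Each question: (query word, correct synonym, distractors).
questions = [
    ("auto", "samochód", ["jabłko", "rzeka"]),
]

def answer(query, candidates):
    """Pick the candidate whose vector is closest to the query's."""
    return max(candidates, key=lambda w: cosine(vectors[query], vectors[w]))

correct = sum(
    answer(query, [synonym] + distractors) == synonym
    for query, synonym, distractors in questions
)
accuracy = correct / len(questions)
print(accuracy)  # fraction of questions answered correctly
```

With more questions, accuracy on such a test gives a single number for comparing embedding models trained on different corpora.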
Vector representations of Polish words (Word2Vec method)
A skip-gram model with vectors of length 100, trained on KGR10, a corpus of over 4 billion tokens. Data preprocessing involved segmentation, lemmatization, and morphosyntactic...
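A skip-gram model of this kind maps each lemma to a dense 100-dimensional vector, and words are compared by cosine similarity of their vectors. A minimal sketch of that lookup-and-compare step, using a toy three-dimensional table (the words and vectors are illustrative, not values from the model):

```python
import math

# Toy lookup table standing in for the trained model; the real
# KGR10 model maps each lemma to a 100-dimensional vector.
vectors = {
    "kot":  [0.9, 0.1, 0.2],  # "cat"
    "pies": [0.8, 0.2, 0.3],  # "dog"
    "sejm": [0.1, 0.9, 0.7],  # "parliament"
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, k=1):
    """Return the k words closest to `word` by cosine similarity."""
    sims = [(other, cosine(vectors[word], vectors[other]))
            for other in vectors if other != word]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]

print(most_similar("kot"))  # "pies" ranks above "sejm"
```

Because the corpus was lemmatized during preprocessing, queries against such a model are made with lemmas rather than inflected forms.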