Pretrained word and multi-sense embeddings for Estonian

Word and multi-sense embeddings for Estonian, trained on the lemmatized etTenTen corpus (Corpus of the Estonian Web). The word embeddings are trained with word2vec. The sense embeddings are trained with SenseGram, which induces the sense inventory from the word embeddings. Models were trained under various parameter settings: the architecture, number of dimensions, window size, minimum frequency threshold and number of iterations vary between models.
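
As a rough illustration of how embeddings of this kind might be used, the sketch below reads vectors in the plain-text word2vec layout (a "count dimensions" header line followed by one "token v1 ... vN" line per entry) and compares two lemmas by cosine similarity. This record does not state the file format of the models, so the text layout is an assumption. The helper names `load_word2vec_text` and `cosine` are hypothetical, and the three-dimensional toy vectors are invented for the example rather than taken from the dataset.

```python
import io
import math


def load_word2vec_text(handle):
    """Read vectors from a file-like object in word2vec text format.

    Assumption: the first line is 'count dims' and every later line is
    'token v1 v2 ... vN'. The actual format of the published models is
    not stated in this record.
    """
    header = handle.readline().split()
    dims = int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dims + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data, invented for illustration: three Estonian lemmas with 3-d vectors.
TOY = "3 3\nkoer 1.0 0.0 0.0\nkass 0.8 0.6 0.0\nauto 0.0 0.0 1.0\n"

vectors = load_word2vec_text(io.StringIO(TOY))
print(round(cosine(vectors["koer"], vectors["kass"]), 3))
print(round(cosine(vectors["koer"], vectors["auto"]), 3))
```

Because the corpus is lemmatized, lookups would presumably need to use lemmas rather than inflected word forms.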

Identifier
DOI http://datadoi.ee/handle/33/91
Metadata Access https://datadoi.ee/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:datadoi.ee:33/91
Provenance
Creator Aedmaa, Eleri
Publisher University of Tartu
Publication Year 2019
Rights info:eu-repo/semantics/openAccess
OpenAccess true
Representation
Resource Type info:eu-repo/semantics/dataset
Format application/pdf
Discipline Other