678 datasets found

Language: English

  • WordnetLoom 2.0

    WordnetLoom 2.0 executable files for plWordNet 4.0 (source code: https://github.com/CLARIN-PL/WordnetLoom). WordnetLoom is a wordnet editor application built for the...
  • HaskEN

    HaskEN is an English phraseological database designed for language professionals including linguists, language teachers, lexicographers, language materials developers and...
  • plWordNet 4.2 (CLARIN-BIZ-START)

    plWordNet (Słowosieć) from July 2020, used as the main resource for word sense disambiguation tasks in 2020-2022; the database also includes the mapping to Princeton WordNet 3.1...
  • plWordNet 4.5

    plWordNet ver. 4.5 is a lexico-semantic network that reflects the lexical system of the Polish language, with a projection onto English. Słowosieć, Princeton Wordnet,...
  • WiKNN Text Classifier

    WiKNN is an online text classifier service for Polish and English texts. It supports hierarchical labelled classification of user-submitted texts with Wikipedia categories....
  • The LnNor Corpus: A spoken multilingual corpus of non-native and native Norwe...

    The LnNor corpus was created as part of the data collection in two projects: CLIMAD (Cross-linguistic influence in multilingualism across domains: phonology and syntax) and...
  • Paralela corpus and search engine

    Paralela is an open-ended, opportunistic parallel corpus of Polish-English and English-Polish translations. It currently contains 262 million words in 10,877,000 translation...
  • enWordNet 1.0

    The extension of Princeton WordNet built within the CLARIN-PL project. The attached file also contains the mapping to Open Multilingual Wordnet.
  • EWBST tests for english

    The submission contains tests generated for the EWBST test of English word embedding models. The tests were created with Princeton WordNet and plWN English synsets.
  • PolEmo 1.0 + MultiEmo-Test 1.0 Multilingual Sentiment Analysis Dataset for KE...

    PolEmo 1.0 + MultiEmo-Test 1.0: Corpus of Multi-Domain Consumer Reviews. Test dataset from PolEmo 1.0 was translated to eight different languages: Dutch, English, French,...
  • NE_SUMO_PLWN_mapping

    Mapping between named entity types, SUMO categories and plWordNet synsets.
  • EU Parliament Speech corpus

    A collection of 1040 EU parliament speeches with transcription and annotations. Includes original speeches and PL/EN translations.
  • Novels_Dabrowska_Dzikie_ziele

    Text of Maria Dąbrowska's "Dzikie ziele" ("Wild Herb") from her collection of selected writings: stories, excerpts, dramas, songs for children.
  • Corpus-SUCK

    The processing pipeline makes it possible to download the content of websites. The input to the process is a list of URLs; the output is a set of files containing...
  • Rafal

    corpus
  • WordnetLoom 1.68.2

    WordnetLoom is a wordnet editor application built for the construction of the largest Polish wordnet, plWordNet. WordnetLoom provides two means of...
  • SlopeQ for BNC Search Engine

    The SlopeQ for BNC Search Engine provides access to the British National Corpus dataset. In addition to linguistically motivated corpus queries, it supports a number of data...
  • C1_essays

    C1 essays