PANACEA Italian V-SUBCAT Repubblica lexicon (language independent extractor)
This is a lexicon of verb subcategorisation frames automatically extracted from a 300-million-word newspaper corpus using language-independent SCF acquisition software. The...
PANACEA Annotated Dependency Italian Environment Corpus Version 2
PANACEA Annotated Italian Environment Corpus Version 2 consists of Italian texts in the Environment (ENV) domain that were collected and automatically annotated in the...
PANACEA Labour and Parole merged Italian Lexicon
The Italian PAROLE_lab_merged.lmf.xml is an SCF lexicon obtained by merging two automatically extracted lexicons: a domain lexicon (labour), pANACEA_SCF_IT_labour.lmf.xml, and a the...
Spanish LMF Parole/Simple Lexicon
This is the LMF version of the Spanish Parole-Simple lexicon. The original PAROLE lexica (20,000 entries per language) were built in conformance with a model based on EAGLES guidelines...
PANACEA Annotated Dependency Greek Labour Legislation Corpus Version 2
PANACEA Annotated Greek Labour Legislation Corpus Version 2 consists of Greek texts in the Labour Legislation (LAB) domain that were collected and automatically annotated in the...
Catalan LMF Freeling Sense
This is the LMF version of the Catalan Freeling Sense. FreeLing is a developer-oriented library providing language analysis services. FreeLing is designed to be used as an...
PANACEA English automatically acquired lexicon for LAB domain: Subcategorizat...
This is a domain-specific English lexicon for the labour (LAB) domain. This lexicon contains both subcategorization frames for verbs and lexical semantic classes for nouns. This...
PANACEA Italian V-SUBCAT gold-standard for ENV domain
The PANACEA_SCF_Gold_ENV_IT is a manually created "gold-standard" lexicon of verbal subcategorisation frames for 26 verb lemmas. The language is Italian and the domain is...
GrAF version of Spanish portions of Wikipedia Corpus
This is the stand-off GrAF version of the Spanish portions of Wikipedia (based on a 2006 dump). This Wikipedia Spanish Corpus contains 257019 articles that contain about 150,1...
PANACEA Environment Corpus n-grams IT (Italian)
This data set contains Italian word n-grams and Italian word/tag/lemma n-grams in the "Environment" (ENV) domain. N-grams are accompanied by their observed frequency counts. The...
Galician LMF Apertium Dictionary
This is the LMF version of the Galician Apertium dictionary. Monolingual dictionaries for Spanish, Catalan, Galician and Euskera have been generated from the Apertium expanded...
PANACEA Environment SCF MWE merged Italian Lexicon
The Italian PANACEA_ENV_MWE_SCF_merged.lmf.xml lexicon is obtained by merging two automatically extracted lexicons: a domain lexicon (environment) for SCFs,...
Spanish-Romanian LMF Apertium Bilingual dictionary
This is the LMF version of the Apertium bilingual dictionary for the Spanish and Romanian languages. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For...
MATE Parser module for Spanish
In this package we include the following: logonFinal20130315_4matetools361.model; parse_ESCAsentences_mate.sh; freeling_spaMate.sh; toconll2006.py; prueba.txt (test file: 4...
Spanish-Galician LMF Apertium Bilingual dictionary
This is the LMF version of the Apertium bilingual dictionary for the Spanish and Galician languages. Bilingual LMF dictionaries were generated from Apertium bilingual dix files. For...
PANACEA Environment Bilingual Glossary French-to-English
This glossary contains French-to-English terminology, with a focus on environmental terms, resulting from PANACEA research. It contains about 3846 entries, both single...
PANACEA Labour Bilingual Glossary FR-EN (French-English)
This folder contains files for bilingual glossary creation from factored phrase tables that include part-of-speech-tagged text for the FR-EN language pair. The tables are...
PANACEA Annotated Dependency Italian Labour Legislation Corpus Version 2
PANACEA Annotated Italian Labour Legislation Corpus Version 2 consists of Italian texts in the Labour Legislation (LAB) domain that were collected and automatically annotated...
GrAF version of the Basque Dependency Treebank
This is the stand-off GrAF version of the Basque Dependency Treebank (BDT). It is the Reference Corpus for the Processing of Basque (EPEC) annotated at the syntactic level. EPEC is...