Written corpus ccKres 1.0

The ccKres corpus consists of 9,376 documents. Each document carries information about its source (e.g. newspaper, magazine), year of publication, text type (e.g. fiction, newspaper), and the title and author where known. The corpus is POS-tagged and lemmatised, and is encoded in XML following the TEI P5 (Text Encoding Initiative) guidelines. ccKres contains approximately 9% of Kres, a balanced corpus of Slovene: http://eng.slovenscina.eu/korpusi/kres.
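
As a rough illustration of what working with a POS-tagged, lemmatised TEI file can look like, the sketch below reads word tokens from a small inline sample. It assumes that tokens are TEI `w` elements carrying `lemma` and `ana` attributes; this record does not specify the markup, so the actual ccKres files may use different elements or attribute names. The sample sentence and its tags are placeholders, not corpus data.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample; not taken from ccKres. The attribute names
# (lemma, ana) and the tag values are assumptions for illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p><s>
    <w lemma="biti" ana="TAG-V">je</w>
    <w lemma="lep" ana="TAG-A">lep</w>
    <w lemma="dan" ana="TAG-N">dan</w>
    <pc>.</pc>
  </s></p></body></text>
</TEI>"""


def read_tokens(xml_text):
    """Return (word form, lemma, tag) triples for every <w> element."""
    root = ET.fromstring(xml_text)
    return [
        (w.text, w.get("lemma"), w.get("ana"))
        for w in root.iter(f"{{{TEI_NS}}}w")
    ]


tokens = read_tokens(SAMPLE)
for form, lemma, tag in tokens:
    print(form, lemma, tag)
```

For the full corpus files, a streaming parser such as `ET.iterparse` would likely be a better fit than loading each document into memory at once.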

Identifier
PID http://hdl.handle.net/11356/1034
Related Identifier http://eng.slovenscina.eu/korpusi/proste-zbirke
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1034
Provenance
Creator Logar, Nataša; Erjavec, Tomaž; Krek, Simon; Grčar, Miha; Holozan, Peter
Publisher Centre for Language Resources and Technologies, University of Ljubljana
Publication Year 2013
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); https://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type corpus
Format application/zip; application/gzip; text/plain; charset=utf-8; downloadable_files_count: 3
Discipline Linguistics
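
The Metadata Access link above is an OAI-PMH `GetRecord` request. A minimal sketch of how such a request URL can be assembled from its parts is given below; the base URL, verb, metadata prefix and identifier are taken from this record, while the function name `get_record_url` is hypothetical. The sketch only builds the URL and does not contact the repository.

```python
from urllib.parse import urlencode

# Base endpoint as it appears in the Metadata Access field of this record.
BASE_URL = "http://www.clarin.si/repository/oai/request"


def get_record_url(identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the URL matches the form used in the record.
    return BASE_URL + "?" + urlencode(params, safe=":/")


url = get_record_url("oai:www.clarin.si:11356/1034")
print(url)
```

The resulting URL can be passed to any HTTP client; the response is an XML document containing the Dublin Core metadata for this record.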