SYN2010: balanced corpus of written Czech
Balanced corpus of contemporary written Czech, 100 million words in size. It was created to represent the written language of 2005–2009 and thus contains a wide range of text types...
ORTOFON v1: balanced corpus of informal spoken Czech with multi-tier transcri...
ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole...
Corpus "Miljons"
Balanced corpus of Modern Latvian (~1 million running words, currently in plain text), publicly available via the Bonito interface
SYN2005: balanced corpus of written Czech
Balanced corpus of contemporary written Czech, 100 million words in size. It was created to represent the written language of 2000–2004 and thus contains a wide range of text types...
ORAL2008: Balanced corpus of informal spoken Czech
Balanced corpus of informal spoken Czech, 1 million words in size. It contains transcriptions of 297 recordings made in 2002–2007 throughout Bohemia. All the recordings were made in...
ORAL2013: balanced corpus of informal spoken Czech (transcriptions & audio)
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole...
ORAL2013: balanced corpus of informal spoken Czech (transcriptions)
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness etc.) in the area of the whole...