Read Speech Corpus (7G)

The corpus of read Lithuanian speech "7G" was compiled in 2015-2016. It consists of 352 audio recordings with a total duration of over 7 hours. Seven speakers read excerpts from books and a list of isolated words; the list reflects the diversity of triphones in Lithuanian. The audio recordings are stored as WAV files (PCM, 44.1 kHz, 16-bit, mono). Annotations are stored in MLF format, the format used by the HTK Toolkit. Most of the speakers are young women aged between 20 and 25. The aim was to obtain recordings in as natural an environment as possible, so no requirements were placed on the speakers regarding recording equipment, microphone settings or recording environment. Most of the speakers used personal laptops with a built-in microphone.
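
Because the recordings were made on uncontrolled equipment, it may be worth confirming that a downloaded file matches the stated audio format before further processing. The following is a minimal sketch using only the Python standard library; the function names and the dictionary keys are illustrative assumptions and not part of the corpus distribution.

```python
import wave

# Audio parameters stated in the corpus description:
# PCM, 44.1 kHz, 16-bit (2 bytes per sample), mono.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "frame_rate_hz": 44100}


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic parameters."""
    with wave.open(path, "rb") as wav:
        frame_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": frame_rate,
            "duration_s": wav.getnframes() / frame_rate,
        }


def matches_corpus_format(info):
    """Return True if the parameters agree with the stated corpus format."""
    return all(info[key] == value for key, value in EXPECTED.items())
```

Calling `describe_wav` on each file and summing `duration_s` would also give a rough check of the total duration stated above.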

Identifier
PID http://hdl.handle.net/20.500.11821/58
Metadata Access https://clarin.vdu.lt/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin.vdu.lt:20.500.11821/58
Provenance
Creator Raškinis, Gailius; Rudžionis, Vytautas
Publisher Vytautas Magnus University; Vilnius University
Publication Year 2017
Rights ACA_CLARIN-LT_End-User-Licence-Agreement_EN-LT; https://clarin.vdu.lt/licenses/eula/ACA_CLARIN-LT_End-User-Licence-Agreement_EN-LT.htm; ACA
OpenAccess true
Contact info(at)clarin.vdu.lt
Representation
Language Lithuanian
Resource Type corpus
Format text/plain; application/zip; text/plain; charset=utf-8; downloadable_files_count: 3
Discipline Linguistics
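
The Metadata Access address listed above has the shape of an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from its parts as shown below; the function name `build_getrecord_url` is a hypothetical helper, and the code only builds the address without contacting the server.

```python
from urllib.parse import quote, urlencode

# Endpoint and record identifier taken from the Metadata Access field.
OAI_ENDPOINT = "https://clarin.vdu.lt/oai/request"
RECORD_ID = "oai:clarin.vdu.lt:20.500.11821/58"


def build_getrecord_url(identifier, metadata_prefix="oai_dc"):
    """Assemble an OAI-PMH GetRecord URL for a single record."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Leave ':' and '/' unescaped so the result matches the listed URL.
    query = urlencode(params, safe=":/", quote_via=quote)
    return f"{OAI_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_getrecord_url(RECORD_ID))
```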