-
INEL Enets Corpus
Corpus Citation Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Beáta. 2024. INEL Enets Corpus. Version 1.0. Publication date 2024-11-30.... -
MeerKAT: Meerkat Kalahari Audio Transcripts
A large-scale reference dataset for bioacoustics. Please find the accompanying code at our official repository: github.com/livingingroups/animal2vec [Optional] You can find the... -
Pretrained models for animal2vec and MeerKAT: A self-supervised transformer f...
Model weights for animal2vec. Please find the accompanying code at our official repository: github.com/livingingroups/animal2vec. Here you find the model weights for the... -
ISLEX Dictionary (audio) (ELEXIS)
The data contains audio files for the Icelandic lemmas of the ISLEX dictionary. -
Annotated Route Description
This file set, consisting of a video stream, an audio stream and a multimodal annotation file, is frequently used as a showcase of how to do complex multimodal annotations with the... -
Czech Malach Cross-lingual Speech Retrieval Test Collection
The package contains Czech recordings from the Visual History Archive, which consists of interviews with Holocaust survivors. The archive consists of audio recordings, four... -
Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0)
The Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0) is a corpus of spoken language, consisting of 742,316 tokens and 73,835 sentences, representing 7,324 minutes... -
Audio and video support in EXMARaLDA
A short introduction to audio and video support in EXMARaLDA