Dataset - B2FIND

Information structure and historical English OV/VO variation

This dataset contains the data that is used in: Struik, Tara and Ans van Kemenade. Information structure and OV word order in Old and Middle English: a phase-based approach. To...

B2 Hausa

Hausa: complete set, status: final, manually transcribed, glossed and translated to English, annotated wrt. morphology, parts of speech, syntax, gramm. function, sem. roles,...

B1 Yom

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...

B1 Foodo

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...

B4 Ludolf

The text of this corpus, Ludolf von Sudheims Reise ins Heilige Land (Ludolf of Sudheim's Journey to the Holy Land), is a travel diary describing the adventures of a group of...

B4 Otfrid

The Referenzkorpus Altdeutsch (Old German Reference Corpus) records and annotates the oldest written records of German, from the beginning of continuous written transmission around 750 to about 1050...

A5 Hausa News

This corpus of news articles from the online news service of Deutsche Welle contains 4 texts with a total of 2017 tokens. CLARIN Metadata summary for A5 Hausa News...

B4 Heliand

Heliand 1, 4 and 5: complete text, status: final, digitalization, translation to Modern German, manually annotated with parts of speech, syntactic categories, grammatical...

B1 Aja

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...

B4 Tatian Corpus of Deviating Examples 2.1

The present corpus, the Tatian Corpus of Deviating Examples T-CODEX 2.1, provides morpho-syntactic and information structural annotation of parts of the Old High German...

A5 Hausa Umarnin Uwa

This corpus of Umarnin Uwa film transcripts contains 47 transcripts with a total of 10194 tokens. It provides information including automatic POS tagging, speaker and...

B4 Historisches Predigtenkorpus zum Nachfeld

HIPKON is the first corpus based on only one text type (sermons) and on one dialect area, Upper German (Bavarian-Alemannic). The sermons cover the time from Middle High German...

B2 Marghi

Full set: all focus related experiments, status: work in progress, large parts elicited, most of the data transcribed, partly annotated. CLARIN Metadata summary for B2 Marghi...

B1 Fon

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...

B2 Guruntum

Guruntum sample: sample, status: final, manually transcribed, glossed and translated to English, annotated wrt. morphology, parts of speech, syntax, gramm. function, sem. roles,...

B7 Wolof (Wikipedia)

The corpus comprises a collection of texts from the Wolof Wikipedia, randomly chosen for their near-standard orthography and language, and covering different topics....

B2 Bura

Full set: all focus-related experiments, status: work in progress, large parts elicited, most of the data transcribed, partly annotated. CLARIN Metadata summary for B2 Bura...

B4 Sächsische Weltchronik

The corpus contains a chronicle from the 13th century in Middle Low German. Description of the textual witnesses etc. in:...

B7 Wolof (web)

The corpus comprises a collection of texts from discussion forums on the web, randomly chosen for their near-standard orthography and language, and covering...

B4 Muspilli

Complete text, status: work in progress, digitalization, translation to English, manually annotated with parts of speech, syntactic category, grammatical function, clause...

21 datasets found