-
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2022 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2023 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2024 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Amharic WIC Corpus
Substantially cleaned version of existing morphologically annotated WIC Corpus. -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2014 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2016 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2020 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2017 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2021 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2013 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the... -
Tigrinya Web Corpus
Tigrinya web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated. -
CEHugeWebCorpus
This corpus was originally created for performance testing (server infrastructure CorpusExplorer - see: diskurslinguistik.net / diskursmonitor.de). It includes the filtered... -
Somali Web Corpus
Somali web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated. -
HWC2023 – Hamburg.de Website Corpus 2023
A petition for a referendum (called "Schluss mit Gendersprache in Verwaltung und Bildung", in English: "abolition of gender language in administration and education") was formed in... -
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2019 – VERSION 1)
German version: see below. The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed with the aim of enabling a broad-based linguistic analysis of the...