Corresponding and Non-Corresponding Edit-Turn-Pairs from the English Wikipedia.

The ETP-gold corpus is based on article edits and discussion page turns from the English Wikipedia. The ETP-gold-labels MTurk dataset contains the labels and metadata from the crowdsourced annotation task.

For the edit-turn-pair detection task, please refer to/cite:
Johannes Daxenberger and Iryna Gurevych (2014). "Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia." In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Short Papers.

For the crowdsourced annotation, please refer to/cite:
Emily K. Jamison and Iryna Gurevych (2014). "Needle in a Haystack: Reducing the Costs of Annotating Rare-Class Instances in Imbalanced Datasets." In: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing.