The CLEF-IP 2009 Test Collection

CLEF-IP: Cross-Language Evaluation Forum - Intellectual Property

The CLEF-IP track was launched in 2009 to investigate IR techniques for patent retrieval, and it is part of the CLEF 2009 evaluation campaign. The track uses a collection of more than 1 million patent documents derived from EPO (European Patent Office) sources. The collection contains documents in English, French, and German, with at least 100,000 documents in each language. The task is to find patent documents that constitute prior art. The topics are complete patent documents that participants can process to extract queries. In addition to the Main task, CLEF-IP 2009 provided three language tasks (English, German, French), in which the topics were in one of these three languages. Relevance judgements were produced by two methods: automatically, using patent citations from seed patents; and manually, for a small number of queries, with search results reviewed by intellectual property experts.

Files

Document Collection
The CLEF-IP 2009 collection of documents consists of XML files. There are 1.9 million XML files, corresponding to approximately 1 million individual patents filed between 1985 and 2000. A DTD file for the XML format is also provided.

Topics and Answers (Qrels)
Both the training and the test topic sets also contain the relevance assessments for the topics. For each task of the CLEF-IP 2009 track, we provide four topic test sets of different sizes: XLarge, Large, Medium, Small.

Guidelines
Contains a detailed explanation of how to work with the four tasks from the corpus.
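
Because the document collection is distributed as a large number of XML files, a first sanity check after unpacking might be to walk the directory tree and count the files by root element. The following is a minimal sketch only: it assumes the archive has been extracted to a local directory (the path in the usage line is hypothetical), it makes no assumption about the element names defined in the provided DTD, and the function names are illustrative rather than part of any CLEF-IP tooling.

```python
import os
import xml.etree.ElementTree as ET
from collections import Counter


def iter_xml_files(root_dir):
    """Yield the path of every .xml file found below root_dir."""
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if name.lower().endswith(".xml"):
                yield os.path.join(dirpath, name)


def root_tag(path):
    """Return the tag of the root element, reading only the start of the file."""
    with open(path, "rb") as handle:
        # The first "start" event always belongs to the root element.
        for _event, elem in ET.iterparse(handle, events=("start",)):
            return elem.tag
    return None


def summarize(root_dir):
    """Count the XML files under root_dir, grouped by root element tag."""
    counts = Counter()
    for path in iter_xml_files(root_dir):
        try:
            counts[root_tag(path)] += 1
        except ET.ParseError:
            # Keep going on malformed files, but make them visible in the summary.
            counts["<unparseable>"] += 1
    return counts
```

Calling `summarize("path/to/extracted/collection")` (a placeholder path) returns a `Counter` whose total can be compared against the 1.9 million files stated above. Streaming with `iterparse` avoids loading each document fully, which matters at this collection size.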

Identifier
DOI https://doi.org/10.48436/a2svx-p1y38
Related Identifier IsDescribedBy https://doi.org/10.1007/978-3-642-15754-7_47
Related Identifier IsVersionOf https://doi.org/10.48436/9sxbq-js515
Metadata Access https://researchdata.tuwien.ac.at/oai2d?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:researchdata.tuwien.ac.at:a2svx-p1y38
Provenance
Creator Piroi, Florina; Roda, Giovanna; Zenz, Veronika; Tait, John
Publisher TU Wien
Contributor Piroi, Florina
Publication Year 2021
Rights Creative Commons Attribution Non Commercial Share Alike 3.0 Unported; https://creativecommons.org/licenses/by-nc-sa/3.0/legalcode
OpenAccess true
Contact Piroi, Florina (TU Wien)
Representation
Language English
Resource Type Dataset
Version 1.0.0
Discipline Other