SYNERGY - Open machine learning dataset on study selection in systematic reviews

SYNERGY is a free and open dataset on study selection in systematic reviews, comprising 169,288 academic works from 26 systematic reviews. The dataset is binary labelled: only 2,834 (1.67%) of the academic works are included in the systematic reviews. This strong class imbalance makes SYNERGY well suited to developing information retrieval algorithms, especially for sparse labels. Because many variables are available per record (e.g. titles, abstracts, authors, references, topics), the dataset is useful for researchers in NLP, machine learning, network analysis, and more. In total, the dataset contains 82,668,134 trainable data points.
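The figures above can be checked with a few lines of arithmetic. This is only a sketch using the three counts stated in the description; the variable names are illustrative and not part of the dataset.

```python
# Counts taken from the dataset description above.
TOTAL_RECORDS = 169_288      # academic works across all reviews
INCLUDED_RECORDS = 2_834     # works labelled as included
N_REVIEWS = 26               # systematic reviews in the dataset

# Share of records that are included (the sparse positive class).
inclusion_rate = INCLUDED_RECORDS / TOTAL_RECORDS

# How many records a reviewer screens, on average, per included work.
records_per_inclusion = TOTAL_RECORDS / INCLUDED_RECORDS

# Average number of records per systematic review.
mean_records_per_review = TOTAL_RECORDS / N_REVIEWS

print(f"inclusion rate: {inclusion_rate:.2%}")
print(f"records per included work: {records_per_inclusion:.1f}")
print(f"mean records per review: {mean_records_per_review:.0f}")
```

Roughly one record in sixty is a positive, which is why plain accuracy is a poor metric on this data: a classifier that excludes everything would still be right for more than 98% of records.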

The easiest and recommended way to get the SYNERGY dataset is via the Python package "synergy-dataset". This flexible package downloads and builds the dataset. See https://github.com/asreview/synergy-dataset for all information.
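A minimal sketch of what working with the package might look like follows. It assumes the package is installed with `pip install synergy-dataset` and that it exposes a `Dataset` class with a `to_frame()` method, as recalled from the project README; check the repository before relying on these names. The helpers `load_review` and `inclusion_rate` are hypothetical and introduced here only for illustration.

```python
def load_review(name):
    """Return one SYNERGY review as a table of records.

    Assumption: `synergy_dataset.Dataset(name).to_frame()` exists and
    returns a pandas DataFrame; verify against the package README.
    """
    # Imported lazily so this snippet can be loaded without the package.
    from synergy_dataset import Dataset

    return Dataset(name).to_frame()


def inclusion_rate(labels):
    """Share of records labelled 1 (included) in an iterable of 0/1 labels."""
    labels = list(labels)
    if not labels:
        return 0.0
    return sum(labels) / len(labels)
```

A call such as `load_review("van_de_Schoot_2018")` would then download and build a single review, assuming that identifier is one of the dataset names listed in the repository. Passing the label column of the result to `inclusion_rate` gives the per-review share of included works; the name of that column is not stated in this record and should be looked up in the package documentation.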

Identifier
DOI https://doi.org/10.34894/HE6NAQ
Metadata Access https://dataverse.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34894/HE6NAQ
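The Metadata Access URL above is an OAI-PMH `GetRecord` request. As a sketch, it can be assembled from its parts with the Python standard library; the helper name `build_oai_url` is hypothetical, and no network request is made here.

```python
from urllib.parse import urlencode

OAI_ENDPOINT = "https://dataverse.nl/oai"
DATASET_DOI = "10.34894/HE6NAQ"


def build_oai_url(doi, metadata_prefix="oai_datacite"):
    """Build an OAI-PMH GetRecord URL for a DataverseNL dataset DOI."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"doi:{doi}",
    }
    # Keep ':' and '/' unescaped so the result matches the URL in this record.
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


metadata_url = build_oai_url(DATASET_DOI)
print(metadata_url)
```

Fetching `metadata_url` (for example with `urllib.request.urlopen`) should return the DataCite-formatted XML record, assuming the endpoint is reachable.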
Provenance
Creator De Bruin, Jonathan; Ma, Yongchao; Ferdinands, Gerbrich; Teijema, Jelle; Van de Schoot, Rens
Publisher DataverseNL
Contributor de Bruin, Jonathan; Van de Schoot, Rens
Publication Year 2023
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact de Bruin, Jonathan (Utrecht University); Van de Schoot, Rens (Utrecht University)
Representation
Resource Type Dataset
Format text/plain; text/csv; application/json; application/zip
Size 200; 266; 306; 331; 338; 322; 283; 305; 258; 320; 299; 415; 212; 315; 242; 237; 255; 185; 294; 316; 401; 317; 279; 263; 220; 310; 436; 989865; 150923; 103010; 852937; 654705; 187163; 317928; 3944082; 234822; 133950; 42227; 215250; 31919; 5372708; 388378; 460956; 672721; 746832; 114789; 19786; 96153; 451336; 490660; 535584; 63996; 298011; 31652; 19426; 14326; 469; 465; 468; 756; 908; 685; 568; 475; 533; 679; 577; 473; 691; 702; 470; 699; 546; 713; 578; 701; 906; 26426; 22477; 21552; 23201; 18392; 26090; 22778; 19490; 23775; 15452; 27224; 17741; 23550; 19633; 23494; 35711; 18638; 29397; 30058; 25491; 15345; 25908; 24548; 30628; 33943; 19662; 24619707; 1028035; 835096; 16028323; 2989015; 62992826; 11840006; 10396095; 3560967; 738232; 2106498; 4411063; 2404439; 14708207; 18534723; 11463283; 66466788; 3526740; 2355371; 5543909; 19428163; 5749455; 14007470; 15085090; 8164831; 6019723; 58869042; 34788857
Version 1.0
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Construction Engineering and Architecture; Engineering; Engineering Sciences; Life Sciences; Medicine; Social Sciences; Social and Behavioural Sciences; Soil Sciences