Pre-processing data for the Mega Meta Project

Understanding the etiology and risks for the onset, relapse, and chronicity of common mental disorders is key to identifying people at risk and improving preventive and acute treatment interventions. However, there is no overview of the evidence for factors that predict or are related to common mental disorders. Because of the sheer volume of literature (a big data problem), it is impossible to synthesize all evidence using traditional systematic reviews.

The Mega Meta project, funded by the Centre for Urban Mental Health and a cooperation between Amsterdam UMC, Utrecht University, and the University of Amsterdam, is a large systematic review that aimed to synthesize (meta-analyze) all prospective evidence for factors, mechanisms of change, and interactions of factors related to the onset, maintenance, and relapse/recurrence of three common mental disorders: anxiety, substance use, and depressive disorders. The systematic searches, selection, and data checks were conducted using ASReview between June 2021 and July 2022.

This DANS dataset is the result of https://github.com/asreview/paper-megameta-postprocessing-screeningresults

# The MegaMeta Output files

This repository contains the output files and the final data of the so-called Mega-Meta study on reviewing factors contributing to substance use, anxiety, and depressive disorders. The scripts used to generate the output can be found here: https://doi.org/10.5281/zenodo.5803268 (https://github.com/asreview/paper-megameta-postprocessing-screeningresults)

## Output files

The Steps & Files

Each step consists of input and output files. Only the output files are stored in this data repository. The output of one step serves as the input for the next step. The final result of the project can be found in the Final Data folder. To replicate the study up to the final results, you can follow these steps:

### 1. Search

The input and protocol for the search strategy can be found in this Open Science Framework (OSF) repository: https://doi.org/10.17605/OSF.IO/M5UHY.

1_Search_Output

This folder contains RIS-formatted .txt files for each of the three subjects: substance use, anxiety, and depressive disorders.

### 2. Preprocessing

The output from the previous step, the search, is the input for the preprocessing step. More information about the preprocessing scripts and protocol can also be found within the OSF repository mentioned above: https://doi.org/10.17605/OSF.IO/M5UHY

In short, the preprocessing consists of:

- Updating the references in EndNote
- Deduplicating the references in EndNote
- Deduplicating the references based on DOI in R
- Labeling inclusions and exclusions, which are to be used as prior knowledge.

2_Preprocessing_Output

This folder contains three .csv files, one dataset per subject. These datasets have been partly labeled, meaning that some of the records have been labeled as either relevant or irrelevant. These labeled records are also known as the prior knowledge, which is necessary for the next step.

### 3. Screening phase 1

The input for the first screening phase is the set of partly labeled datasets from step 2. The screening protocol used in screening phase 1 can be found here: https://doi.org/10.17605/OSF.IO/3ZNAR.

3_ScreeningPhase1_Output

This folder contains six files: an .xlsx and an .asreview file for each of the subjects. The .xlsx file is a human-readable dataset containing the screening decisions made by the screeners in screening phase 1. The .asreview file is a project file that can be uploaded to ASReview LAB to see all the decisions that have been made within the software itself. It also contains all the information on the trained model and settings up to that point.

### 4. Screening phase 2

The input for the second screening phase is the set of .xlsx files from the first screening phase. However, in the second screening phase, a different machine learning model, a 17-layer convolutional neural network, was used to optimize the screening progress. Find out more about this model and how the hyperparameters were trained in the GitHub repository: https://github.com/asreview/paper-megameta-hyperparameter-training

4_ScreeningPhase2_Output

Similar to the previous step, six files are present in the 4_ScreeningPhase2_Output folder: an .xlsx and an .asreview file per subject (anxiety, depression and substance abuse). These files contain the screening decisions from both the first and the second screening phase.

### 5. Postprocessing

The input for the postprocessing steps is the set of .xlsx files from the second screening phase. Read more about the postprocessing steps within this GitHub repository: https://github.com/asreview/paper-megameta-postprocessing-screeningresults

5_Postprocessing_Output

Throughout the postprocessing, several files are produced, each of which serves as input for the next part of the postprocessing pipeline.

- Merging the three .xlsx files results in: megameta_asreview_merged.xlsx
- Retrieving missing DOIs results in: megameta_asreview_doi_retrieved.xlsx
- Deduplication based on DOI and a conservative deduplication strategy results in: megameta_asreview_deduplicated.xlsx
- The quality of the labels is checked in two stages. The first stage checks for falsely excluded records and the second for falsely included records. Together they result in megameta_asreview_quality_checked.xlsx

If you want access to the database, please contact Dr. Brouwer: m.e.brouwer@amsterdamumc.nl
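
The merge and DOI-based deduplication mentioned in the preprocessing and postprocessing steps can be illustrated with a minimal sketch. This is not the project's own code (the description states that preprocessing deduplication was done in R, and the actual scripts live in the linked GitHub repository). The Python functions `normalize_doi` and `deduplicate_by_doi`, the `title` and `doi` field names, and the example records are all hypothetical, and the project's "conservative deduplication strategy" is not reproduced here.

```python
def normalize_doi(raw):
    """Return a lowercase DOI without a resolver prefix, or None if empty."""
    if not raw:
        return None
    doi = raw.strip().lower()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip() or None


def deduplicate_by_doi(records):
    """Keep the first record per DOI; records without a DOI are always kept."""
    seen = set()
    unique = []
    for record in records:
        doi = normalize_doi(record.get("doi"))
        if doi is None:
            # No DOI to compare on, so the record cannot be matched and is kept.
            unique.append(record)
        elif doi not in seen:
            seen.add(doi)
            unique.append(record)
    return unique


# Hypothetical example records, one list per subject (assumed field names).
anxiety = [{"title": "A", "doi": "10.1000/abc"}]
depression = [
    {"title": "A (copy)", "doi": "https://doi.org/10.1000/ABC"},
    {"title": "B", "doi": ""},
]
substance_use = [{"title": "C", "doi": "doi:10.1000/xyz"}]

# Merge the three subject datasets, then drop DOI duplicates.
merged = anxiety + depression + substance_use
deduplicated = deduplicate_by_doi(merged)
print([record["title"] for record in deduplicated])  # ['A', 'B', 'C']
```

Keeping records that lack a DOI is a deliberately cautious choice in this sketch: dropping them would risk losing unique references, which is presumably why the pipeline retrieves missing DOIs before deduplicating.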

Date Submitted: 2022-08-19

Issued: 2022-07-15

Identifier
DOI https://doi.org/10.17026/dans-29d-n6yg
Metadata Access https://lifesciences.datastations.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.17026/dans-29d-n6yg
Provenance
Creator M.E. Brouwer
Publisher DANS Data Station Life Sciences
Contributor M.E. Brouwer; L. Hofstee (Utrecht University); S. van den Brand (Utrecht University); J. Teijema (Utrecht University); V. Melnikov (University of Amsterdam); G. Ferdinands (Utrecht University); B. Kramer (Utrecht University); J. de Boer (Utrecht University); F. Weijdema (Utrecht University); P. Lucassen (University of Amsterdam); P. Sloot (University of Amsterdam); K. Stronks (Amsterdam UMC); J. van Weert (University of Amsterdam); R. Wiers (University of Amsterdam); C. Bockting; R. van de Schoot (Utrecht University); J. de Bruin (Utrecht University)
Publication Year 2024
Rights DANS Licence; info:eu-repo/semantics/restrictedAccess; https://doi.org/10.17026/fp39-0x58
OpenAccess false
Contact M.E. Brouwer (Amsterdam UMC)
Representation
Resource Type Dataset
Format application/octet-stream; application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; text/plain; text/csv; application/zip
Size 135776808; 37337; 15743; 4403831; 38697926; 219698651; 101490337; 118054318; 1582257; 38470986; 99931889; 146117209; 46602788; 265870636; 201506527; 40893; 18854; 3661719; 1993571; 100367042; 495017768; 257416002; 66332; 135766733; 188447268; 172178865; 275; 4368; 295; 5949; 162496126; 46455; 1703695; 243030860; 119744392; 36269; 47567271; 3155750
Version 1.0
Discipline Life Sciences; Medicine