Supplementary Material for "Process Data Properties Matter: Introducing Gated Convolutional Neural Networks (GCNN) and Key-Value-Predict Attention Networks (KVP) for Next Event Prediction with Deep Learning"

Supplementary material for the article: Heinrich, Kai; Zschech, Patrick; Janiesch, Christian; Bonin, Markus: Process Data Properties Matter: Introducing Gated Convolutional Neural Networks (GCNN) and Key-Value-Predict Attention Networks (KVP) for Next Event Prediction with Deep Learning. In: Decision Support Systems, 2021.

Abstract: "Predicting next events in predictive process monitoring enables companies to manage and control processes at an early stage and reduce their action distance. In recent years, approaches have steadily moved from classical statistical methods towards the application of deep neural network architectures, which outperform the former and enable analysis without explicit knowledge of the underlying process model. While the focus of prior research is on the long short-term memory network architecture, more deep learning architectures offer promising extensions that have proven useful for other applications of sequential data. In our work, we introduce a gated convolutional neural network and a key-value-predict attention network to the task of next event prediction. In a comprehensive evaluation study on 11 real-life benchmark datasets, we show that these two novel architectures surpass prior work in 34 out of 44 metric-dataset combinations. For our evaluation, we consider the effects of process data properties, such as sparsity, variation, and repetitiveness, and discuss their impact on the prediction quality of the different deep learning architectures. Similarly, we evaluate their classification properties in terms of generalization and handling class imbalance. Our results provide guidance for researchers and practitioners alike on how to select, validate, and comprehensively benchmark (novel) predictive process monitoring models. In particular, we highlight the importance of sufficiently diverse process data properties in event logs and the comprehensive reporting of multiple performance indicators to achieve meaningful results."

The data is available under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0): http://creativecommons.org/licenses/by-sa/4.0/

The software code is available under the GNU General Public License, version 3 (GNU GPL v3): http://www.gnu.org/licenses/gpl-3.0.html

Use of this data in academic publications is explicitly permitted.

The dataset was created jointly by researchers working at the Technische Universität Dresden and TU Dortmund University.

Identifier
DOI https://doi.org/10.23728/b2share.08b7ff704f724b94a61b4a6cac0fe1e0
Source https://b2share.eudat.eu/records/08b7ff704f724b94a61b4a6cac0fe1e0
Metadata Access https://b2share.eudat.eu/api/oai2d?verb=GetRecord&metadataPrefix=eudatcore&identifier=oai:b2share.eudat.eu:b2rec/08b7ff704f724b94a61b4a6cac0fe1e0
Provenance
Creator Heinrich, Kai; Zschech, Patrick; Janiesch, Christian; Bonin, Markus
Publisher EUDAT B2SHARE; Julius-Maximilians-Universität Würzburg, Technische Universität Dresden
Publication Year 2021
Rights GNU General Public License 3 (GPL-3.0); info:eu-repo/semantics/openAccess
OpenAccess true
Contact kai.heinrich(at)tu-dresden.de
Representation
Language English
Resource Type Dataset
Format xlsx; zip; docx; txt
Size 214.7 kB; 4 files
Version 1
Discipline 5.3.10.1 → Information systems → Management information systems; 5.6.23 → Engineering → Computer Science