The Colorectal_Cancer_IHC_CISH_HE_Epithelium_Segmentation dataset

This dataset provides high-resolution histopathological images and annotations for AI training in colorectal cancer analysis. The collection includes Tissue Microarray (TMA) cores from 100 patients, featuring both normal colorectal mucosa and cancer tissue. The full dataset comprises triplicate cores and 13 protein markers.

TMA Cores:
Normal Colorectal Mucosa: 3 cores per patient (one core per image file, with a corresponding epithelium segmentation mask file)
Colorectal Cancer: 3 cores per patient (one core per image file, with a corresponding epithelium segmentation mask file)

Markers:
Hematoxylin and eosin (H&E) stained cores
Immunohistochemistry (IHC) for 13 proteins: E-Cadherin, Vimentin, Smooth Muscle Actin (SMA), Ki-67, SMAD3, MACC1, LASP1, CD44, NAIP, KLF5, FSCN1, CTNND1, and KRAS
ISH stains for two miRNAs (miR-143, miR-145), an ISH positive control (U6 snRNA), and ISH negative controls (scrambled probe)

Each image file contains one core and is paired with a mask file in which the epithelium (both normal and cancer epithelial tissue) is fully annotated, for deep learning-based segmentation analysis. The dataset includes the original microscopy images and segmentation masks, as well as quantitative measurements (SPSS data file) for a subset of the markers.
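
Because each core image is paired with one mask file, a training pipeline typically starts by matching the two. The following is a minimal sketch of that pairing step. The naming convention it uses (a mask shares its image's file stem plus a "_mask" suffix), the ".png" extension, and the function name pair_images_with_masks are all assumptions made for illustration; the actual file names and formats inside the archives are not described here and should be checked after unpacking.

```python
from pathlib import Path


def pair_images_with_masks(root, image_ext=".png", mask_suffix="_mask"):
    """Return a list of (image_path, mask_path) pairs found under root.

    ASSUMPTION: a mask shares its image's stem plus mask_suffix, for example
    core_001.png and core_001_mask.png. This convention is hypothetical;
    adjust image_ext and mask_suffix to match the unpacked dataset.
    """
    root = Path(root)
    pairs = []
    for image in sorted(root.rglob(f"*{image_ext}")):
        if image.stem.endswith(mask_suffix):
            continue  # this file is itself a mask, not a core image
        mask = image.with_name(f"{image.stem}{mask_suffix}{image_ext}")
        if mask.exists():
            pairs.append((image, mask))
    return pairs
```

Images without a matching mask are skipped rather than reported as errors, which keeps the sketch short; a stricter version could collect them in a separate list for inspection.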

Identifier
DOI https://doi.org/10.18710/DIGQGQ
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/DIGQGQ
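
The Metadata Access link above is an OAI-PMH GetRecord request. As a rough sketch, the same URL can be assembled from its parts and fetched with the Python standard library. The endpoint, verb, metadata prefix, and DOI are taken from the link itself; the function names metadata_url and fetch_metadata are illustrative only.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_ENDPOINT = "https://dataverse.no/oai"
DATASET_DOI = "10.18710/DIGQGQ"


def metadata_url(doi=DATASET_DOI, prefix="oai_datacite"):
    """Build the OAI-PMH GetRecord URL for the given DOI."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": prefix,
            "identifier": f"doi:{doi}",
        },
        safe=":/",  # keep the DOI readable instead of percent-encoding it
    )
    return f"{OAI_ENDPOINT}?{query}"


def fetch_metadata(doi=DATASET_DOI, timeout=30):
    """Download the metadata record as XML text (requires network access)."""
    with urlopen(metadata_url(doi), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling metadata_url() with its defaults reproduces the Metadata Access link listed above; fetch_metadata() is only useful when the server is reachable.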
Provenance
Creator Pettersen, Henrik Sahlin; Wiik, Erik Nesje
Publisher DataverseNO
Contributor Pettersen, Henrik Sahlin; Faculty of Medicine and Health Sciences; NTNU – Norwegian University of Science and Technology
Publication Year 2025
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Pettersen, Henrik Sahlin (Consultant Pathologist / Associate Professor, St. Olav's Hospital / NTNU – Norwegian University of Science and Technology, Trondheim, Norway.)
Representation
Resource Type Histopathology 40x image scans (one TMA core per image) and corresponding epithelium segmentation mask files; Dataset
Format text/plain; application/zip
Size 6483; 16293019448; 16400703873; 18455670337; 17249109719; 18941510930; 16133520010; 19589198914; 17908930329; 18953915808; 17325645685; 89612679466; 89686240529; 17400667362; 22556183176; 17559760662; 16755933375; 31612; 23053045812; 17095274924
Version 1.0
Discipline Life Sciences; Medicine
Spatial Coverage Trondheim, Norway
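
To plan storage before downloading, the Size field above can be totalled with a few lines of Python. This sketch assumes the values are per-file sizes in bytes, which the record does not state explicitly; the 1 MB threshold used to separate the two small files from the large archives is likewise an arbitrary choice for illustration.

```python
# Size field copied verbatim from the record; ASSUMED to be bytes per file.
SIZE_FIELD = (
    "6483; 16293019448; 16400703873; 18455670337; 17249109719; "
    "18941510930; 16133520010; 19589198914; 17908930329; 18953915808; "
    "17325645685; 89612679466; 89686240529; 17400667362; 22556183176; "
    "17559760662; 16755933375; 31612; 23053045812; 17095274924"
)

sizes = [int(part) for part in SIZE_FIELD.split(";")]
total_bytes = sum(sizes)

# Files under 1 MB are presumably the text/plain entries, not zip archives.
small_files = [s for s in sizes if s < 1_000_000]
archives = [s for s in sizes if s >= 1_000_000]

print(f"{len(sizes)} files, {total_bytes / 1e9:.1f} GB in total")
print(f"{len(archives)} large archives, largest {max(archives) / 1e9:.1f} GB")
print(f"{len(small_files)} small files")
```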