This dataset provides high-resolution histopathological images and annotations for AI training in colorectal cancer analysis. The collection includes Tissue Microarray (TMA) cores from 100 patients, covering both normal colorectal mucosa and cancer tissue. The full dataset contains triplicate cores for each tissue type and immunohistochemistry for 13 protein markers.
TMA Cores:
Normal Colorectal Mucosa: 3 cores per patient (one core per image file and corresponding epithelium segmentation mask file)
Colorectal Cancer: 3 cores per patient (one core per image file and corresponding epithelium segmentation mask file)
Markers:
Hematoxylin and eosin (H&E) stained cores
Immunohistochemistry (IHC) for 13 proteins: E-Cadherin, Vimentin, Smooth Muscle Actin (SMA), Ki-67, SMAD3, MACC1, LASP1, CD44, NAIP, KLF5, FSCN1, CTNND1, and KRAS
In situ hybridization (ISH) stains for two miRNAs (miR-143 and miR-145), an ISH positive control (U6 snRNA), and an ISH negative control (scrambled probe)
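As a rough sanity check when downloading the data, the nominal design listed above can be written down and counted. This is only a sketch: the tissue labels and variable names are illustrative, and it assumes every patient contributes all three cores per tissue type for a given stain, which the description does not guarantee (TMA cores can be lost or excluded).

```python
# Nominal design taken from the description above.
PATIENTS = 100
TISSUE_TYPES = ("normal_mucosa", "cancer")  # illustrative labels, not file names
CORES_PER_PATIENT_PER_TISSUE = 3

STAINS = (
    ["H&E"]
    # 13 immunohistochemistry protein markers
    + ["E-Cadherin", "Vimentin", "SMA", "Ki-67", "SMAD3", "MACC1", "LASP1",
       "CD44", "NAIP", "KLF5", "FSCN1", "CTNND1", "KRAS"]
    # ISH stains for two miRNAs
    + ["miR-143", "miR-145"]
    # ISH positive and negative controls
    + ["U6 snRNA", "Scrambled probe"]
)

# Upper bound on cores per stain, assuming no core is missing (an assumption).
cores_per_stain = PATIENTS * len(TISSUE_TYPES) * CORES_PER_PATIENT_PER_TISSUE

print(len(STAINS))       # 18 stain types in total
print(cores_per_stain)   # 600 cores per stain at most
```

Actual file counts may be lower than this upper bound, so the numbers are best used to spot gross download errors rather than to validate completeness.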
Each image file contains one core and is paired with a mask file in which the epithelium (both normal and cancer epithelial tissue) is fully annotated, for deep learning-based segmentation analysis. The dataset includes the original microscopy images and segmentation masks, as well as quantitative measurements (SPSS data file) for a subset of the markers.
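Because each core image has a corresponding epithelium mask, a typical first step before training is to pair the two and flag images without a mask. The sketch below shows one way to do this with the Python standard library. The file naming convention is not specified in this description, so the "_mask" suffix, the separate image and mask folders, and the function name pair_images_with_masks are all hypothetical and should be adapted to the actual layout of the download.

```python
from pathlib import Path

MASK_SUFFIX = "_mask"  # hypothetical naming convention, adjust to the real files


def pair_images_with_masks(image_dir, mask_dir, mask_suffix=MASK_SUFFIX):
    """Pair each core image with its mask by file stem.

    Returns a dict {image_stem: (image_path, mask_path)} and a list of
    image paths for which no mask was found.
    """
    # Index mask files by stem (file name without extension).
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}

    pairs, missing = {}, []
    for image in sorted(Path(image_dir).iterdir()):
        if not image.is_file():
            continue
        mask = masks.get(image.stem + mask_suffix)
        if mask is None:
            missing.append(image)
        else:
            pairs[image.stem] = (image, mask)
    return pairs, missing
```

Calling pair_images_with_masks("images", "masks") with the (assumed) folder names returns the matched pairs plus the unmatched images, which makes it easy to confirm that every core used for segmentation training actually has an annotation.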