This dataset comprises 103 labelled shapefiles, each covering approximately 60 x 60 m, that represent different forest types. Seventy-nine sites are forest plots visited during AWI field expeditions in summer 2018 in Chukotka and Western Yakutia, and in summer 2021 in Central Yakutia, Siberia, Russia. Twenty-four additional sites were identified by photo-interpretation of Google Earth Imagery© and Sentinel-2 imagery, supported by regional expert knowledge. Each plot corresponds to a single shapefile with a single label.

The labels are the following:
• Larix woodland, class name 'Sparse Larch' (Class 0)
• Open Larix forest, class name 'Medium Larch' (Class 1)
• Closed Larix forest, class name 'Dense Larch' (Class 2)
• Needle-leaf evergreen forest (Pinus, Picea), class name 'Evergreen' (Class 3)
• Mixed broad-leaf and needle-leaf summergreen forest, class name 'Mixed Summergreen' (Class 4)
• Mixed needle-leaf and broad-leaf summergreen and evergreen forest, class name 'Mixed Summergreen-Evergreen' (Class 5)
• 'Burnt or Barre' (Class 6)

We defined the labels based on the percentage of tree species present at the site, as recorded during field visits (Kruse et al. 2019, Morgenstern et al. 2023), and on the percentage of crown cover over the plot, following this protocol:
• When the forest plot consisted of one tree species, the label consists of this species only. E.g. 60% Larix crown cover → Label = Larch (sparse, medium, or dense: classes 0, 1, 2)
• When the plot consisted of two tree species, one of which covered less than 10%, the label was assigned to the dominating species. E.g. 60% Larix & 5% Pinus → Label = Larch (sparse, medium, or dense: classes 0, 1, 2)
• When the plot consisted of multiple tree species with comparable coverage (relative difference <20%), the label was assigned as mixed forest. There are two cases: the 'Mixed Summergreen' label (e.g. Larix and Betula), class 4, and the 'Mixed Summergreen-Evergreen' label (e.g. Larix and Picea), class 5.

Because larches are abundant in this region and crown cover data were available, we created more detailed labels for larch forest. For the larch forest crown cover, we used the in-situ data estimated in the field. For validation, we calculated the percentage of tree crown cover from two sources: LiDAR point clouds recorded in 2021 with a Mapper (YellowScan) carried by a DJI M300 drone covering each vegetation plot, and structure-from-motion point clouds derived from imagery recorded in 2018 with the built-in RGB camera of a DJI Phantom 4 (Brieger et al. 2019). From the point clouds, we extracted the mean tree crown cover percentage (for trees higher than 2 meters) from a Canopy Height Model for the plot area, and used it to validate our field estimates of crown cover percentage. Both had similar values, varying between 0% (meaning no trees) and 100% (meaning the surface is entirely covered by tree canopy). We identified three categories: Sparse Larix (Crown cover 50%).

To enrich and balance the training dataset, we added a total of 24 plots with 'Evergreen' and 'Mixed Summergreen-Evergreen' labels, chosen with expert knowledge using the Sentinel-2 late-summer Normalized Difference Vegetation Index and Google Earth imagery©.
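The crown cover validation step described above can be illustrated with a minimal Python sketch. It is not the authors' processing code. It assumes the Canopy Height Model for a plot is already available as a 2D array of heights in meters, with NaN marking no-data pixels, and that crown cover is taken as the share of valid pixels taller than 2 meters. The function name `crown_cover_percent` and the toy array are hypothetical.

```python
import numpy as np


def crown_cover_percent(chm, min_tree_height=2.0):
    """Percentage of valid CHM pixels higher than min_tree_height (meters).

    Assumption: chm is a 2D array of canopy heights clipped to the plot,
    with NaN for no-data pixels. This is an illustrative sketch only.
    """
    chm = np.asarray(chm, dtype=float)
    valid = ~np.isnan(chm)
    if not valid.any():
        raise ValueError("CHM contains no valid pixels")
    # A pixel counts as tree canopy when its height exceeds the threshold.
    canopy = chm[valid] > min_tree_height
    return 100.0 * canopy.mean()


# Toy 4 x 4 CHM (heights in meters); one pixel is no-data.
toy_chm = np.array([
    [0.0, 0.5, 3.1, 4.0],
    [0.2, 2.5, 5.0, np.nan],
    [0.0, 0.0, 1.9, 2.0],
    [6.2, 0.3, 0.0, 0.1],
])
print(round(crown_cover_percent(toy_chm), 1))  # 33.3
```

In the toy array, 5 of the 15 valid pixels exceed 2 meters, which gives about 33.3%. A value computed this way could then be compared with the field estimate for the same plot.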
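The species-based labelling protocol described above can be sketched as a small decision function. This is a hedged illustration rather than the authors' implementation, and it rests on several assumptions that the description does not settle: the input is a hypothetical dictionary mapping genus to crown cover percentage, the "relative difference" is computed as (highest - lowest) / highest, the two-species rule is generalised to any number of minor species, and the larch density class (sparse, medium, dense) is not resolved. Combinations not covered by the protocol return `None`.

```python
# Genus groupings taken from the examples in the description.
SUMMERGREEN = {"Larix", "Betula"}
EVERGREEN = {"Pinus", "Picea"}


def _single_species_label(genus):
    """Label for a plot dominated by one genus (density class not resolved)."""
    if genus == "Larix":
        return "Larch"
    if genus in EVERGREEN:
        return "Evergreen"
    return None  # no class is described for other single-genus plots


def assign_label(cover, minor_threshold=10.0, mixed_rel_diff=0.20):
    """Assign a forest label from {genus: crown cover percent}.

    Assumptions (not confirmed by the source): relative difference is
    (highest - lowest) / highest; uncovered cases return None.
    """
    present = {g: c for g, c in cover.items() if c > 0}
    if not present:
        return None
    ranked = sorted(present.items(), key=lambda kv: kv[1], reverse=True)
    top_genus, top_cover = ranked[0]
    # Rules 1 and 2: one species, or every other species below 10% cover.
    if all(c < minor_threshold for _, c in ranked[1:]):
        return _single_species_label(top_genus)
    # Rule 3: species with comparable coverage form a mixed label.
    lowest_cover = ranked[-1][1]
    if (top_cover - lowest_cover) / top_cover < mixed_rel_diff:
        genera = set(present)
        if genera <= SUMMERGREEN:
            return "Mixed Summergreen"
        if genera <= EVERGREEN:
            return "Evergreen"
        if genera & SUMMERGREEN and genera & EVERGREEN:
            return "Mixed Summergreen-Evergreen"
    return None


print(assign_label({"Larix": 60}))
print(assign_label({"Larix": 60, "Pinus": 5}))
print(assign_label({"Larix": 40, "Betula": 35}))
print(assign_label({"Larix": 40, "Picea": 38}))
```

The first two calls mirror the worked examples in the protocol and yield a larch label. The last two illustrate the two mixed cases, using made-up cover values.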