Metabarcoding has become the workhorse of community ecology. Sequencing a taxonomically informative DNA fragment from environmental samples gives fast access to community composition across taxonomic groups, but it relies on the assumption that the number of sequences obtained for each taxon correlates with its abundance in the sampled community. However, gene copy number varies among and within taxa, and the extent of this variability must therefore be considered when interpreting community composition data derived from environmental sequencing. Here we used single-cell qPCR to measure the SSU rDNA gene copy number of 139 specimens belonging to five species of planktonic foraminifera. We found that the average gene copy number ranged from ~4 000 to ~50 000 copies among species, and that individuals of the same species can carry from ~300 to more than 350 000 copies. This variability cannot be explained by differences in cell size. Having considered all plausible sources of bias, we conclude that it likely reflects dynamic genomic processes acting during the life cycle. We used the observed variability to model its impact on metabarcoding and found that applying a species-level correction factor may correct the derived relative abundances, provided that sufficiently large populations have been sampled.
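The logic of a species-level correction can be illustrated with a minimal sketch. It is not the model used in this study: the log-normal distribution of per-individual copy numbers, the value of `sigma`, the three-species community and all function names are assumptions made for illustration, and the sequencing signal is simply taken to be proportional to the total number of gene copies contributed by each species.

```python
import math
import random

def simulate_copies(n_individuals, mean_copies, sigma, rng):
    """Total rDNA copies contributed by n individuals of one species.

    Per-individual copy number is drawn from a log-normal distribution
    (an assumption of this sketch) whose arithmetic mean is mean_copies.
    """
    mu = math.log(mean_copies) - sigma ** 2 / 2  # ensures E[X] = mean_copies
    return sum(rng.lognormvariate(mu, sigma) for _ in range(n_individuals))

def relative(values):
    """Scale a list of non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def corrected_abundance(copy_totals, mean_copies):
    """Divide each species' signal by its mean copy number, then renormalise."""
    return relative([c / m for c, m in zip(copy_totals, mean_copies)])

def mean_abs_error(estimate, truth):
    """Mean absolute difference between estimated and true proportions."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)

# Hypothetical community: three species with illustrative mean copy numbers
# spanning the ~4 000 to ~50 000 range reported above.
mean_copies = [4_000, 15_000, 50_000]
sigma = 1.5                      # assumed spread of log copy number
rng = random.Random(42)          # fixed seed for reproducibility

def run(counts):
    """Return (uncorrected error, corrected error) for given specimen counts."""
    truth = relative(counts)
    totals = [simulate_copies(n, m, sigma, rng)
              for n, m in zip(counts, mean_copies)]
    raw = relative(totals)       # signal taken as proportional to copy totals
    corrected = corrected_abundance(totals, mean_copies)
    return mean_abs_error(raw, truth), mean_abs_error(corrected, truth)

small = run([5, 3, 2])           # few specimens: individual variation dominates
large = run([5000, 3000, 2000])  # many specimens: species means dominate

print("small population (raw, corrected):", small)
print("large population (raw, corrected):", large)
```

Under these assumptions, the corrected estimate for the large population is expected to lie much closer to the true proportions than the uncorrected one, whereas with only a handful of specimens the variation among individuals can outweigh the benefit of the correction.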
Supplementary Material 1. Geographic origin, taxonomic identification and SSU rDNA gene copy number quantification obtained with qPCR for the 139 specimens. Sequences are provided for the fragment 45E-47F.