Sorting things out: assessing effects of unequal specimen biomass on DNA metabarcoding
Environmental bulk samples often contain many different taxa that vary by several orders of magnitude in biomass. This can be problematic in DNA metabarcoding and metagenomic high‐throughput sequencing approaches, as large specimens contribute disproportionately high amounts of DNA template. Thus, a few specimens of high biomass will dominate the dataset, potentially leading to smaller specimens remaining undetected. Sorting samples by specimen size (as a proxy for biomass) and balancing the amounts of tissue used per size fraction should improve detection rates, but this approach has not been systematically tested. Here, we explored the effects of size sorting on taxa detection using two freshwater macroinvertebrate bulk samples collected from a low‐mountain stream in Germany. Specimens were morphologically identified and sorted into three size classes (body size < 2.5 × 5, 5 × 10, and up to 10 × 20 mm). Tissue powder from each size category was extracted individually and pooled based on tissue weight to simulate samples that were not sorted by biomass (“Unsorted”). Additionally, size fractions were pooled so that each specimen contributed approximately equal amounts of biomass (“Sorted”). Mock samples were amplified using four different DNA metabarcoding primer sets targeting the cytochrome c oxidase I (COI) gene. Sorting taxa by size and pooling them proportionately according to their abundance led to a more equal amplification of taxa compared to the processing of complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples at the same sequencing depth. Our results imply that sequencing depth can be decreased approximately fivefold when sorting the samples into three size classes and pooling by specimen abundance. Even coarse size sorting can substantially improve taxa detection using DNA metabarcoding.
While high‐throughput sequencing will become more accessible and cheaper in the coming years, sorting bulk samples by specimen biomass or size is a simple yet efficient method to reduce current sequencing costs.
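The difference between the two pooling schemes can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in the study: the function names, the specimen counts, and the tissue weights are hypothetical values chosen for illustration, and the sketch assumes that the share of DNA template a size fraction contributes is proportional to the amount of tissue pooled from it.

```python
def fraction_shares(fractions, scheme):
    """Share of the pooled template contributed by each size fraction.

    fractions: dict mapping fraction name -> (n_specimens, tissue_mg)
    scheme:    "unsorted" pools in proportion to tissue weight,
               "sorted" pools in proportion to specimen count.
    """
    if scheme == "unsorted":
        weights = {name: mg for name, (n, mg) in fractions.items()}
    elif scheme == "sorted":
        weights = {name: n for name, (n, mg) in fractions.items()}
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def per_specimen_shares(fractions, scheme):
    """Average share of the pooled template per specimen in each fraction."""
    shares = fraction_shares(fractions, scheme)
    return {name: shares[name] / fractions[name][0] for name in fractions}


# Hypothetical bulk sample (illustrative numbers, not data from the study):
# fraction name -> (number of specimens, total tissue weight in mg)
example = {
    "small": (100, 50.0),
    "medium": (40, 200.0),
    "large": (10, 750.0),
}

unsorted_shares = per_specimen_shares(example, "unsorted")
sorted_shares = per_specimen_shares(example, "sorted")

for name in example:
    print(name, unsorted_shares[name], sorted_shares[name])
```

With these illustrative numbers, a large specimen contributes 150 times more template than a small one when pooling by tissue weight, whereas pooling by specimen count gives every specimen the same average share. The sketch ignores primer bias and differences in DNA yield per unit of tissue, both of which affect real read counts.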