Hi, I've read that it is possible to pool the FASTQ files of Illumina reads from different metabarcoding samples into one file and then continue with the analysis. I've even heard that you can combine reads from other runs, if they are from the same environment. I would think that if you are going to combine samples from different runs, they should also have the same sequencing depth, or the relative abundance estimates of the biological sequences will be all wrong. Could anyone comment? Thanks.
If you plan to mark duplicates, reads from different flowcells, lanes, or libraries cannot be considered optical/PCR duplicates of each other, so you'll have to assign a distinct read group to each of those combinations. Furthermore, tools like BWA use a subset of reads to estimate the average segment (insert) length, so pooling libraries with different insert sizes into one file can skew that estimate.
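To make the read-group point concrete, here is a minimal sketch of how you could derive one read group per flowcell/lane/library from the first read name of each FASTQ file. It assumes Casava 1.8+ style Illumina read names (`@instrument:run:flowcell:lane:...`); the function name `read_group_line`, the ID layout, and the example read name, sample, and library are made up for illustration, so adapt them to your own files.

```python
import re

# Assumption: Casava 1.8+ style read names, i.e.
# @<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y> <read>:<filtered>:<control>:<index>
HEADER_RE = re.compile(r"^@([^:\s]+):([^:\s]+):([^:\s]+):(\d+):")


def read_group_line(fastq_header, sample, library):
    """Build a SAM @RG header line from one FASTQ read name (hypothetical helper)."""
    match = HEADER_RE.match(fastq_header)
    if match is None:
        raise ValueError(f"unrecognised read name: {fastq_header!r}")
    _instrument, _run, flowcell, lane = match.groups()
    # One read group per flowcell/lane/library combination.
    rg_id = f"{flowcell}.{lane}.{library}"
    fields = [
        "@RG",
        f"ID:{rg_id}",
        f"SM:{sample}",
        f"LB:{library}",
        f"PU:{flowcell}.{lane}",
        "PL:ILLUMINA",
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    # Made-up read name, sample, and library, for illustration only.
    header = "@M01234:55:000000000-ABCDE:1:1101:15589:1332 1:N:0:1"
    print(read_group_line(header, sample="soilA", library="lib1"))
```

The returned line is tab-separated, as in a SAM header. If you pass it to `bwa mem -R`, the tabs need to be written as the two-character escape `\t` instead of real tab characters.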