I'm trying to work with some Illumina shotgun metagenomic reads (2x150bp). I've tried merging the forward and reverse reads with BBMerge and PEAR, but both tools merge only about 30% of the reads at most.
Would I be right in assuming that this is due to the shotgun shearing producing some larger inserts where the forward and reverse reads never actually overlap?
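To make that assumption concrete, here is a minimal sketch of the arithmetic I have in mind. The function names and the `min_overlap=10` threshold are my own illustrative choices, not the settings of either tool, and the model ignores sequencing errors and adapter read-through.

```python
def expected_overlap(read_len: int, insert_size: int) -> int:
    """Number of bases covered by both reads of a pair from one fragment."""
    # Fragments shorter than one read are covered entirely by both reads;
    # longer fragments overlap by 2 * read_len - insert_size, floored at 0.
    return max(0, min(insert_size, 2 * read_len - insert_size))


def can_merge(read_len: int, insert_size: int, min_overlap: int = 10) -> bool:
    """True if the pair overlaps by at least min_overlap bases (assumed threshold)."""
    return expected_overlap(read_len, insert_size) >= min_overlap


if __name__ == "__main__":
    # Hypothetical insert sizes for 2x150bp reads.
    for insert in (200, 250, 295, 350, 500):
        print(insert, expected_overlap(150, insert), can_merge(150, insert))
```

Under this simple model, any fragment longer than roughly 290bp could not be merged with 2x150bp reads, so a library with mostly larger inserts would give a low merge rate.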
If this is the case, would there be any benefit to merging the reads before DIAMOND analysis, or would just processing Read 1 and Read 2 separately be preferred?
In a protocol for DIAMOND and MEGAN analysis here, they suggest merging paired-end reads using fastq-join (which I assume would give similar results to BBMerge and PEAR) and then concatenating the merged and unmerged reads together to ensure all of the data is retained.
- What would be the benefit of merging the reads at all if they are just getting combined with the unmerged reads anyway before analysis (other than having a single input file for DIAMOND)?
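For reference, this is how I understand the concatenation step in that protocol, as a minimal sketch. The file names in the usage comment are hypothetical placeholders, not the actual output names of fastq-join, and the function assumes uncompressed FASTQ files that each end with a newline.

```python
import shutil
from pathlib import Path


def concatenate_fastq(inputs, output):
    """Append each input FASTQ file, in order, to a single output file."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                # Stream the bytes so large read files are not loaded into memory.
                shutil.copyfileobj(src, out)
    return Path(output)


# Hypothetical usage (placeholder file names):
# concatenate_fastq(
#     ["merged.fastq", "unmerged_R1.fastq", "unmerged_R2.fastq"],
#     "all_reads.fastq",
# )
```

The result is one file in which a merged pair contributes a single longer read, while an unmerged pair contributes its forward and reverse reads as two separate records.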