Hi, I am a PhD student who is still pretty new to Bioinformatics.
I have a question about merging paired-end reads for analysis with MEGAN.
I have paired-end shotgun data (150 bp) with a significant amount of adapter read-through, which was pointed out to me here. In some instances, after adapter removal, up to ~70% of my reads are reported as "Forward Only Surviving". I am unsure about the best way to merge the overlapping reads, and I can see two options:
- Am I better off using a tool like `bbmerge.sh` (BBMap) on both sets of "paired" reads and then simply using `cat` to append the forward-only surviving reads to the end?
- Or should I re-run `trimmomatic` with the `keepBothReads` parameter set to true and perform the merge on the output files?
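To make the first option concrete, here is a rough sketch of what I have in mind. All file names are placeholders for my own Trimmomatic outputs, the two function names are just made up for illustration, and the `in1=`/`in2=`/`out=`/`outu1=`/`outu2=` arguments are how I understand `bbmerge.sh` to be called, so please correct me if that is wrong. Whether the unmerged R2 reads should also go into the final file is one of the things I am unsure about, so I have left them out here.

```shell
# Sketch only: function and file names are hypothetical placeholders.

# Merge overlapping pairs with BBMerge.
# $1/$2: paired R1/R2 input, $3: merged reads out,
# $4/$5: R1/R2 of pairs that could not be merged.
merge_pairs() {
    bbmerge.sh in1="$1" in2="$2" out="$3" outu1="$4" outu2="$5"
}

# Concatenate any number of FASTQ files into one single-end file.
# $1: output file, remaining arguments: input files in the order given.
combine_single_end() {
    out="$1"
    shift
    cat "$@" > "$out"
}

# Intended usage (commented out so nothing runs without real data):
# merge_pairs R1_paired.fq R2_paired.fq merged.fq unmerged_R1.fq unmerged_R2.fq
# combine_single_end for_diamond.fq merged.fq unmerged_R1.fq R1_forward_only.fq
```

The combined file would then be treated as single-end input for DIAMOND and MEGAN.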
I'm very unsure whether there is a best practice for this process. I have a traditional microbiology background, and whilst I'm enjoying dipping my toe into bioinformatics, it seems to be a bit of a minefield.
I guess my alternative approach is simply to take my R1 reads and treat them as single-end data. I don't know whether merging R1/R2 would be of much use for taxonomic/functional analysis, or whether it is more suited to assembly. At the moment I'm running my R1 data through `DIAMOND` against the NCBI-nr database in the background whilst I consider this.
Thanks