Hi folks,
I have some metagenomics data that was generated using an amplification protocol that integrated a known primer sequence into random sites throughout the genomes in the sample. I'd like to selectively remove the primer sequence and keep all flanking regions of the reads, i.e. delete the primer sequence without trimming the read (trimming is done separately).
An example:
Raw read:
sequence1ADAPTERsequence2
Desired result:
sequence1sequence2
I haven't figured out a way to do this with cutadapt (though I would love to hear about anything I've missed), and I know BBDuk can mask the adapter sequence with Ns, but that isn't exactly what I'm looking for either. Any suggestions welcome!
Thanks!
If you have input FASTQ files, you should be able to do this using cutadapt (reference).
This ADAPTER is in the middle of the read. OP wants to remove it and join the left and right pieces.
See if this helps: https://bioinf.shenwei.me/seqkit/usage/#amplicon
No standard scan/trim program is likely to do this. Generally, all sequence 3' of the initial adapter match is trimmed.