I have PacBio reads that need to be assembled. These reads have Illumina primers at both ends as well as in the middle. The problem is that the primer sequences vary, so standard trimming cannot remove all of the primers from the reads. My lab wants the best-quality assembly possible, so I might have to write a script to detect the primers in the middle of the reads. I am currently thinking of removing any sequence that is 80-100% similar to a primer sequence, but I am worried that this would also get rid of some informative genomic sequence.
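To make the idea concrete, here is a minimal sketch of the kind of script I have in mind. It is only an illustration under several assumptions: similarity is approximated by edit distance (for a 20-nt primer, 80% would be roughly 4 edits), both primer orientations are searched, and a read is split at each hit instead of being thrown away so the flanking sequence is kept. The function names and the `max_edits` and `min_len` parameters are hypothetical, not taken from any existing tool.

```python
# Sketch only: approximate primer search by semi-global edit distance.
# All names and thresholds below are illustrative assumptions.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_primer_hits(read, primer, max_edits):
    """Return (start, end, edits) for every read position where the full
    primer ends with at most max_edits substitutions/insertions/deletions.
    The match may start anywhere in the read (free start)."""
    m = len(primer)
    prev_cost = list(range(m + 1))   # column before any read base is consumed
    prev_start = [0] * (m + 1)
    hits = []
    for j, base in enumerate(read, start=1):
        cost = [0] * (m + 1)         # empty primer prefix costs nothing
        start = [j] * (m + 1)        # ...and would start right after base j
        for i in range(1, m + 1):
            sub = prev_cost[i - 1] + (primer[i - 1] != base)  # match/mismatch
            extra = prev_cost[i] + 1      # extra base in the read
            missing = cost[i - 1] + 1     # primer base missing from the read
            best = min(sub, extra, missing)
            cost[i] = best
            if best == sub:
                start[i] = prev_start[i - 1]
            elif best == extra:
                start[i] = prev_start[i]
            else:
                start[i] = start[i - 1]
        if cost[m] <= max_edits:
            hits.append((start[m], j, cost[m]))
        prev_cost, prev_start = cost, start
    return hits


def best_nonoverlapping(hits):
    """Group overlapping hits and keep the lowest-edit hit of each group."""
    clusters = []
    for start, end, edits in sorted(hits):
        if clusters and start < clusters[-1]["limit"]:
            cluster = clusters[-1]
            cluster["limit"] = max(cluster["limit"], end)
            if edits < cluster["best"][2]:
                cluster["best"] = (start, end, edits)
        else:
            clusters.append({"limit": end, "best": (start, end, edits)})
    return [c["best"] for c in clusters]


def split_at_primers(read, primers, max_edits=2, min_len=50):
    """Cut primer-like stretches out of a read and return the remaining
    segments that are at least min_len bases long."""
    hits = []
    for primer in primers:
        for query in (primer, revcomp(primer)):
            hits.extend(find_primer_hits(read, query, max_edits))
    segments, pos = [], 0
    for start, end, _ in best_nonoverlapping(hits):
        if start - pos >= min_len:
            segments.append(read[pos:start])
        pos = end
    if len(read) - pos >= min_len:
        segments.append(read[pos:])
    return segments
```

Each read would go through `split_at_primers(read_sequence, list_of_primers)`, and the returned segments would be written out as separate reads. Pure Python like this costs read length times primer length per primer, so it is probably too slow for a full PacBio run; it is meant to show the logic, and logging the hits before removing anything would let me check how many of them fall in real genomic sequence.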
How do you guys deal with such situations?
Thank you in advance!