I'm analyzing paired-end TruSeq Custom Amplicon panel data, so I need to remove the primer sequences (ULSO and DLSO probes) before variant calling. As far as I know, the preferred way to do this is soft-clipping. Which tools would you recommend for this purpose, and why?
I have found only two such tools, and they take different approaches: GATK's ClipReads and BamClipper. GATK's ClipReads can soft-clip a sequence by exact match, while BamClipper uses the genomic coordinates of the primer sequences. My guess is that the second approach is more accurate, because it is not affected by base-calling errors that could introduce wrong bases into the ULSO/DLSO sequences. Correct me if I'm wrong.
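To make the coordinate-based idea concrete, here is a minimal sketch of how soft-clipping by position can be expressed on a CIGAR string. This is not BamClipper's actual implementation; the function names `parse_cigar` and `soft_clip_left` are hypothetical, positions are assumed to be 0-based, `clip_end` is assumed to be the first reference position after the primer, and hard clips (H) and padding (P) are deliberately not handled. Only the left end of the alignment is clipped; the right end would be handled symmetrically.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Turn a CIGAR string such as '10M2I40M' into [('M', 10), ('I', 2), ('M', 40)]."""
    return [(op, int(n)) for n, op in CIGAR_RE.findall(cigar)]


def soft_clip_left(pos, cigar, clip_end):
    """Soft-clip every aligned base that lies before reference position clip_end.

    pos      -- 0-based leftmost aligned reference position of the read
    cigar    -- CIGAR string (assumption: no H or P operations)
    clip_end -- 0-based reference position where the primer ends (exclusive)
    Returns (new_pos, new_cigar).
    """
    ops = parse_cigar(cigar)
    if any(op in "HP" for op, _ in ops):
        raise ValueError("hard clips and padding are not handled in this sketch")

    clipped = 0  # number of read bases that end up soft-clipped
    ref = pos    # current reference position while walking the CIGAR
    i = 0
    while i < len(ops):
        op, n = ops[i]
        if op == "S":
            clipped += n                   # already soft-clipped bases stay clipped
        elif op in "M=X":
            if ref >= clip_end:
                break                      # first aligned base outside the primer
            take = min(n, clip_end - ref)  # part of this block inside the primer
            clipped += take
            ref += take
            if take < n:
                ops[i] = (op, n - take)    # keep the part outside the primer
                break
        elif op == "I":
            clipped += n                   # insertion inside or adjacent to the primer
        elif op in "DN":
            ref += n                       # deletion consumes reference only
        i += 1

    rest = "".join(f"{n}{op}" for op, n in ops[i:])
    prefix = f"{clipped}S" if clipped else ""
    return ref, prefix + rest


print(soft_clip_left(100, "50M", 120))       # (120, '20S30M')
print(soft_clip_left(100, "10M2I40M", 115))  # (115, '17S35M')
```

Because the clip is decided purely from positions, a sequencing error inside the primer does not change the result, whereas an exact-match search for the primer sequence would miss that read.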
What do you think about those approaches for handling primer sequences when analyzing targeted panels?
So far, I've tried soft-clipping the primer sequences with BamClipper, but it somehow introduced errors into the BAM files, and I'm still waiting for an answer from the developer on GitHub.
Thanks for your time!
UPDATE for those interested in the topic: I've found more tools. These are cutPrimers (for now it can be run on FASTQ files only), Katana (which uses the same principle as BamClipper), and PcrClipReads.
I am facing similar issues. Could you please share your latest updates and how you resolved the problem? Also, would it be possible to explain how to get a BEDPE file from the Probes section of the manifest file? I am not sure whether all the fields mentioned in the bedtools documentation ( http://bedtools.readthedocs.io/en/latest/content/general-usage.html#bedpe-format ) are present there.
I would really appreciate your input.
In case this is useful: I am using bwa with default options for alignment, samtools and scripts for indel realignment, and VarScan for variant calling.
Thanks and Regards, Pramod
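On the BEDPE question, a minimal sketch of building one BEDPE line per probe pair, assuming you can already extract the chromosome and the ULSO and DLSO coordinates for each amplicon. The function `probe_to_bedpe` and its parameters are hypothetical and do not correspond to actual manifest column names; the input coordinates are assumed to be 1-based and inclusive, and the left-most interval is written first. BEDPE itself uses 0-based starts and exclusive ends, and its ten standard columns are chrom1, start1, end1, chrom2, start2, end2, name, score, strand1, strand2, with "." accepted for unknown values.

```python
def probe_to_bedpe(chrom, ulso_start, ulso_end, dlso_start, dlso_end,
                   name=".", strand1=".", strand2="."):
    """Format one probe pair as a 10-column BEDPE line.

    Assumption: input coordinates are 1-based and inclusive, so each start
    is shifted by -1 to obtain the 0-based, half-open BEDPE interval.
    """
    first = (ulso_start - 1, ulso_end)
    second = (dlso_start - 1, dlso_end)
    # Write the left-most interval first (a choice made for this sketch).
    if second[0] < first[0]:
        first, second = second, first
    fields = [chrom, first[0], first[1],
              chrom, second[0], second[1],
              name, ".", strand1, strand2]  # score left as "." (unknown)
    return "\t".join(str(f) for f in fields)


# Hypothetical amplicon with 25 bp probes on chr1.
print(probe_to_bedpe("chr1", 1001, 1025, 1176, 1200, name="amp1"))
```

Whether a given clipping tool needs all ten columns or only the first six is something to check in that tool's own documentation.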