Question: remove primer sequences from BAM file
8 months ago, J.F.Jiang wrote:

Hi all,

I am dealing with amplicon data, which was obtained from multiplex PCR.

Keeping the original primers can greatly increase mapping accuracy, since some amplicons are highly homologous, with only small differences within the primer region.

However, since these primers may introduce false SNPs when they fall within the insert regions of other amplicons, we want to remove the primers from the BAM file. [...] can clip the primer sequence; however, it is time-consuming when an amplicon is sequenced at more than 1000X.

GATK can soft-clip the sequence. However, it also clips sequence that is merely similar to the primers, in particular insert sequence that is not a real primer.

I am wondering if there is any tool that can quickly clip the real primers.

Best, Junfeng


An alternative solution would be to use the -L argument of GATK to restrict variant calling to certain regions (BED file) within the amplicons, excluding the primers.
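
As a rough illustration of the BED file mentioned here, the sketch below derives insert-only intervals from an amplicon design. The coordinates, primer lengths, and the names `AMPLICONS`, `insert_intervals`, and `to_bed` are all hypothetical and not taken from the thread; the design is assumed to be available as 0-based, half-open coordinates.

```python
# Hypothetical amplicon design, 0-based half-open coordinates:
# (chrom, amplicon_start, amplicon_end, forward_primer_len, reverse_primer_len)
AMPLICONS = [
    ("chr1", 1000, 1200, 20, 22),
    ("chr1", 5000, 5180, 21, 19),
]


def insert_intervals(amplicons):
    """Return (chrom, start, end) intervals with both primers trimmed off."""
    intervals = []
    for chrom, start, end, fwd_len, rev_len in amplicons:
        ins_start = start + fwd_len  # first base after the forward primer
        ins_end = end - rev_len      # first base of the reverse primer
        if ins_start >= ins_end:
            raise ValueError(
                f"primers cover the whole amplicon at {chrom}:{start}-{end}"
            )
        intervals.append((chrom, ins_start, ins_end))
    return intervals


def to_bed(intervals):
    """Format intervals as tab-separated BED lines."""
    return "".join(f"{c}\t{s}\t{e}\n" for c, s, e in intervals)


if __name__ == "__main__":
    print(to_bed(insert_intervals(AMPLICONS)), end="")
```

The output could be saved to a file and passed to GATK through -L, for instance `-L inserts.bed` (the file name is an assumption). This only helps when no primer of one amplicon lies inside the insert of another.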

written 8 months ago by WouterDeCoster


Take the scenario above as an example: suppose position * is always T, but is C in the primer region of the second amplicon. Variant calling will then report a T/C SNP, although the C allele is just a PCR error or a primer synthesis error. If we remove the primer sequences of the amplicons, the final call will be T/T, as expected.
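
The effect can be reproduced with a toy pileup. This is only a sketch under made-up assumptions: equal depth from both amplicons, a naive allele-fraction threshold instead of a real variant caller, and hypothetical names (`simulate_pileup`, `allele_counts`, `naive_genotype`).

```python
from collections import Counter


def simulate_pileup(depth_per_amplicon=500):
    """Bases observed at position *: (amplicon, base, base_comes_from_primer)."""
    pileup = []
    # Amplicon 1 covers the position with its insert: true genomic base T.
    pileup += [("amp1", "T", False)] * depth_per_amplicon
    # Amplicon 2 covers the position with its primer, which carries a C.
    pileup += [("amp2", "C", True)] * depth_per_amplicon
    return pileup


def allele_counts(pileup, drop_primer_bases):
    """Count bases, optionally ignoring those that come from a primer."""
    return Counter(
        base
        for _, base, from_primer in pileup
        if not (drop_primer_bases and from_primer)
    )


def naive_genotype(counts, ref="T", min_fraction=0.2):
    """Toy diploid call: keep alleles above min_fraction, reference first."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no coverage at this position")
    alleles = [b for b, n in counts.items() if n / total >= min_fraction]
    alleles = sorted(alleles, key=lambda b: (b != ref, b))[:2]
    if len(alleles) == 1:
        alleles = alleles * 2  # homozygous call
    return "/".join(alleles)


if __name__ == "__main__":
    pileup = simulate_pileup()
    print("with primers:   ", naive_genotype(allele_counts(pileup, False)))
    print("primers removed:", naive_genotype(allele_counts(pileup, True)))
```

With primer bases kept, the toy call is T/C; once they are dropped, it becomes T/T.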

modified 8 months ago by WouterDeCoster • written 8 months ago by J.F.Jiang

Oh, so you mean you have overlapping amplicons? Right... then my suggestion doesn't work.

written 8 months ago by WouterDeCoster

Yes, the -L option can handle non-overlapping regions.

written 8 months ago by J.F.Jiang
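
For overlapping amplicons, one possible approach is to decide by alignment position rather than by sequence similarity: assign each read to the amplicon whose primer it starts (or ends) on, then clip only that amplicon's own primers. The sketch below is not an existing tool. It assumes ungapped alignments given as 0-based, half-open reference coordinates, and the `Amplicon` layout, the `tol` tolerance, and both function names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Amplicon(NamedTuple):
    name: str
    start: int      # first base of the forward primer
    fwd_end: int    # end of the forward primer (exclusive)
    rev_start: int  # first base of the reverse primer
    end: int        # end of the reverse primer (exclusive)


def assign_amplicon(read_start: int, read_end: int, is_reverse: bool,
                    amplicons: List[Amplicon], tol: int = 2) -> Optional[Amplicon]:
    """Pick the amplicon whose outer primer boundary the read is anchored on."""
    for amp in amplicons:
        if is_reverse:
            anchored = abs(read_end - amp.end) <= tol
        else:
            anchored = abs(read_start - amp.start) <= tol
        if anchored:
            return amp
    return None  # read does not start or end on any known primer


def primer_clip(read_start: int, read_end: int, amp: Amplicon) -> Tuple[int, int]:
    """Bases to soft-clip at the left and right end of the read."""
    left = max(0, min(read_end, amp.fwd_end) - read_start)
    right = max(0, read_end - max(read_start, amp.rev_start))
    return left, right


# amp2's forward primer (150-170) lies inside amp1's insert (120-180).
AMPS = [
    Amplicon("amp1", 100, 120, 180, 200),
    Amplicon("amp2", 150, 170, 230, 250),
]

if __name__ == "__main__":
    for start, end, rev in [(100, 200, False), (150, 250, False), (125, 200, True)]:
        amp = assign_amplicon(start, end, rev, AMPS)
        print(start, end, amp.name, primer_clip(start, end, amp))
```

A read from amp1 keeps its bases at 150-170 because they are real insert sequence, while a read from amp2 loses the same positions as primer. Applying the clip counts to actual BAM records (CIGAR edits, indels, mate information) is left out here.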