Question: BAM processing before variant calling
user230613 wrote:

Hi all,

I have a general question: is it recommended to process a BAM file before variant calling? I'm going to use samtools for the calling, and I'd like to know whether I should filter the BAM file first. For example: remove non-primary (secondary) alignments (flag 256), remove supplementary alignments (flag 2048), keep only reads mapped in a proper pair (flag 2), or remove reads with quality below 30...
Are these steps recommended before variant calling? I've read that PCR duplicates must be removed before calling, but I'm curious whether I should apply these other "filters".
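
For readers unfamiliar with the flag numbers above, here is a minimal Python sketch of how the SAM FLAG field is decoded. The bit values come from the SAM specification; the names `SAM_FLAGS` and `decode_flag` are made up for this illustration and are not part of samtools.

```python
# Bit values of the SAM FLAG field, as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "PAIRED",
    0x2: "PROPER_PAIR",      # 2: read mapped in a proper pair
    0x4: "UNMAP",
    0x8: "MUNMAP",
    0x10: "REVERSE",
    0x20: "MREVERSE",
    0x40: "READ1",
    0x80: "READ2",
    0x100: "SECONDARY",      # 256: not primary alignment
    0x200: "QCFAIL",
    0x400: "DUP",            # 1024: PCR or optical duplicate
    0x800: "SUPPLEMENTARY",  # 2048: supplementary alignment
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(99))    # ['PAIRED', 'PROPER_PAIR', 'MREVERSE', 'READ1']
print(decode_flag(2064))  # ['REVERSE', 'SUPPLEMENTARY']
```

A read can carry several of these bits at once, which is why filters are expressed as bit masks rather than exact flag values.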

Thanks in advance

variant calling • 2.2k views
modified 4.1 years ago by Fabio Marroni • written 4.1 years ago by user230613

FYI, samtools mpileup already ignores marked duplicates and secondary (not primary) alignments by default. The rest of what you mentioned can simply be passed in as options, so there is no need to preprocess the file explicitly.

written 4.1 years ago by Devon Ryan

Hi Devon, do you mean that mpileup will intrinsically discard (or not take into account) reads with the not-primary flag or the supplementary flag, and reads that are not in a proper pair? Or should I pass these conditions to mpileup as arguments?
Thanks
 

written 4.1 years ago by user230613

By default it ignores unmapped reads, secondary alignments, QC-failed reads, and duplicates, unless you explicitly instruct it otherwise (see the --ff option, which can also be extended to exclude supplementary alignments). There's no need to preprocess the file for any of this with samtools mpileup. For the MAPQ threshold, just specify -q 30. You can also tell samtools to use only properly paired reads by specifying --rf 2.
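
To make the interaction of these options concrete, here is a small Python sketch that mimics the read-level decision. It is not samtools code. It assumes the default exclusion mask given in the samtools mpileup documentation (UNMAP, SECONDARY, QCFAIL, DUP), which is worth checking against your installed version; `read_is_used` and its parameter names are hypothetical.

```python
# Assumed default for --ff, per the samtools mpileup documentation:
# UNMAP (0x4), SECONDARY (0x100), QCFAIL (0x200), DUP (0x400).
DEFAULT_EXCLUDE = 0x4 | 0x100 | 0x200 | 0x400  # 1796


def read_is_used(flag, mapq, min_mapq=0, exclude=DEFAULT_EXCLUDE, require=0):
    """Hypothetical model of the -q / --ff / --rf read filters."""
    if flag & exclude:                      # --ff: skip if any excluded bit is set
        return False
    if require and not (flag & require):    # --rf: skip if no required bit is set
        return False
    return mapq >= min_mapq                 # -q: minimum mapping quality


# Properly paired primary read, MAPQ 60, with -q 30 --rf 2
print(read_is_used(99, 60, min_mapq=30, require=0x2))                  # True
# Same read marked as duplicate
print(read_is_used(99 | 0x400, 60, min_mapq=30, require=0x2))          # False
# Supplementary alignment is kept under the assumed default mask ...
print(read_is_used(0x800, 60))                                         # True
# ... and dropped once 0x800 is added to the exclusion mask
print(read_is_used(0x800, 60, exclude=DEFAULT_EXCLUDE | 0x800))        # False
```

Under that assumed default, supplementary alignments (2048) are only skipped if the bit is added to the exclusion mask explicitly.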

written 4.1 years ago by Devon Ryan
Fabio Marroni wrote:

It depends on what kind of variants you want to call.

I will give you my opinion for SNPs and PAVs (presence/absence variants), assuming you have very high coverage. For CNVs I usually use the same rules as for SNPs.

SNPs:

- remove non-primary alignments: YES

- remove supplementary alignments: YES

- get only those reads mapped in proper pair: YES

- remove reads by quality: YES (but I think most SNP callers would do that anyway)

PAVs:

- remove non-primary alignments: YES

- remove supplementary alignments: NO

- get only those reads mapped in proper pair: NO

- remove reads by quality: YES (but I think most SNP callers would do that anyway)
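
The two lists above can be written down as flag masks. The following Python sketch is only an illustration of that mapping: the names `RULES`, `masks_for` and `keep_read` are hypothetical, and the MAPQ threshold of 30 is an assumption taken from the original question, not a value recommended in this answer.

```python
SECONDARY = 0x100      # 256: not primary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment
PROPER_PAIR = 0x2      # 2: read mapped in a proper pair

# The YES/NO choices from the two lists, per variant type.
RULES = {
    "SNP": {"drop_secondary": True, "drop_supplementary": True, "require_proper_pair": True},
    "PAV": {"drop_secondary": True, "drop_supplementary": False, "require_proper_pair": False},
}


def masks_for(variant_type, min_mapq=30):
    """Translate the rules into exclude/require masks (min_mapq=30 is an assumption)."""
    rule = RULES[variant_type]
    exclude = 0
    if rule["drop_secondary"]:
        exclude |= SECONDARY
    if rule["drop_supplementary"]:
        exclude |= SUPPLEMENTARY
    require = PROPER_PAIR if rule["require_proper_pair"] else 0
    return {"exclude": exclude, "require": require, "min_mapq": min_mapq}


def keep_read(flag, mapq, variant_type):
    """Return True if a read with this flag and MAPQ passes the rules."""
    m = masks_for(variant_type)
    if flag & m["exclude"]:
        return False
    if m["require"] and not (flag & m["require"]):
        return False
    return mapq >= m["min_mapq"]


print(masks_for("SNP"))  # {'exclude': 2304, 'require': 2, 'min_mapq': 30}
print(masks_for("PAV"))  # {'exclude': 256, 'require': 0, 'min_mapq': 30}
```

If you do want to filter the file itself rather than rely on the caller, these masks correspond to the -F (exclude), -f (require) and -q (minimum MAPQ) options of samtools view.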

Hope this helps

 

written 4.1 years ago by Fabio Marroni

Thank you, Fabio. Sorry, I didn't mention it in the question: I'm interested in calling SNPs and indels. Are the steps for indels the same as for SNPs?

written 4.1 years ago by user230613

For small indels (few bp), yes.

written 4.1 years ago by Fabio Marroni