Question: Samtools Mpileup Snp Calling Improperly Paired Reads
0
gravatar for rob234king
7.0 years ago by
rob234king600
UK/Harpenden/Rothamsted Research
rob234king600 wrote:

I'm doing samtools mpileup snp calling but samtools is not including reads that are improperly paired or mates unmapped when I examine the snps using the bam file. Due to the nature of the reference I have I want to turn this behavior off. Turning quality scores filters for base and reads do not have an effect. Any ideas how to do this, or a snp caller that can set to include reads that are improperly paired to the snp calling process?

samtools mpileup -I -q 20 -Q 20 -DSuf Triticum_aestivum.IWGSP1.21.dna.genome.fa A6_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bam C6_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bam D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bam| bcftools view -bvcg - > A6C6D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bcf

bcftools view A6C6D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bcf | ./vcfutils.pl varFilter -D100 > A6C6D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.vcf
samtools mpileup • 6.0k views
ADD COMMENTlink modified 7.0 years ago by Andreas2.5k • written 7.0 years ago by rob234king600
3
gravatar for Andreas
7.0 years ago by
Andreas2.5k
Singapore
Andreas2.5k wrote:

Hi,

The sam flag filter description that ashutoshmits listed are correct. However, samtools mpileup will by default ignore what it calls "anomalous read pairs" or "orphans". Those are paired reads where the mate is not paired. This behaviour can be toggled with

       -A           count anomalous read pairs

If you're interested in digging deeper: the corresponding source code hides in bam_plcmd.c:

       else if ((ma->conf->flag&MPLP_NO_ORPHAN) && (b->core.flag&1) && !(b->core.flag&2)) skip = 1;

i.e. if orphans are to be ignored (default, if -A is not given) then reads will be ignored if they are paired (b->core.flag&1) but are not a proper pair !(b->core.flag&2).

Andreas

PS Edit: -A is all you need. No need to modify the source code.

ADD COMMENTlink modified 7.0 years ago • written 7.0 years ago by Andreas2.5k

Thanks Andreas for such a great answer.

ADD REPLYlink written 7.0 years ago by Ashutosh Pandey12k

So just to check I understand correctly if I want it to include anomalous read pairs I use the -A option and delete "&& (b->core.flag&1) && !(b->core.flag&2)" to remove the skipping of improperly paired reads?

ADD REPLYlink written 7.0 years ago by rob234king600

Yes you are right. You may have to do "make samtools" again in order for that change to take place.

ADD REPLYlink written 7.0 years ago by Ashutosh Pandey12k

Thanks a lot, should be a big help.

ADD REPLYlink written 7.0 years ago by rob234king600

No sorry, that's not what I wanted to say. Just specifying -A is enough to include orphan reads!

ADD REPLYlink modified 7.0 years ago • written 7.0 years ago by Andreas2.5k

It worked by doing what I said but if same can be done just with the -A option, would be more reproducible for others, I'll check it thanks.

ADD REPLYlink written 7.0 years ago by rob234king600

My bad. I thought you asked to change the code. But the code was for the purpose of explaining the parameter.

ADD REPLYlink written 7.0 years ago by Ashutosh Pandey12k
1
gravatar for Ashutosh Pandey
7.0 years ago by
Philadelphia
Ashutosh Pandey12k wrote:

I am not sure but it may be the case that your mate-pair reads that are improperly paired or mate-pair reads with only one read mapped are being discarded due to one of the following reasons.

The default settings for mpileup is to filter reads with bitwise flag 0X704.

So for pileup generation the following reads will not considered at all from the bam files:

1) 0x0400 (aka 1024 or "d") duplicate

2) 0x0200 (aka 512 or "f") failed QC

3) 0x0100 (aka 256 or "s") non primary alignment

4) 0x0004 (aka 4 or "u") unmapped

I think in your case, the improperly paired reads and mate-pairs with only one read mapped may belong to non-primary alignment or duplicates. I am just guessing. Unfortunately, you can't tell mpileup which reads to choose from command line but you can modify the code as explained in this post (http://seqanswers.com/forums/showthread.php?t=22752).

ADD COMMENTlink written 7.0 years ago by Ashutosh Pandey12k

Thanks very much ill give it a try.

ADD REPLYlink written 7.0 years ago by rob234king600
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1453 users visited in the last hour
_