Samtools mpileup of soft-clipped alignment behaves non-deterministically?
6.8 years ago
iovercast ▴ 10

I have a sorted, indexed bam of soft-clipped reads. I'm using samtools mpileup and the pileup file it's generating is malformed. For example I see a lot of regions like this, where the # of aligned reads at a position decrements without a corresponding read termination character ($):

    1 3008463 T 7 .,,.... AGB8A8>
    1 3008464 G 6 .,,.$..\$ 3GB8<8
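The kind of consistency check described here can be sketched in Python. This is only an illustration under assumptions: the function names `summarize_bases` and `unexplained_change` are hypothetical, the two positions are assumed to be adjacent, and the parser covers only the common symbols of the pileup base column (`^` plus a mapping-quality character, `$`, and `+N`/`-N` indels).

```python
import re


def summarize_bases(bases):
    """Return (n_bases, n_starts, n_ends) for one pileup base column."""
    n_bases = n_starts = n_ends = 0
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # '^' marks a read start and is followed by one mapping-quality char.
            n_starts += 1
            i += 2
        elif ch == "$":
            # '$' marks the end of the read whose base precedes it.
            n_ends += 1
            i += 1
        elif ch in "+-":
            # Indel: sign, decimal length, then that many sequence characters.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                raise ValueError("indel marker without a length: %r" % bases)
            i += 1 + len(m.group()) + int(m.group())
        else:
            # Anything else is counted as one displayed base.
            n_bases += 1
            i += 1
    return n_bases, n_starts, n_ends


def unexplained_change(prev, cur):
    """Depth change between two adjacent positions not explained by ^ or $.

    prev and cur are (depth, base_column) pairs. A negative result means
    reads disappeared without a '$'; a positive one means reads appeared
    without a '^'.
    """
    prev_depth, prev_bases = prev
    cur_depth, cur_bases = cur
    _, _, prev_ends = summarize_bases(prev_bases)
    _, cur_starts, _ = summarize_bases(cur_bases)
    expected = prev_depth - prev_ends + cur_starts
    return cur_depth - expected


if __name__ == "__main__":
    # Depth drops from 7 to 6 with no '$' on the first line.
    print(unexplained_change((7, ".,,...."), (6, ".,,.$..$")))
```

A nonzero result does not by itself mean the file is malformed; it only flags positions where markers and depth disagree.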


Oops! I also see plenty of reads that do not have start tokens. Granted, I'm stress testing it by feeding it really nasty, messy soft-clipped data, so I'm not surprised by garbage in/garbage out behavior. But if mpileup does not "behave" with soft-clipped data, I'd like to document this more prominently somewhere, perhaps in the docs for mpileup.

Alright, anybody with any thoughts on this?

-isaac

alignment • 2.7k views
6.8 years ago

The start (^) and end ($) symbols will not appear if the first (or last) shown base of a read is filtered out. Have you tried mpileup -Q0 to show all the bases? By default, bases with base quality < 13 are not displayed.
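To make the effect concrete, here is a toy Python model of the behavior described above. It is not samtools' implementation, just a sketch assuming that a base below the quality threshold is dropped together with any `^` or `$` marker attached to it. The function name `render_base_column`, the tuple layout, and the fixed mapping quality of 60 are all hypothetical.

```python
def render_base_column(entries, min_bq=13, mapq=60):
    """Build a pileup-like base column from per-read entries.

    Each entry is (symbol, base_quality, is_first_base, is_last_base).
    Toy assumption: an entry below min_bq is skipped entirely, so its
    start or end marker is never written.
    """
    parts = []
    for symbol, base_quality, is_first, is_last in entries:
        if base_quality < min_bq:
            continue  # the base and its markers both disappear
        if is_first:
            # '^' followed by the mapping quality encoded as chr(mapq + 33)
            parts.append("^" + chr(mapq + 33))
        parts.append(symbol)
        if is_last:
            parts.append("$")
    return "".join(parts)


if __name__ == "__main__":
    entries = [
        (".", 30, False, False),
        (",", 5, False, True),   # last base of its read, low quality
        (".", 20, True, False),  # first base of its read
    ]
    print(render_base_column(entries))            # default threshold of 13
    print(render_base_column(entries, min_bq=0))  # analogous to -Q0
```

With the default threshold the low-quality last base vanishes along with its `$`, which is the pattern reported in the question; with a threshold of 0 the marker is shown.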


Nailed it. Thank you sir.
