Samtools mpileup of soft-clipped alignment behaves non-deterministically?
8.4 years ago
iovercast ▴ 10

I have a sorted, indexed BAM of soft-clipped reads. I'm running samtools mpileup on it, and the pileup file it generates looks malformed. For example, I see a lot of regions like this, where the number of aligned reads at a position drops without a corresponding read-termination character ($):

1       3008463 T       7       .,,.... AGB8A8>
1       3008464 G       6       .,,.$..$        3GB8<8

Oops! I also see plenty of reads that have no start token (^). Granted, I'm stress testing it by feeding it really nasty, messy soft-clipped data, so garbage in/garbage out would not surprise me. But if mpileup does not "behave" with soft-clipped data, I'd like to document this more prominently somewhere, perhaps in the docs for mpileup.

Alright, anybody with any thoughts on this?

-isaac

8.4 years ago

The start (^) and end ($) symbols will not appear if the first or last base of a read is filtered out. Have you tried mpileup -Q0 to show all the bases? By default, bases with base quality < 13 are not displayed.
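
To check whether a depth change is accounted for by the start and end markers, something like the rough Python sketch below may help. It is not part of samtools; the function names `count_pileup_tokens` and `unexplained_depth_change` are made up for illustration, and it assumes the two pileup lines are at consecutive positions with the usual column order (chrom, pos, ref, depth, bases, quals).

```python
import re

def count_pileup_tokens(bases):
    """Count read bases, read-start (^) and read-end ($) markers
    in the bases column of one pileup line."""
    counts = {"bases": 0, "starts": 0, "ends": 0}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # '^' is followed by one character encoding the mapping quality
            counts["starts"] += 1
            i += 2
            continue
        if c == "$":
            counts["ends"] += 1
            i += 1
            continue
        if c in "+-":
            # Indel: +<len><seq> or -<len><seq>; skip the whole token
            m = re.match(r"\d+", bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        # Anything else ('.', ',', a letter, '*', ...) is one read's base
        counts["bases"] += 1
        i += 1
    return counts

def unexplained_depth_change(prev_line, next_line):
    """Observed depth at the next position minus the depth expected
    from the '$' markers on prev_line and the '^' markers on next_line."""
    prev = prev_line.split()
    nxt = next_line.split()
    ends = count_pileup_tokens(prev[4])["ends"]
    starts = count_pileup_tokens(nxt[4])["starts"]
    expected = int(prev[3]) - ends + starts
    return int(nxt[3]) - expected

# The two lines from the question
a = "1 3008463 T 7 .,,.... AGB8A8>"
b = "1 3008464 G 6 .,,.$..$ 3GB8<8"
print(unexplained_depth_change(a, b))  # -1
```

A nonzero result means the depth changed in a way the markers do not explain, which is what you would expect when a base is hidden by the quality filter. Rerunning with -Q0 and checking that the value goes to zero would be one way to confirm that.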


Nailed it. Thank you sir.
