Rsamtools mpileup discrepancy
0
0
Entering edit mode
4.3 years ago
Noushin N ▴ 600

Hi everyone,

I was wondering if you can share any tips to mitigate the following issue. I am interested in getting the count of reads carrying the reference and alternate reads at a number of genomic positions, which is a straightforward mpileup question. However, I have noticed that when I run the mpileup command from the command line I get a different answer compared to Rsamtools pileup function. After some drilldown, the discrepancy appears to be due to the following two parameters:

1) The command line samtools discards orphan reads (anomalous read pairs) by default. Rsamtools pileup does not seem to do so.

2) The command line samtools performs read pair overlap detection to avoid double counting read pairs. It looks like Rsamtools does not.

My question is whether there is a way to make Rsamtools behavior match the command line samtools.

Cheers!

samtools R • 1.2k views
ADD COMMENT
0
Entering edit mode

This is probably not what you want but wouldn't it be the easiest to simply run normal mpileup, be it from inside R or command line and then load this into R?? It can be difficult to make sure two tools perform identical covering all edge cases so I wonder if it is worth the effort.

ADD REPLY
0
Entering edit mode

Thank you for your input. I have considered this possibility as well. However, given that Rsamtools is supposedly "an interface to the samtools" , I wonder why the default behavior is different between the two and how we can make them match.

ADD REPLY

Login before adding your answer.

Traffic: 1436 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6