Question: Formula for extracting per base read depth/per base genome coverage from a SAM file?
16 months ago, Diploid Progenitor wrote:

Hello,

Is anyone aware of a way to extract per base read depth/per base genome coverage directly from a SAM file without making use of applications such as bedtools or SAMtools?

As in, which columns in the SAM file contain the relevant information required to calculate per base read depth, and how would they be processed?

I have tried calculating read depth from POS and TLEN, but encountered negative mapping intervals for some reads at the start of chromosomes (e.g. POS = 10, TLEN = -150).

Any suggestions?

Kind Regards

modified 16 months ago by Devon Ryan • written 16 months ago by Diploid Progenitor

Is there anything wrong with SAMtools? samtools depth does exactly what you need.

written 16 months ago by ATpoint

Hi, I haven't tried SAMtools, but have been using bedtools genomecov, which works for me. I am simply curious as to how these tools extract the relevant information from the BAM/SAM format, and if there is a straightforward formula that they use to do so.

written 16 months ago by Diploid Progenitor

16 months ago, Devon Ryan (Freiburg, Germany) wrote:

Sure, you can do that by reimplementing things like samtools depth or bedtools genomecov. The better question is why you would want to. Such programs are widely available and more heavily tested than your own code is likely to be.

BTW, if you need this for integration into another package, have a look at pysam, htslib, or htsjdk, depending on the programming language you're using. These all provide API access to the same functionality.
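For the curious, here is a minimal sketch of the core idea, not how samtools or bedtools are actually implemented. It assumes plain-text SAM input and uses only three columns: FLAG (column 2, to skip unmapped reads), POS (column 4, the 1-based leftmost reference position, regardless of strand) and CIGAR (column 6). TLEN is not used: it is the signed length of the whole template (both mates), and it is negative for the rightmost read of a pair, which is why POS + TLEN gives negative intervals. The function name `per_base_depth` and the example records are made up for illustration. The choice to leave deleted (D) and skipped (N) bases uncounted is an assumption, and the real tools apply further filters and options that this sketch ignores.

```python
import re
from collections import defaultdict

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
COVERING = set("M=X")  # consume reference and are counted as coverage
SKIPPING = set("DN")   # consume reference but are not counted in this sketch


def per_base_depth(sam_lines):
    """Return {(rname, pos): depth} with 1-based reference positions."""
    depth = defaultdict(int)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header line
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read or no alignment information
        ref_pos = pos
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op in COVERING:
                for p in range(ref_pos, ref_pos + length):
                    depth[(rname, p)] += 1
                ref_pos += length
            elif op in SKIPPING:
                ref_pos += length
            # I, S, H and P do not advance along the reference
    return dict(depth)


# Hypothetical records: a read pair (note the negative TLEN on the
# rightmost mate) and a single read containing a 1 bp deletion.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t99\tchr1\t10\t60\t5M\t=\t20\t13\tACGTA\tIIIII",
    "r1\t147\tchr1\t20\t60\t2S3M\t=\t10\t-13\tACGTA\tIIIII",
    "r2\t0\tchr1\t12\t60\t2M1D2M\t*\t0\t0\tACGT\tIIII",
]

depth = per_base_depth(example)
for (chrom, pos), d in sorted(depth.items()):
    print(chrom, pos, d, sep="\t")
```

A dictionary keyed by position is fine for a toy example. For a whole genome you would more likely use one integer array per chromosome, sized from the @SQ LN values in the header.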

written 16 months ago by Devon Ryan

Hello Devon,

Thank you for your response.

I am currently using bedtools and it is working just fine. I was just curious as to how this tool and SAMtools extract the relevant information from the BAM/SAM format, i.e. whether a straightforward formula is being applied here that could easily be emulated.

Regards,

written 16 months ago by Diploid Progenitor