Calculate coverage
4 months ago
daewowo ▴ 20

I realise this has been asked a lot, but I haven't found a tool that fits my simple needs.

I would like to calculate coverage % for an alignment file (SAM or BAM) against a reference FASTA genome. (NB there may be multiple aligned sequences in the SAM/BAM file.)

bedtools can calculate coverage, but it splits the result up, and I can't work out how to get a single % value for coverage other than by writing a script to parse the output. Ditto for samtools mpileup: this generates a huge amount of information, whereas I am after a single percentage.

One basic way to do this is to plot in IGV, extract the consensus, then take the number of N's divided by the reference genome length (this gives the uncovered fraction, so coverage is 100% minus that). There may be existing tools that can do this in one command.
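As a rough illustration of that last step, here is a minimal Python sketch. It assumes the consensus has already been exported as FASTA, that it has the same length as the reference, and that uncovered positions are written as N. The function name `covered_percent_from_consensus` is made up for this example.

```python
def covered_percent_from_consensus(fasta_text):
    """Return the percentage of consensus bases that are not N.

    Assumes uncovered positions are written as N/n and that the
    consensus spans the full reference length.
    """
    total = 0
    n_count = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA header lines
        total += len(line)
        n_count += line.upper().count("N")
    if total == 0:
        raise ValueError("no sequence found in FASTA text")
    return 100.0 * (total - n_count) / total
```

Note that genuine N's in the reads or reference would be counted as uncovered here, so this is only an approximation.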

coverage • 284 views
4 months ago

Unfortunately, nobody ever defines what they mean by 'coverage'. Can you please explain what you need, so that what you mean by coverage is clear to others beyond a reasonable doubt?

Do you mean the percent of the reference genome that has at least 1 aligned read? If 'yes', see my code here: Determine % of reference genome covered by aligned SAM/BAM
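For that definition (percent of reference positions with at least 1 aligned read), a minimal Python sketch is given below. It is not the linked code, just an illustration. It assumes per-position depths have already been written out by something like `samtools depth -aa` on a sorted BAM, i.e. tab-separated lines of reference name, position and depth, with zero-depth positions included. The function name `breadth_of_coverage` and the `min_depth` parameter are made up for this example.

```python
def breadth_of_coverage(depth_lines, min_depth=1):
    """Return the percentage of listed positions with depth >= min_depth.

    depth_lines: iterable of tab-separated lines "ref<TAB>pos<TAB>depth",
    assumed to include zero-depth positions for every reference sequence.
    """
    covered = 0
    total = 0
    for line in depth_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        _ref, _pos, depth = line.split("\t")[:3]
        total += 1
        if int(depth) >= min_depth:
            covered += 1
    if total == 0:
        raise ValueError("no depth records found")
    return 100.0 * covered / total
```

An open file handle can be passed directly as `depth_lines`. If zero-depth positions are missing from the input, the denominator will be too small and the percentage will be overestimated.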


