Question: [samtools depth] Calculate percent of the genome covered
1
gravatar for madkitty
4.9 years ago by
madkitty580
Canada
madkitty580 wrote:

I received a BAM of one chromosome and need to determine the percent of the genome covered. Our genome size is 16,600 bp and we did paired-end sequencing each reads being 150 bp long. To my knowledge it should be calculated as follow:

 

  • Genome size : 16,670 bp
  • No of base covered: 16,639 bp
  • Percent of the genome covered = [(No. of base covered) / (Genome size) ] *100 = (16,639/16,670) * 100 = 99.81%

I did the following with samtools depth: 

  1. samtools depth -q0 -Q0 my.bam > qry.depth
  2. wc -l qry.depth
  3. 16639 qry.depth

Though I don't fully understand the output of samtools depth. To my knowledge the output should be as follow:

  • Column 1: ID
  • Column 2: Genomic position
  • Column 3: No of reads covering that position
  • Number of line in qry.depth represents the number of bases covered in the reference

Here is a sample of my qry.depth file: 

chrM 1 3
chrM 2 6
chrM 3 16
chrM 4 32
chrM 5 34
chrM 7783 0
chrM 7915 0

 

The line count of qry.depth is 16,639 but I still have 2 positions with no reads covering them,  then what is the percent of the genome covered?? Maybe I misunderstood something in the output file format..

coverage samtools bam genome • 10k views
ADD COMMENTlink modified 4.9 years ago by Devon Ryan89k • written 4.9 years ago by madkitty580
0
gravatar for Devon Ryan
4.9 years ago by
Devon Ryan89k
Freiburg, Germany
Devon Ryan89k wrote:

It'll output a 0 if it would have otherwise produced a different number, had you not specified -q. Yes, it would be nice if this were documented.

ADD COMMENTlink written 4.9 years ago by Devon Ryan89k

Hi Devon,

Sorry. I cannot understand completely. I also encounter the question:

The reference length: 134962

Line count of *.depth is : 134184

No. of positions have only 0: 26, as below:

Triodia_melvillei    67472    0    0    0

Triodia_melvillei    67473    0    0    0

Triodia_melvillei    67474    0    0    0

Triodia_melvillei    67475    0    0    0

Triodia_melvillei    67476    0    0    0

Triodia_melvillei    110232    0    0    0

Triodia_melvillei    110233    0    0    0

Triodia_melvillei    110234    0    0    0

Triodia_melvillei    110235    0    0    0

Triodia_melvillei    110236    0    0    0

Triodia_melvillei    110237    0    0    0

Triodia_melvillei    110238    0    0    0

Triodia_melvillei    110239    0    0    0

My question is: why there are 0? If there are reads to support these positions, why samtools output them?

Thanks! 

 

ADD REPLYlink written 3.2 years ago by 216728410

Normally "-a -a" will output absolutely every position, or at least it should. If there are reads supporting coverage of a positions then they should get counted unless they're above the maximus depth (-d) or being filtered out (play around with -q and -Q).

ADD REPLYlink written 3.2 years ago by Devon Ryan89k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 976 users visited in the last hour