Does build 37 of the human reference genome contain 2.85Gbp?
9.2 years ago

When converting from base count to coverage, this document divides by 2.85 billion nucleotides:

POP  BASE COUNT     COVERAGE
ACB  3182037349066  1116.50433300561
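
To make the arithmetic behind that row explicit, here is a minimal Python sketch. The variable names are mine, and the only inputs are the base count from the table and the 2.85 billion divisor mentioned above:

```python
# Reproduce the ACB coverage figure from the table.
# Both numbers come from the post; the variable names are illustrative.
base_count = 3_182_037_349_066    # total sequenced bases for ACB
genome_size = 2_850_000_000       # divisor used in the document (2.85 Gbp)

coverage = base_count / genome_size
print(f"{coverage:.8f}")
```

The result agrees with the COVERAGE column, so the table does appear to use 2.85 Gbp as the genome size.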

Does build 37 of the human reference genome still only contain 2.85Gbp? It's the same figure as in build 35. In my reference sequence I count 3.1Gbp across the autosomes and the sex chromosomes. Have I missed something obvious here?


Are you including the big stretches of Ns in your count of 3.1 Gbp? That could account for a difference of ~250 Mbp.


For reference, if I ignore Ns, I count ~2.87 Gbp. That includes some unplaced scaffolds that might not be present in the build that the 1000 Genomes Project uses.
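
A count like this can be sketched in a few lines of Python. This is not necessarily how the numbers above were obtained; `count_bases` is a hypothetical helper, and the toy FASTA below stands in for the real reference file:

```python
import io

def count_bases(fasta_handle):
    """Return (total_bases, n_bases) summed over all records in a FASTA stream."""
    total = 0
    n_bases = 0
    for line in fasta_handle:
        line = line.strip()
        # Skip blank lines and header lines such as ">chr1".
        if not line or line.startswith(">"):
            continue
        total += len(line)
        # Count both hard-masked (N) and lowercase (n) ambiguous bases.
        n_bases += line.count("N") + line.count("n")
    return total, n_bases

# Toy input for illustration; for a real reference, pass open("path/to/reference.fa").
toy = io.StringIO(">chrToy\nACGTNNNN\nnnACGT\n")
total, n_bases = count_bases(toy)
non_n = total - n_bases
print(total, n_bases, non_n)
```

On a real reference, `total` would correspond to the figure that includes Ns and `non_n` to the figure that ignores them. Restricting the loop to selected sequence names would exclude unplaced scaffolds.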

9.2 years ago
toni ★ 2.2k

I computed this recently on GRCh37:

[Figure: Venn diagram of human whole-genome mappability]

(Mappability was computed with the GEM program for reads of length 100, with 5 mismatches allowed.)

EDIT: So dividing by 2.85 Gb gives a more realistic estimate of your mean coverage, since Ns can never be covered by definition.


Awesome Venn diagram and explanation of the green mappable bases. Thanks!
