Does Coverage Information Refer To The Diploid Or Haploid Genome Size?
10.9 years ago

I would like to know whether a coverage of 0.05 means that 5% of the haploid genome is covered by sequencing. I would say so, but what about dioecious species? Or is the figure always tied to the diploid genome?

Thank you very much.

(I am doing a comparative study and I would like to extract such proportion of the sequencing reads so that it would match the same coverage for both species.)

coverage sequencing • 3.9k views
10.9 years ago
Irsan ★ 7.8k

I find your question confusing, because the reference sequences used in genome re-sequencing analyses are represented as haploid genomes (there is only one sequence of bases for each chromosome). Furthermore, coverage can only be calculated after mapping to the genome, so you can only "extract such proportion of the sequencing reads so that it would match the same coverage for both species" after mapping. Although I am still unsure what you are asking, I hope the following is helpful.

Coverage numbers are independent of the ploidy of the genome. A coverage of 0.05 means that, on average, a base in the reference genome used (usually represented as haploid) is covered by 0.05 reads (roughly every 20th base in the genome is covered once). If you want two alignments to have the same coverage on their respective reference genomes, randomly thin out the read pairs in the alignment results you have (.bam files?) until you reach the desired coverage.
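The random thinning described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `downsample_pairs`, the toy read names, and the 1.8x and 0.5x values are hypothetical, and the read pairs are held in an in-memory list rather than read from a .bam file.

```python
import random

def downsample_pairs(read_pairs, current_coverage, target_coverage, seed=0):
    """Randomly thin read pairs so the expected coverage drops to the target."""
    if not 0 < target_coverage <= current_coverage:
        raise ValueError("target must be positive and not exceed current coverage")
    keep_fraction = target_coverage / current_coverage
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    # One random draw per pair, so both mates are kept or dropped together.
    return [pair for pair in read_pairs if rng.random() < keep_fraction]

# Hypothetical toy input: 10,000 read pairs, thinned from 1.8x to 0.5x.
pairs = [(f"read{i}/1", f"read{i}/2") for i in range(10_000)]
kept = downsample_pairs(pairs, current_coverage=1.8, target_coverage=0.5)
print(len(kept) / len(pairs))  # close to 0.5 / 1.8, but not exactly equal
```

Because each pair is kept independently, the resulting coverage only approximates the target; the approximation gets tighter as the number of reads grows.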


Well, let's say the estimated haploid genome size of my first species is 2,000 Mbp and that of the second is 1,000 Mbp. I have sequencing reads for both genomes and I would like to do a comparative study. I estimate the number of Mbp I have sequenced as read_length × number_of_reads. If I get, say, 1,000 Mbp for the first genome this way, I would say that my coverage is 0.5x (1,000 Mbp / 2,000 Mbp). Isn't that right? My assumption is that I cannot compare two species when the data for the first have 0.5x coverage and the data for the second have 1.8x coverage, which is why I want to extract just a proportion of the reads. But I am confused about haploid versus diploid genome size, especially in dioecious species, where females and males have different genome sizes. I don't do any mapping, so I don't have any BAM files to analyze. I hope this clarifies my question a little :-)
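The mapping-free estimate in this comment can be written out as a short Python sketch. The genome sizes and the 0.5x and 1.8x coverages come from the comment; the 100 bp read length and the read counts are assumptions chosen only to reproduce those numbers, and the function names are hypothetical.

```python
def coverage(read_length, number_of_reads, haploid_genome_size):
    """Estimated coverage: sequenced bases divided by haploid genome size."""
    return read_length * number_of_reads / haploid_genome_size

def reads_for_target(target_coverage, read_length, haploid_genome_size):
    """Number of reads that would give the target coverage."""
    return round(target_coverage * haploid_genome_size / read_length)

READ_LENGTH = 100  # assumed read length in bp, not stated in the thread

# Species A: 2,000 Mbp haploid genome, 1,000 Mbp sequenced (10 million reads).
cov_a = coverage(READ_LENGTH, 10_000_000, 2_000_000_000)   # 0.5

# Species B: 1,000 Mbp haploid genome, 1,800 Mbp sequenced (18 million reads).
cov_b = coverage(READ_LENGTH, 18_000_000, 1_000_000_000)   # 1.8

# Match both species to the lower coverage by subsampling species B.
target = min(cov_a, cov_b)
keep_b = reads_for_target(target, READ_LENGTH, 1_000_000_000)  # 5,000,000 reads
print(cov_a, cov_b, keep_b)
```

The result depends directly on which genome-size estimate is passed in as `haploid_genome_size`, so the same estimate type should be used for both species.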
