How to determine the percentage of coverage
19 months ago
Bioinfo ▴ 20

Hello Biostars, I hope you're doing well. I mapped my reads against the reference genome to get an idea of the coverage. What I want to know is the percentage of coverage, but I don't know how to obtain it with the different tools I know. In Qualimap I only found the mean coverage and the standard deviation of coverage, along with lines like these:

There is a 99.59% of reference with a coverageData >= 1X
There is a 99.56% of reference with a coverageData >= 2X
There is a 99.53% of reference with a coverageData >= 3X
There is a 99.49% of reference with a coverageData >= 4X
There is a 99.45% of reference with a coverageData >= 5X
There is a 99.42% of reference with a coverageData >= 6X
There is a 99.39% of reference with a coverageData >= 7X
There is a 99.37% of reference with a coverageData >= 8X
There is a 99.34% of reference with a coverageData >= 9X
There is a 99.31% of reference with a coverageData >= 10X
There is a 99.3% of reference with a coverageData >= 11X
There is a 99.27% of reference with a coverageData >= 12X


..

I don't know whether the percentage of coverage can be obtained from this information, or how. I also ran bamtools stats and obtained the following:

Total reads:       6158542
Forward strand:    3286294  (53.3616%)
Reverse strand:    2872248  (46.6384%)
Failed QC:         0    (0%)
Duplicates:        0    (0%)
'Proper-pairs':    4667096  (75.7825%)
Both pairs mapped: 5730384  (93.0477%)
Singletons:        18833    (0.305803%)


Again, I didn't get the percentage of coverage, and it was the same with samtools flagstat:

6158542 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
5749217 + 0 mapped (93.35% : N/A)
6158542 + 0 paired in sequencing
4667096 + 0 properly paired (75.78% : N/A)
5730384 + 0 with itself and mate mapped
18833 + 0 singletons (0.31% : N/A)
83794 + 0 with mate mapped to a different chr
12721 + 0 with mate mapped to a different chr (mapQ>=5)

and samtools idxstats:

NZ_ALWU01000001.1  581415  2031429 6451
NZ_ALWU01000002.1   43553   76489   678
NZ_ALWU01000003.1   29286   117672  342
NZ_ALWU01000004.1   37537   144448  440
NZ_ALWU01000005.1   217837  789901  2467
NZ_ALWU01000006.1   38235   103338  325
NZ_ALWU01000007.1   9944    45471   292
NZ_ALWU01000008.1   178611  651422  2190
NZ_ALWU01000009.1   17047   6352    42
NZ_ALWU01000010.1   510276  1782695 5606
*   0   0   390492


Can you please tell me how to get the percentage of coverage? Thank you very much.

sequencing alignment assembly • 952 views
0
Please use the search function and Google. There are dozens of threads related to this. You can be almost certain that at least one Biostars thread exists for every routine analysis question ;-)

0
Qualimap is also a tool which does that.

0
I used Qualimap and obtained the information mentioned above in genome_results.txt. Please tell me where I can find the percentage of coverage.

0
Thank you for your reply, I will check the link. I looked at the different tools and tried them, but I didn't obtain the information I was looking for; that's why I asked the question.

0
Can you please tell me how to get the percentage of coverage?

Percentage at what level? Base, chromosome or genome? You will need to calculate that yourself using information you have already found with tools you listed above.
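For the Qualimap lines posted in the question, the ">= 1X" figure is already the share of the reference covered by at least one read. Below is a minimal sketch, assuming the report lines look exactly like the ones pasted above; the function name `breadth_by_threshold` is made up for illustration and is not part of Qualimap.

```python
import re

# Matches Qualimap genome_results.txt lines such as:
#   "There is a 99.59% of reference with a coverageData >= 1X"
# ASSUMPTION: the wording matches the lines pasted in the question.
PATTERN = re.compile(
    r"There is a\s+([\d.]+)%\s+of reference with a coverageData\s*>=\s*(\d+)X"
)


def breadth_by_threshold(text):
    """Return {minimum_depth: percent_of_reference_covered} from report text."""
    return {int(depth): float(pct) for pct, depth in PATTERN.findall(text)}


# Three of the lines from the question, used as sample input.
report = """\
There is a 99.59% of reference with a coverageData >= 1X
There is a 99.56% of reference with a coverageData >= 2X
There is a 99.31% of reference with a coverageData >= 10X
"""

breadth = breadth_by_threshold(report)
print(f"Reference covered by at least one read: {breadth[1]}%")
```

Read this way, 99.59% of the reference has at least one read on it, and 99.31% has at least ten.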

0
Percentage at the genome level. I'm a little confused. Can you tell me how to do that (use the information above to calculate the percentage of coverage)? Sorry for asking so many questions.

1
percentage at Base level

Have you given this some thought? Is that a valid thing to be looking for?

If you have 10 reads covering a particular base, would you define coverage as 1000%, or would you rather stick with the 10x (fold) number?

0
I'm sorry, I meant at the genome level, because I want to know how much of the reference genome my reads cover.

0
Then please read through: Tools To Calculate Average Coverage For A Bam File?

samtools depth and mosdepth would be good to start with (there are plenty of other suggestions there). Take a look at the in-line help for these tools, since they can provide a summary at different levels.
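As a rough illustration of what "how much of the genome is covered" could look like with samtools depth: the sketch below counts positions with depth of at least 1 in per-base depth output. It assumes the usual three tab-separated columns (reference name, position, depth) and that the file was produced with `samtools depth -a`, so that zero-depth positions are listed; without them the percentage would be overestimated. The function name and the sample values are made up for illustration.

```python
def breadth_of_coverage(depth_lines, min_depth=1):
    """Percent of reported positions whose depth is >= min_depth.

    Expects the three tab-separated columns printed by `samtools depth`:
    reference name, 1-based position, depth.
    """
    total = 0
    covered = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth = line.split("\t")[:3]
        total += 1
        if int(depth) >= min_depth:
            covered += 1
    return 100.0 * covered / total if total else 0.0


# Tiny made-up example: 5 positions, 4 of them covered at least once.
example = [
    "contig1\t1\t3",
    "contig1\t2\t0",
    "contig1\t3\t12",
    "contig1\t4\t1",
    "contig1\t5\t7",
]
print(breadth_of_coverage(example))     # 80.0
print(breadth_of_coverage(example, 5))  # 40.0
```

On real data the same function can be fed an open file handle, for example the output of `samtools depth -a aln.bam > depth.txt` (both file names are placeholders).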

% coverage at genome level needs to be looked at in two ways:

1. Theoretical coverage - If your genome consists of X bases and you sequenced a total of Y bases then you have Y/X coverage (convert to % if you must)
2. Coverage determined by one of the tools above - It may give you a number at whatever level, but keep in mind that coverage is not going to be uniform. Some areas may be covered more than others.
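Point 1 can be sketched with the numbers already posted in the question: X is the sum of the contig lengths in the idxstats output (second column), and Y is the number of reads times the read length. The read length is not given anywhere in the thread, so the 150 bp used here is purely an assumption; swap in the real value. The helper names are hypothetical.

```python
# samtools idxstats columns: reference name, sequence length,
# mapped reads, unmapped reads. Values copied from the question.
IDXSTATS = """\
NZ_ALWU01000001.1\t581415\t2031429\t6451
NZ_ALWU01000002.1\t43553\t76489\t678
NZ_ALWU01000003.1\t29286\t117672\t342
NZ_ALWU01000004.1\t37537\t144448\t440
NZ_ALWU01000005.1\t217837\t789901\t2467
NZ_ALWU01000006.1\t38235\t103338\t325
NZ_ALWU01000007.1\t9944\t45471\t292
NZ_ALWU01000008.1\t178611\t651422\t2190
NZ_ALWU01000009.1\t17047\t6352\t42
NZ_ALWU01000010.1\t510276\t1782695\t5606
*\t0\t0\t390492
"""


def genome_length(idxstats_text):
    """Sum the sequence lengths, skipping the '*' row for unplaced reads."""
    total = 0
    for line in idxstats_text.splitlines():
        name, length, _mapped, _unmapped = line.split("\t")
        if name != "*":
            total += int(length)
    return total


def theoretical_coverage(n_reads, read_length, genome_len):
    """Fold coverage Y/X, where Y = n_reads * read_length and X = genome_len."""
    return n_reads * read_length / genome_len


X = genome_length(IDXSTATS)
TOTAL_READS = 6158542  # "in total" line of the flagstat output above
READ_LENGTH = 150      # ASSUMPTION: not stated in the thread, replace with yours

fold = theoretical_coverage(TOTAL_READS, READ_LENGTH, X)
print(f"Genome length X = {X} bp")
print(f"Theoretical coverage Y/X = {fold:.1f}x")
```

Note that this gives a fold (depth) value, not the fraction of the genome that is covered; for the latter, the ">= 1X" style numbers from the tools are the ones to look at.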