How to determine the percentage of coverage
4.0 years ago
Bioinfo ▴ 20

Hello Biostars, I hope you're doing well. I mapped my reads against the reference genome to get an idea of the coverage. What I want to know is the percentage of coverage, but I don't know how to obtain it with the tools I know. In Qualimap I only found the mean coverage and the standard deviation of coverage, plus lines like this:

There is a 99.59% of reference with a coverageData >= 1X
There is a 99.56% of reference with a coverageData >= 2X
There is a 99.53% of reference with a coverageData >= 3X
There is a 99.49% of reference with a coverageData >= 4X
There is a 99.45% of reference with a coverageData >= 5X
There is a 99.42% of reference with a coverageData >= 6X
There is a 99.39% of reference with a coverageData >= 7X
There is a 99.37% of reference with a coverageData >= 8X
There is a 99.34% of reference with a coverageData >= 9X
There is a 99.31% of reference with a coverageData >= 10X
There is a 99.3% of reference with a coverageData >= 11X
There is a 99.27% of reference with a coverageData >= 12X

..

I don't know whether the percentage of coverage can be derived from this information, and if so, how. I also ran bamtools stats and obtained the following:

Total reads:       6158542
Mapped reads:      5749217  (93.3535%)
Forward strand:    3286294  (53.3616%)
Reverse strand:    2872248  (46.6384%)
Failed QC:         0    (0%)
Duplicates:        0    (0%)
Paired-end reads:  6158542  (100%)
'Proper-pairs':    4667096  (75.7825%)
Both pairs mapped: 5730384  (93.0477%)
Read 1:            3079271
Read 2:            3079271
Singletons:        18833    (0.305803%)

Again, this did not give me the percentage of coverage, and neither did samtools flagstat:

6158542 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
5749217 + 0 mapped (93.35% : N/A)
6158542 + 0 paired in sequencing
3079271 + 0 read1
3079271 + 0 read2
4667096 + 0 properly paired (75.78% : N/A)
5730384 + 0 with itself and mate mapped
18833 + 0 singletons (0.31% : N/A)
83794 + 0 with mate mapped to a different chr
12721 + 0 with mate mapped to a different chr (mapQ>=5)

and samtools idxstats:

NZ_ALWU01000001.1   581415  2031429 6451
NZ_ALWU01000002.1   43553   76489   678
NZ_ALWU01000003.1   29286   117672  342
NZ_ALWU01000004.1   37537   144448  440
NZ_ALWU01000005.1   217837  789901  2467
NZ_ALWU01000006.1   38235   103338  325
NZ_ALWU01000007.1   9944    45471   292
NZ_ALWU01000008.1   178611  651422  2190
NZ_ALWU01000009.1   17047   6352    42
NZ_ALWU01000010.1   510276  1782695 5606
*   0   0   390492

Can you please tell me how to get the percentage of coverage? Thank you very much.

sequencing alignment assembly • 2.5k views

Tools To Calculate Average Coverage For A Bam File?

Please use the search function and Google. There are dozens of threads on this. You can be almost certain that at least one Biostars thread exists for every routine analysis question ;-)


Qualimap is also a tool which does that.


I used Qualimap and obtained the information mentioned above in genome_results.txt. Could you please tell me where I can find the percentage of the average?


Thank you for your reply, I will check the link. I looked at the different tools and tried them, but I didn't obtain the information I was looking for, which is why I asked the question.


can you tell me please how to get the percentage of the coverage ?

Percentage at what level? Base, chromosome or genome? You will need to calculate that yourself using information you have already found with tools you listed above.
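For what it's worth, the Qualimap lines pasted in the question may already hold a genome-level figure: the ">= 1X" line reads as the share of reference positions covered by at least one read (breadth of coverage). Below is a minimal Python sketch under that assumption. The lines are copied from the question, and the function name `breadth_by_depth` is made up for illustration; it is not part of Qualimap.

```python
import re

# Lines copied from the genome_results.txt excerpt quoted in the question.
QUALIMAP_LINES = """\
There is a 99.59% of reference with a coverageData >= 1X
There is a 99.56% of reference with a coverageData >= 2X
There is a 99.53% of reference with a coverageData >= 3X
There is a 99.3% of reference with a coverageData >= 11X
"""

# Matches the percentage and the minimum depth on each such line.
PATTERN = re.compile(
    r"There is a ([\d.]+)% of reference with a coverageData >= (\d+)X"
)


def breadth_by_depth(text):
    """Map each minimum depth to the percent of the reference covered at that depth or more."""
    return {int(m.group(2)): float(m.group(1)) for m in PATTERN.finditer(text)}


breadth = breadth_by_depth(QUALIMAP_LINES)
# Assumption: the ">= 1X" entry is the genome-level breadth of coverage.
print(f"Covered by at least one read: {breadth[1]}% of the reference")
```

To run it on a full report, read the file contents into a string and pass that to the function instead of the pasted excerpt.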


Percentage at the genome level. I'm a little confused. Can you tell me how to do that (use the information above to calculate the percentage of coverage)? Sorry for asking so many questions.


percentage at Base level

Have you given this some thought? Is that a valid thing to be looking for?

If you have 10 reads covering a particular base, would you define coverage as 1000%, or would you rather stick with the 10x (fold) number?


I'm sorry, I meant at the genome level, because I want to know how much of the reference genome my reads cover.


Then please read through: Tools To Calculate Average Coverage For A Bam File?

samtools depth and mosdepth would be good to start with (there are plenty of other suggestions there). Take a look at the in-line help for these tools, since they can provide a summary at different levels.
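As a rough illustration of the samtools depth route, the sketch below summarizes per-base output into a breadth percentage and a mean depth. It assumes the three tab-separated columns that samtools depth writes (contig, position, depth) and that the file was produced with `-a`, so zero-depth positions are included. The function name `summarize_depth` and the example records are hypothetical.

```python
def summarize_depth(lines, min_depth=1):
    """Summarize `samtools depth -a` records (tab-separated: contig, position, depth)."""
    total_positions = 0
    covered_positions = 0
    depth_sum = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        depth = int(fields[2])
        total_positions += 1
        depth_sum += depth
        if depth >= min_depth:
            covered_positions += 1
    if total_positions == 0:
        raise ValueError("no depth records found")
    return {
        # Percent of reported positions with at least `min_depth` reads.
        "breadth_percent": 100.0 * covered_positions / total_positions,
        # Average depth over all reported positions.
        "mean_depth": depth_sum / total_positions,
    }


# Made-up records, only to show the input format.
example = [
    "contig1\t1\t10\n",
    "contig1\t2\t0\n",
    "contig1\t3\t5\n",
    "contig1\t4\t5\n",
]
print(summarize_depth(example))
```

With real data, one way to use it would be to run `samtools depth -a sample.bam > depth.txt` (the file names are placeholders) and pass the open file handle to `summarize_depth`. Without `-a`, uncovered positions are missing from the output and the breadth would be overestimated. The numbers also depend on whatever read filtering samtools depth applies by default.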

% coverage at genome level needs to be looked at in two ways:

  1. Theoretical coverage - If your genome consists of X bases and you sequenced a total of Y bases, then you have Y/X coverage (convert to % if you must).
  2. Coverage determined by one of the tools above - It may give you a number at whatever level, but keep in mind that coverage is not going to be uniform. Some areas may be covered more than others.
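
The Y/X calculation in point 1 could be sketched in Python as follows. X is taken as the sum of the contig lengths in the idxstats output from the question (second column) and the read count comes from the flagstat output. The read length is not given anywhere in the thread, so the 100 bp used here is purely a placeholder assumption, and the function names are made up for illustration.

```python
# samtools idxstats columns: contig name, contig length, mapped reads, unmapped reads.
# Rows copied from the question.
IDXSTATS = """\
NZ_ALWU01000001.1 581415 2031429 6451
NZ_ALWU01000002.1 43553 76489 678
NZ_ALWU01000003.1 29286 117672 342
NZ_ALWU01000004.1 37537 144448 440
NZ_ALWU01000005.1 217837 789901 2467
NZ_ALWU01000006.1 38235 103338 325
NZ_ALWU01000007.1 9944 45471 292
NZ_ALWU01000008.1 178611 651422 2190
NZ_ALWU01000009.1 17047 6352 42
NZ_ALWU01000010.1 510276 1782695 5606
* 0 0 390492
"""


def genome_length(idxstats_text):
    """Sum the contig lengths (column 2), skipping the '*' row for unplaced reads."""
    total = 0
    for line in idxstats_text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] == "*":
            continue
        total += int(fields[1])
    return total


def theoretical_coverage(total_reads, read_length, genome_len):
    """Return Y/X as fold coverage, where Y = total_reads * read_length."""
    return total_reads * read_length / genome_len


TOTAL_READS = 6158542      # "in total" line of the flagstat output in the question
ASSUMED_READ_LENGTH = 100  # hypothetical; replace with the real read length

x_bases = genome_length(IDXSTATS)
fold = theoretical_coverage(TOTAL_READS, ASSUMED_READ_LENGTH, x_bases)
print(f"Genome length X = {x_bases} bp")
print(f"Theoretical coverage Y/X = {fold:.1f}x ({fold * 100:.0f}%)")
```

This is the theoretical figure only. It says nothing about how evenly the reads are spread, which is the caveat in point 2.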