Hello Biostar, I hope you're doing well. I mapped my reads against the reference genome to get an idea of the coverage. What I want to know is the percentage of coverage, but I don't know how to obtain this information using the different tools I know. In Qualimap, I only found the mean coverage and the standard deviation of the coverage, plus lines like this:
There is a 99.59% of reference with a coverageData >= 1X
There is a 99.56% of reference with a coverageData >= 2X
There is a 99.53% of reference with a coverageData >= 3X
There is a 99.49% of reference with a coverageData >= 4X
There is a 99.45% of reference with a coverageData >= 5X
There is a 99.42% of reference with a coverageData >= 6X
There is a 99.39% of reference with a coverageData >= 7X
There is a 99.37% of reference with a coverageData >= 8X
There is a 99.34% of reference with a coverageData >= 9X
There is a 99.31% of reference with a coverageData >= 10X
There is a 99.3% of reference with a coverageData >= 11X
There is a 99.27% of reference with a coverageData >= 12X
..
I don't know whether it is possible to obtain the percentage of coverage from this information, and if so, how. I also ran bamtools stats and obtained this:
Total reads: 6158542
Mapped reads: 5749217 (93.3535%)
Forward strand: 3286294 (53.3616%)
Reverse strand: 2872248 (46.6384%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 6158542 (100%)
'Proper-pairs': 4667096 (75.7825%)
Both pairs mapped: 5730384 (93.0477%)
Read 1: 3079271
Read 2: 3079271
Singletons: 18833 (0.305803%)
Again, this did not give me the percentage of coverage, and neither did samtools flagstat:
6158542 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
5749217 + 0 mapped (93.35% : N/A)
6158542 + 0 paired in sequencing
3079271 + 0 read1
3079271 + 0 read2
4667096 + 0 properly paired (75.78% : N/A)
5730384 + 0 with itself and mate mapped
18833 + 0 singletons (0.31% : N/A)
83794 + 0 with mate mapped to a different chr
12721 + 0 with mate mapped to a different chr (mapQ>=5)
and samtools idxstats:
NZ_ALWU01000001.1 581415 2031429 6451
NZ_ALWU01000002.1 43553 76489 678
NZ_ALWU01000003.1 29286 117672 342
NZ_ALWU01000004.1 37537 144448 440
NZ_ALWU01000005.1 217837 789901 2467
NZ_ALWU01000006.1 38235 103338 325
NZ_ALWU01000007.1 9944 45471 292
NZ_ALWU01000008.1 178611 651422 2190
NZ_ALWU01000009.1 17047 6352 42
NZ_ALWU01000010.1 510276 1782695 5606
* 0 0 390492
Can you please tell me how to get the percentage of coverage? Thank you very much.
Tools To Calculate Average Coverage For A Bam File?
Please use the search function and Google. There are dozens of threads related to this. You can be almost certain that at least one Biostars thread exists for every routine analysis question ;-)
Qualimap is also a tool which does that.
I used Qualimap and obtained the information mentioned above in genome_results.txt. Can you please tell me where I can find the percentage of coverage?
Thank you for your reply, I will check the link. I looked at the different tools and tried them, but I didn't obtain the information I am looking for; that's why I asked the question.
Percentage at what level? Base, chromosome or genome? You will need to calculate that yourself using information you have already found with tools you listed above.
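For the genome level, the Qualimap lines quoted in the question are probably the closest thing to a "percentage of coverage": the value reported for ">= 1X" is the share of reference bases covered by at least one read. Below is a minimal Python sketch for pulling those values out of the text. It assumes the lines keep exactly the wording shown in the question; the function name `parse_breadth` and the `sample` string are made up for illustration.

```python
import re

# Matches Qualimap-style lines such as:
#   There is a 99.59% of reference with a coverageData >= 1X
LINE_RE = re.compile(
    r"There is a\s+([\d.]+)%\s+of reference with a coverageData\s*>=\s*(\d+)X"
)

def parse_breadth(text):
    """Return {minimum depth: percent of reference covered at that depth}."""
    return {int(depth): float(pct) for pct, depth in LINE_RE.findall(text)}

# A few lines copied from the question (hypothetical variable name).
sample = """\
There is a 99.59% of reference with a coverageData >= 1X
There is a 99.56% of reference with a coverageData >= 2X
There is a 99.3% of reference with a coverageData >= 11X
"""

breadth = parse_breadth(sample)
print(f"Reference covered by at least one read: {breadth[1]}%")
```

To run it on a real report, read the file contents (for example genome_results.txt) into a string and pass that to `parse_breadth`.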
Percentage at the genome level. I'm a little bit confused. Can you tell me how to do that (use the information above to calculate the percentage of coverage)? Sorry for asking so many questions.
Have you given this some thought? Is that a valid thing to be looking for?
If you have 10 reads covering a particular base, would you define coverage as 1000%, or would you rather stick with the 10x (fold) number?
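To make the distinction concrete, here is a small Python sketch that computes both numbers from per-base depths: the percentage of positions covered at least once (breadth) and the mean fold coverage (depth). It assumes tab-separated input in the three-column chrom/pos/depth layout that `samtools depth -a` writes, with every reference position present, including zero-depth ones. The contig name `ctg1`, the toy values, and both function names are invented for illustration.

```python
def coverage_summary(depths, min_depth=1):
    """Return (breadth in %, mean depth) for a list of per-base depths."""
    total = len(depths)
    if total == 0:
        return 0.0, 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    breadth_pct = 100.0 * covered / total   # share of positions with >= min_depth reads
    mean_depth = sum(depths) / total        # average number of reads per position
    return breadth_pct, mean_depth

def depths_from_depth_text(text):
    """Parse 'chrom<TAB>pos<TAB>depth' lines into a list of integer depths."""
    return [int(line.split("\t")[2]) for line in text.splitlines() if line.strip()]

# Toy data: four positions, one of them uncovered.
toy = "ctg1\t1\t10\nctg1\t2\t10\nctg1\t3\t0\nctg1\t4\t4\n"

breadth, mean = coverage_summary(depths_from_depth_text(toy))
print(f"breadth: {breadth:.1f}%  mean depth: {mean:.1f}x")
```

A base covered by 10 reads counts once toward breadth and contributes 10 to the mean depth, which is why breadth can never exceed 100% while depth is reported as a fold value.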
I'm sorry, I mean at the genome level, because I want to know how much of the reference genome my reads covered.
Then please read through: Tools To Calculate Average Coverage For A Bam File?
samtools depth and mosdepth would be good to start with (there are plenty of other suggestions there). Take a look at the in-line help for these tools, since they can provide a summary at different levels. % coverage at the genome level needs to be looked at in two ways: