Question: Count The Coverage Rate Of A Bacterial Genome'S Bam File
0
gravatar for lanyuhao1991
5.6 years ago by
lanyuhao199110
China
lanyuhao199110 wrote:

I have the bam file. I want to count the coverage rate of a bacterial genome. For example, the size of a bacterial genome is 10000 bp. Reads have been mapped to the genome. I use 'samtool depth' and I get 6000 Nucleotide site have been coveraged. So the coverage rate is 60%. But I find some same reads is mapped to different site of the genome, so 60% is high. How can I handle these reads that are mapped to different site of the genome?

genome bam coverage map • 2.3k views
ADD COMMENTlink modified 5.6 years ago by Fidel1.9k • written 5.6 years ago by lanyuhao199110
2
gravatar for Fidel
5.6 years ago by
Fidel1.9k
Germany
Fidel1.9k wrote:

How did yo map the reads? The method to remove multi-reads is specific for the aligner used. You may want to check this answer: A: Best way to get truly unique reads in bowtie/SAM?

ADD COMMENTlink written 5.6 years ago by Fidel1.9k
0
gravatar for cnluzon
5.6 years ago by
cnluzon30
Spain
cnluzon30 wrote:

I guess it depends on what you are trying to do with these alignments. If it is only an estimation of the coverage you are getting, the first thing I would do would be counting the number of reads that are actually mapped to multiple places. This can be easily done looking at the reads' IDs.

If the percentage of reads mapped to several places is low, then I guess you should not worry too much. However, if it is high the question that rises is how does that affect your results? What I mean is that if it is very important that these reads are properly mapped (I do not know what you want to do, if it is for instance variants what you are trying to identify, this might be an issue) then maybe you should not count on these "ambiguous" reads and compute the coverage counting only the well mapped reads. This would yield a more reliable coverage value. If you are going to make some experiment decisions based on this and it is possible for you, I would try to reach the needed coverage without counting on the "ambiguous" reads.

Otherwise (if this is not a really big issue) you could try to see if your mapper has an "map only to one place" option or "remove ambiguous reads", which they sometimes have.

Hope this helped.

ADD COMMENTlink written 5.6 years ago by cnluzon30
0
gravatar for Jorge Amigo
5.6 years ago by
Jorge Amigo11k
Santiago de Compostela, Spain
Jorge Amigo11k wrote:

reads that map to several positions on the genome are usually assigned a mapping quality value of 0, therefore you could try removing those reads while calculating the coverage depth by considering a minimum mapping quality of 1:

samtools depth -Q 1 your_bam_file > final_coverage.txt
ADD COMMENTlink modified 5.6 years ago • written 5.6 years ago by Jorge Amigo11k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1440 users visited in the last hour