Calculating coverage of genes which have accounted for multiple best hits from bam/sam files
1
0
Entering edit mode
6.7 years ago
cwwong13 ▴ 40

I am new to bioinformatics and currently working on a project need to calculate the coverage of each gene from a metagenomic data set. However, there are several questions cannot be solved after I went through many posts from this platform and other websites (including the manuals of samtools, bwa, bedtools)...

My goal is to:

  1. Map metagenomic sequencing data with bwa aln (it is based on one of the references' method) against a custom reference genome
  2. Calculate the coverage of each gene from the reference genome

However, for the second step, I expect that some of the reads will be mapped to multiple regions in the reference genome. Therefore, I would like to ask are there any method to get the coverage that is weighted by the best tied hits? In other words, if a read mapped to 2 regions in total (these two regions have same mapping score), they both contribute 50% when calculating coverage.

I found that there is a X0 flag in the sam output from bwa, and I previously use this to retrieve how many times do that sequence mapped. However, when I was using bedtools, it seems to discard all alternative hits that have the same score (it count only the first region showed in sam file in standard output).

I am not sure whether my work flow is alright, thus it would be great if you could provide some suggestions (any parameters or flags should be used during samtools, bedtools, bwa etc) on my current pipeline or suggest new tools that can accomplish the tasks. Thank you!

alignment next-gen bwa samtools bedtools • 1.8k views
ADD COMMENT
0
Entering edit mode
6.7 years ago
Michael 54k

That is caused BWA which outputs only one of multiple hits with equal scores. It uses the X0 tag to tell you the number of total hits, but I do not know a way to extract all matching positions. You have to use a different aligner, if you need this information in the sam/bam-output.

ADD COMMENT

Login before adding your answer.

Traffic: 2308 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6