Question: How to include insertions and deletions in the consensus sequence from BWA reference mapping/assembly
2.7 years ago by
liaoyunshi0 wrote:

Many thanks for reading my post.

After sequencing (NGS), I use BWA for reference mapping/assembly, aligning the short reads to the reference genome, and then use samtools and bcftools to get the consensus sequence. I found that the consensus sequence has the same length as the reference, with no gaps. But there should actually be gaps between the sequences. It seems that BWA and the downstream tools cannot incorporate indels when generating the consensus sequence.

For insertions in the reads, it seems BWA cannot insert gaps into the reference sequence, so the insertion information is excluded from the consensus sequence.

For deletions in the reads, it seems BWA does not treat gaps as a character when counting the consensus character at each site.
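To make concrete what "treating gaps as a character" would mean, here is a minimal Python sketch of a per-column majority vote in which the gap symbol is counted like any base. It is only an illustration of the idea, not how BWA, samtools, or bcftools work internally; the function name `column_consensus` and the toy reads are made up for this example, and the reads are assumed to be already padded to the same length.

```python
from collections import Counter


def column_consensus(aligned_reads, gap="-"):
    """Majority vote per column; the gap symbol is counted like any base.

    Hypothetical helper for illustration only. Assumes every row is
    already padded to the same length.
    """
    if not aligned_reads:
        raise ValueError("need at least one aligned read")
    length = len(aligned_reads[0])
    if any(len(row) != length for row in aligned_reads):
        raise ValueError("all rows must have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) returns [(symbol, count)]; on a tie the symbol
        # seen first in the column wins.
        symbol, _count = Counter(column).most_common(1)[0]
        consensus.append(symbol)

    gapped = "".join(consensus)
    # Dropping the gap symbol gives the sequence with deletions applied.
    return gapped, gapped.replace(gap, "")


# Toy data: two of three reads carry a deletion in the fourth column.
reads = [
    "ACG-TA",
    "ACG-TA",
    "ACGTTA",
]
gapped, ungapped = column_consensus(reads)
print(gapped)
print(ungapped)
```

With the gap counted as a vote, the deletion wins in the fourth column, so the ungapped consensus comes out one base shorter than the rows it was built from.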

So I would like to know whether any reference mapping/assembly tool can incorporate the indels between the reads and the reference sequence when generating the consensus sequence. If not, it seems the resulting sequence would be a "wrong" one. Right?
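As a toy illustration of what "incorporating indels" into a consensus involves, the Python sketch below applies a list of variants to a reference string. Everything in it is assumed for the example: the function name `apply_variants`, the 0-based `(position, ref_allele, alt_allele)` tuple layout, and the sequences. It does not reproduce any particular tool.

```python
def apply_variants(reference, variants):
    """Apply non-overlapping variants to a reference string.

    Hypothetical helper for illustration only. Each variant is a tuple
    (position, ref_allele, alt_allele) with a 0-based position.
    """
    pieces = []
    cursor = 0
    for pos, ref_allele, alt_allele in sorted(variants):
        if pos < cursor:
            raise ValueError("variants overlap")
        if reference[pos:pos + len(ref_allele)] != ref_allele:
            raise ValueError(f"reference allele mismatch at position {pos}")
        pieces.append(reference[cursor:pos])  # unchanged stretch
        pieces.append(alt_allele)             # substituted allele
        cursor = pos + len(ref_allele)
    pieces.append(reference[cursor:])         # tail after the last variant
    return "".join(pieces)


reference = "ACGTACGTAC"
variants = [
    (2, "G", "GT"),   # insertion: one extra T after the G at index 2
    (6, "GTA", "G"),  # deletion: the TA after the G at index 6 is removed
]
consensus = apply_variants(reference, variants)
print(reference, len(reference))
print(consensus, len(consensus))
```

Once indels are applied, the consensus no longer has to match the reference length: here a 1-base insertion and a 2-base deletion leave it one base shorter.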

If I have misunderstood anything about BWA or the other tools, please feel free to correct me. I am a newbie in NGS, so your help would be important to me. Thanks.

2.7 years ago by
Hannover Medical School
colindaven2.1k wrote:

You are taking an alignment approach rather than an assembly approach.

If you want a true consensus, you would be better off using a de novo assembler. BWA et al. will find short indels, but a de novo approach (especially with long reads, plus short Illumina reads to improve the base quality) is much more likely to give you what you want.
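For reference, short indels found by an aligner show up in the CIGAR string of each SAM record, as `I` (insertion relative to the reference) and `D` (deletion relative to the reference) operations. The Python sketch below lists them for a single alignment. The function name `indels_from_cigar` and the example CIGAR string are made up for illustration; the position is assumed to be the 1-based leftmost reference position, as in the SAM `POS` field.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_from_cigar(pos, cigar):
    """List (kind, reference_position, length) for each indel in a CIGAR.

    Hypothetical helper for illustration only. `pos` is the 1-based
    leftmost reference position of the alignment.
    """
    events = []
    ref_pos = pos
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "I":
            # Inserted bases sit just before ref_pos and consume no reference.
            events.append(("insertion", ref_pos, length))
        elif op == "D":
            # Deleted bases cover ref_pos .. ref_pos + length - 1.
            events.append(("deletion", ref_pos, length))
            ref_pos += length
        elif op in "MN=X":
            # These operations consume reference bases.
            ref_pos += length
        # S, H and P consume no reference bases.
    return events


events = indels_from_cigar(100, "10M2I5M3D20M")
for event in events:
    print(event)
```

In this made-up alignment, the read carries a 2-base insertion before reference position 110 and a 3-base deletion starting at position 115.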

If I understand you correctly, I would suggest reading up on de novo assembly on this forum and in the literature.
