How to include insertions and deletions in the consensus sequence from BWA reference mapping/assembly
6.7 years ago
liaoyunshi • 0

Many thanks for reading my post.

After sequencing, I use BWA for reference mapping/assembly to align the short reads to the reference genome, and then use samtools and bcftools to generate the consensus sequence. I found that the consensus sequence has exactly the same length as the reference, with no gaps, although there should be indels between my sequences and the reference. It seems that BWA and the downstream tools do not incorporate indels when generating the consensus sequence.

For insertions in the reads, it seems BWA cannot insert gaps into the reference sequence, so the inserted bases are left out of the consensus sequence.

For deletions in the reads, it seems BWA does not treat a gap as a character when counting the consensus base at a site.

So I would like to know whether any reference mapping/assembly tool can incorporate the indels between the reads and the reference sequence when generating the consensus sequence. If not, the resulting sequence would be "wrong", right?

If I have misunderstood BWA or the other tools, please feel free to correct me. I am new to NGS, so your help would mean a lot. Thanks.

sequencing next-gen genome alignment Assembly • 2.4k views
6.7 years ago

You are taking an alignment approach rather than an assembly approach.

If you want a true consensus, you would be better served by a de novo assembler. BWA and similar aligners will find short indels, but a de novo approach (especially one that combines long reads with short Illumina reads to improve base quality) is much more likely to give you what you want.
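As a rough illustration of what "finding short indels" means at the alignment level: in SAM/BAM output, each aligned read carries a CIGAR string in which `I` marks bases inserted relative to the reference and `D` marks reference bases missing from the read. The sketch below is only a toy parser, not part of BWA or samtools; the function name `cigar_lengths` and the example CIGAR strings are made up for illustration.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_lengths(cigar):
    """Return (reference_span, read_span, inserted_bases, deleted_bases)."""
    ref_span = read_span = inserted = deleted = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # match/mismatch: consumes read and reference
            ref_span += n
            read_span += n
        elif op == "I":      # insertion: bases present only in the read
            read_span += n
            inserted += n
        elif op == "D":      # deletion: reference bases absent from the read
            ref_span += n
            deleted += n
        elif op == "N":      # skipped reference region
            ref_span += n
        elif op == "S":      # soft clip: read bases kept but not aligned
            read_span += n
        # H (hard clip) and P (padding) consume neither sequence
    return ref_span, read_span, inserted, deleted

# A 100-base read with a 2-base insertion covers only 98 reference bases.
print(cigar_lengths("50M2I48M"))
# A 100-base read with a 3-base deletion covers 103 reference bases.
print(cigar_lengths("40M3D60M"))
```

So the indel information is in the alignment itself; whether it reaches the consensus depends on whether the downstream variant-calling and consensus steps use it.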

If I understand you correctly, I would suggest reading up on de novo assembly on this forum and in the literature.
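For the alignment-based route, it may help to see why a consensus that includes indels no longer has the same length as the reference. The toy sketch below applies VCF-style variant records (1-based position, REF allele, ALT allele) to a reference string; this is the general idea behind tools such as `bcftools consensus`, which applies called variants to a reference FASTA. The function `apply_variants`, the example sequence, and the variants are all made up for illustration, and the sketch assumes the variants do not overlap.

```python
def apply_variants(reference, variants):
    """Apply non-overlapping (pos, ref, alt) variants to a reference string.

    pos is 1-based, and ref/alt follow the VCF convention in which an indel
    includes the base before the event (e.g. G -> GTT is a 2-base insertion).
    """
    pieces = []
    cursor = 0  # 0-based index of the next reference base not yet copied
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError(f"variant at position {pos} overlaps a previous one")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele {ref!r} does not match reference at {pos}")
        pieces.append(reference[cursor:start])  # unchanged stretch before the variant
        pieces.append(alt)                      # substituted allele
        cursor = start + len(ref)               # skip the replaced reference bases
    pieces.append(reference[cursor:])           # unchanged tail
    return "".join(pieces)

reference = "ACGTACGTAC"
variants = [
    (3, "G", "GTT"),  # insertion of TT after position 3
    (7, "GT", "G"),   # deletion of the T at position 8
]
consensus = apply_variants(reference, variants)
print(len(reference), len(consensus))  # 10 11
print(consensus)                       # ACGTTTACGAC
```

If the consensus comes out exactly as long as the reference, one thing worth checking is whether the indels were actually called and present in the variant file that was applied.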
