Question: When should you left-align INDELs (and why?)
QVINTVS_FABIVS_MAXIMVS (USA SoCal) wrote 3 months ago:

Say I have two VCFs with 100 samples in each file. Each VCF was joint-called separately and now I want to merge the variant calls.

Do I need to left-align the INDELs in the merged VCF? I've used bcftools norm in the past and got odd results. It seems that vt is a better tool for this.

Is left-aligning only useful for common variants? If I'm interested in rare variants (AF < 0.5%), would left-alignment actually matter?

Thanks

Here's an example of bcftools norm output:

Original VCF

chr7    157009949       .       AGCGGCGGCGGCG   AGCGGCGGCGGCGGCGGCG,A,AGCGGCGGCGGCGGCGGCGGCG,AGCGGCGGCG,AGCGGCGGCGGCGGCGGCGGCGGCG,AGCGGCGGCGGCGGCG

Left-Aligned VCF (with multiallelics split into biallelic calls)

chr7    157009949       .       A       AGCGGCGGCG  
chr7    157009949       .       A       AGCGGCGGCGGCG
chr7    157009949       .       A       AGCGGCG
chr7    157009949       .       A       AGCG
chr7    157009949       .       A       AGCGGCGGCGGCGGCG   
chr7    157009949       .       AGCG    A
chr7    157009949       .       AGCGGCG A
chr7    157009949       .       AGCGGCGGCGGCG   A
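For readers following the example, here is a minimal Python sketch of the split-and-trim step applied to the multiallelic record quoted above. It is not what bcftools norm runs internally; the function name `split_and_trim` is made up for illustration, and the sketch deliberately avoids needing a reference by refusing to continue if an allele would have to be extended to the left. The output keeps position 157009949 for every record, which suggests the original record was already left-aligned and only needed splitting and trimming.

```python
def split_and_trim(pos, ref, alts):
    """Split one multiallelic record into biallelic (pos, ref, alt) tuples.

    Hypothetical helper: it only trims shared trailing bases and never
    consults a reference, so it is valid only when trimming stops at
    differing end bases (true for the record in this thread).
    """
    records = []
    for alt in alts:
        r, a = ref, alt
        # Drop shared trailing bases, keeping at least one base per allele.
        while len(r) > 1 and len(a) > 1 and r[-1] == a[-1]:
            r, a = r[:-1], a[:-1]
        # If the ends still match, real left-alignment would have to pull
        # bases from the reference, which this sketch does not do.
        if r != a and r[-1] == a[-1]:
            raise ValueError("shared suffix remains; a reference is needed")
        records.append((pos, r, a))
    return records


# The record from the question: REF is A followed by four GCG units,
# and each ALT is A followed by 6, 0, 7, 3, 8, or 5 GCG units.
REF = "A" + "GCG" * 4
ALTS = ["A" + "GCG" * n for n in (6, 0, 7, 3, 8, 5)]

for pos, r, a in split_and_trim(157009949, REF, ALTS):
    print("chr7", pos, ".", r, a, sep="\t")
```

This reproduces six of the eight lines in the bcftools output. The other two (A to AGCGGCGGCGGCGGCG, and AGCGGCG to A) cannot be derived from the ALT list as quoted, so they presumably come from alleles not shown in the question.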


Hello,

Could you please give an example of an "odd result" from bcftools norm?

fin swimmer

(comment by finswimmer, 3 months ago)

Edited the main query above.

(reply by QVINTVS_FABIVS_MAXIMVS, 3 months ago)
chrchang523 (United States) wrote 3 months ago:

Suppose one of your VCF files has a non-normalized insertion, represented as REF=AG, ALT=AGT, starting at position 99999, and another file has the same insertion represented as REF=G, ALT=GT, starting at position 100000. If you don't left-align and trim both files to the same representation, these may not be recognized as the same variant: a merge can emit them as two separate records, and downstream analysis will suffer.
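To make this concrete, here is a minimal Python sketch of the usual normalization procedure: trim shared trailing bases, pull in a base from the reference whenever an allele becomes empty, then trim shared leading bases. It is an illustration only, not the bcftools or vt implementation. The function name `normalize`, the toy reference string, and the small coordinates (positions 3 and 4 standing in for 99999 and 100000) are all assumptions made for the example.

```python
def normalize(pos, ref, alt, genome):
    """Return a left-aligned, trimmed (pos, ref, alt).

    pos is 1-based; genome is the contig sequence as a plain string
    (a toy stand-in for an indexed FASTA). Assumes the variant does not
    need to shift past the start of the contig.
    """
    if ref == alt:
        return pos, ref, alt  # not a variant; nothing to do
    changed = True
    while changed:
        changed = False
        # Trim one shared trailing base.
        if ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele emptied, extend both one base to the left.
        if not ref or not alt:
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy contig: position 3 is A, followed by four GCG repeat units.
genome = "CCAGCGGCGGCGGCGTT"

# The two representations from the answer above, at toy coordinates.
file1 = normalize(3, "AG", "AGT", genome)
file2 = normalize(4, "G", "GT", genome)
print(file1, file2, file1 == file2)

# A deletion of one GCG unit written at the right end of the repeat
# shifts all the way left to the A that precedes the repeat.
print(normalize(12, "GGCG", "G", genome))
```

After normalization both files key the insertion identically, so a merge keyed on position, REF, and ALT treats it as one variant. The second call shows why a reference is required: a deletion inside a repeat can be written at any repeat unit, and only left-alignment picks one canonical position.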


Thanks for the clear answer. Do you recommend vt or bcftools for normalization?

(comment by QVINTVS_FABIVS_MAXIMVS, 3 months ago)

Either will work (as long as you aren't using a very old bcftools version). The latest bcftools should be faster, especially if compiled with "libdeflate".
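For anyone scripting this, a small Python sketch of how the bcftools call might be assembled is shown here. The file names are placeholders and the helper `bcftools_norm_cmd` is hypothetical; the flags (`-f` for the reference FASTA, `-m -any` to split multiallelic sites, `-Oz` and `-o` for compressed output) are taken from the bcftools norm documentation. The sketch only builds and prints the command, it does not run it.

```python
import shlex


def bcftools_norm_cmd(in_vcf, ref_fasta, out_vcf):
    """Build an argv list for bcftools norm (hypothetical helper)."""
    return [
        "bcftools", "norm",
        "-f", ref_fasta,  # reference FASTA, required for left-alignment
        "-m", "-any",     # split multiallelic sites into biallelic records
        "-Oz",            # write bgzip-compressed VCF
        "-o", out_vcf,
        in_vcf,
    ]


# Placeholder file names; substitute your own paths.
cmd = bcftools_norm_cmd("cohort1.vcf.gz", "ref.fa", "cohort1.norm.vcf.gz")
print(shlex.join(cmd))
# To execute it, one could pass the list to subprocess.run(cmd, check=True).
```

A reasonable order of operations, under the assumption that both cohorts use the same reference build, is to normalize each joint-called VCF this way and only then merge them, so that identical indels already share one representation.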

(reply by chrchang523, 3 months ago)