Question: Multiple alleles in REF and ALT in VCF file
Asked 6 weeks ago by Volka:

Hi all, I was looking at some data in my VCF file and came across the line below:

20      62855516        20:62855516:GC:AC       GC      AC      .       .       PR;AC=8;AN=70   GT      0/0     0/0     0/0     0/0     0/1     0/1     0/1   0/0      0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/0     0/1     0/1     0/1     0/0     0/0   0/0      0/1     0/0     0/0     0/0     0/0     0/0     0/0

My question is: is this considered a multiallelic site, and how should I handle this entry? I also want to compare sites with another VCF, where the equivalent position has G in REF and A in ALT. Is there a way to clean the data so that only the first allele is considered for this entry?

I've tried to remove or fix these entries with bcftools view -m2 -M2 -v snps and with bcftools norm -m -any, but neither seems to catch it.

Thanks.

Tags: vcftools, bcftools, allele, indel, vcf

Instead of removing these entries, try decomposing the VCF with vt. See the section on decomposing biallelic block substitutions here: https://genome.sph.umich.edu/wiki/Vt
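
For reference, the posted record has a single ALT allele, so it is biallelic rather than multiallelic. REF and ALT simply share a trailing C, which is why GC>AC corresponds to G>A in the other VCF. The snippet below is only a minimal Python sketch of that trimming idea, not vt's actual implementation; the function names trim_alleles and is_multiallelic are made up for illustration. It does not split true multi-base substitutions (for example GC>AT) and does not left-align indels, so a dedicated tool is still the better choice for a whole file.

```python
def trim_alleles(pos, ref, alt):
    """Trim bases shared by REF and ALT, returning (pos, ref, alt).

    Hypothetical helper for illustration only. Shared trailing bases are
    removed first, then shared leading bases; each allele always keeps at
    least one base. POS is 1-based and moves right by one for every
    leading base removed.
    """
    # Right-trim: drop the common suffix (e.g. GC/AC -> G/A).
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Left-trim: drop the common prefix and shift the position.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def is_multiallelic(alt_field):
    """A VCF record is multiallelic only if ALT lists several alleles."""
    return "," in alt_field


# The record from the question: one ALT allele, so the site is biallelic.
print(is_multiallelic("AC"))               # False
print(trim_alleles(62855516, "GC", "AC"))  # (62855516, 'G', 'A')
```

After trimming, the record can be matched against the other VCF on chromosome, position, REF and ALT. Note that the ID column (20:62855516:GC:AC) would still carry the untrimmed alleles unless it is rewritten as well.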

Answer written 4 days ago by cpad0112