Question: How to Turn Multiple ALT alleles to Genotype Calls
2.4 years ago by
jimkozubek30 wrote:

I have a VCF file and many lines have multiple ALT alleles such as:

1       104159  .       CA      C,TA,TTT

I have an algorithm that takes genotype data in a 0,1,2 matrix form. I am wondering if there are any standards or best practices in how to turn VCF lines with multiple ALT alleles (0/0,0/1,0/2,1/2,2/3) into 0,1,2 genotype form.


Thanks for the pro tip!

written 2.4 years ago by jimkozubek30

That did not work for me. Could you share the solution if you have one?

If I use the command below to split them into biallelic records, it does not work correctly.

Another option would be to drop the multi-allelic sites altogether, but I would rather not. I would appreciate it if you could share the solution. Thanks!

modified 9 months ago • written 9 months ago by cannilay20
2.4 years ago by
Kevin Blighe67k
Republic of Ireland
Kevin Blighe67k wrote:

Yes, you can split multi-allelic calls into biallelic records with:

bcftools norm -Ov -m-any MyVariants.vcf > MyVariantsSplit.vcf ;

A useful addition is to also left-align and normalise indels against a chosen reference genome; this step also checks that each REF allele matches the reference. For example:

bcftools norm -Ov -m-any MyVariants.vcf | bcftools norm -Ov -f human_g1k_v37.fasta > MyVariantsSplitRefChecked.vcf ;
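Once the records are split, each line carries a single ALT, so the 0,1,2 coding asked about in the question is just a count of ALT alleles per genotype. Below is a minimal Python sketch of that step, not a standard implementation. The function names (gt_to_dosage, vcf_line_to_dosages) are made up for illustration, and treating any genotype with a missing allele as missing (None) is an assumption. How a split record encodes alleles that belong to the other ALTs of the original site can depend on the tool and version, so it is worth inspecting a few split lines before relying on the counts.

```python
import re
from typing import List, Optional


def gt_to_dosage(gt: str) -> Optional[int]:
    """Count ALT alleles in a biallelic GT string such as '0/1' or '1|1'.

    Returns None when any allele is missing ('.'); that is an
    assumption of this sketch, not a VCF rule.
    """
    alleles = re.split(r"[/|]", gt)  # handles unphased '/' and phased '|'
    if any(a == "." for a in alleles):
        return None
    # In a biallelic record every non-reference allele index is the one ALT.
    return sum(1 for a in alleles if a != "0")


def vcf_line_to_dosages(line: str) -> List[Optional[int]]:
    """Turn one tab-separated VCF data line into a row of 0/1/2 dosages."""
    fields = line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # FORMAT is the 9th column
    return [gt_to_dosage(s.split(":")[gt_index]) for s in fields[9:]]


# Hypothetical split record with four samples, for illustration only.
example = "1\t104159\t.\tCA\tC\t.\t.\t.\tGT\t0/0\t0/1\t1|1\t./."
print(vcf_line_to_dosages(example))
```

Applying vcf_line_to_dosages to every non-header line of the split VCF gives one row per variant, which can then be stacked into the variant-by-sample matrix the algorithm expects.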

