Question: How to Turn Multiple ALT alleles to Genotype Calls
12 months ago, jimkozubek wrote:

I have a VCF file and many lines have multiple ALT alleles such as:

1       104159  .       CA      C,TA,TTT

I have an algorithm that takes genotype data in 0,1,2 matrix form. Are there any standards or best practices for turning VCF lines with multiple ALT alleles (0/0, 0/1, 0/2, 1/2, 2/3) into 0,1,2 genotype form?


Thanks for the pro tip!

Comment written 12 months ago by jimkozubek
12 months ago, Kevin Blighe wrote:

Yes, you can split multi-allelic calls into separate biallelic records with bcftools norm:

bcftools norm -Ov -m-any MyVariants.vcf > MyVariantsSplit.vcf ;

A useful addition is to also left-align and normalise indels against a chosen reference genome, which additionally checks that each REF allele matches the reference, for example:

bcftools norm -Ov -m-any MyVariants.vcf | bcftools norm -Ov -f human_g1k_v37.fasta > MyVariantsSplitRefChecked.vcf ;
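
Once the records are split, each line has a single ALT allele, so the 0,1,2 value is just the number of ALT alleles in each sample's GT. Here is a minimal Python sketch of that last step. The function names `gt_to_dosage` and `dosage_row` are made up for illustration, and it assumes each split record only uses alleles 0, 1 and "." in the GT field; check how your bcftools version recodes genotypes such as 1/2 in the split records before relying on it.

```python
import re


def gt_to_dosage(gt):
    """Return the ALT allele count for one biallelic GT string.

    Returns None when any allele is missing ('.'), so the caller can
    decide how to encode missing data in the matrix.
    """
    alleles = re.split(r"[/|]", gt)  # handles unphased (/) and phased (|)
    if any(a == "." for a in alleles):
        return None
    return sum(1 for a in alleles if a == "1")


def dosage_row(vcf_line):
    """Convert one tab-separated VCF data line into a list of dosages."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; find where GT sits among its keys.
    gt_index = fields[8].split(":").index("GT")
    return [gt_to_dosage(sample.split(":")[gt_index]) for sample in fields[9:]]


if __name__ == "__main__":
    # Hypothetical split record with four samples.
    line = "1\t104159\t.\tCA\tC\t.\t.\t.\tGT\t0/0\t0/1\t1|1\t./."
    print(dosage_row(line))
```

Applying `dosage_row` to every non-header line of MyVariantsSplit.vcf gives one matrix row per split variant, with None marking missing genotypes.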



