Question: How to Turn Multiple ALT alleles to Genotype Calls
8 months ago, jimkozubek wrote:

I have a VCF file and many lines have multiple ALT alleles such as:

1       104159  .       CA      C,TA,TTT

I have an algorithm that takes genotype data in a 0,1,2 matrix form. I am wondering if there are any standards or best practices in how to turn VCF lines with multiple ALT alleles (0/0,0/1,0/2,1/2,2/3) into 0,1,2 genotype form.


Thanks for the pro tip!

written 8 months ago by jimkozubek
8 months ago, Kevin Blighe (Republic of Ireland) wrote:

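Yes, you can split multi-allelic records into one biallelic record per ALT allele with bcftools norm. Each resulting record then has a single ALT allele, so its genotypes can be coded as 0, 1, or 2 copies of that allele: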

bcftools norm -Ov -m-any MyVariants.vcf > MyVariantsSplit.vcf ;

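A useful addition is to also left-align and normalise indels against a chosen reference genome with -f, which checks that each REF allele matches the reference (by default bcftools exits with an error on a mismatch; see the -c / --check-ref option for other behaviours), for example: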

bcftools norm -Ov -m-any MyVariants.vcf | bcftools norm -Ov -f human_g1k_v37.fasta > MyVariantsSplitRefChecked.vcf ;
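
Once the records are split, turning each genotype into 0, 1, or 2 is a matter of counting ALT alleles. Below is a minimal Python sketch of that step, not a standard tool: the function names gt_to_dosage and vcf_line_to_dosages are made up for illustration, and it assumes diploid calls, tab-separated VCF data lines, and that any call containing a missing allele should be treated as missing. How the other ALT alleles of a split 1/2 genotype are written out depends on your bcftools version and options, so check a few such sites before trusting the counts.

```python
from typing import List, Optional


def gt_to_dosage(gt: str) -> Optional[int]:
    """Count ALT alleles in a GT string taken from a biallelic (split) record.

    Returns None when any allele is missing ('.'), which is an assumption
    made here for illustration; other conventions are possible.
    """
    # Phased (|) and unphased (/) genotypes are treated the same way.
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None
    # Any non-reference allele index counts as one ALT copy.
    return sum(1 for a in alleles if a != "0")


def vcf_line_to_dosages(line: str) -> List[Optional[int]]:
    """Convert one VCF data line into a list of per-sample dosages."""
    fields = line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; samples start at column 10 (index 9).
    gt_index = fields[8].split(":").index("GT")
    return [gt_to_dosage(sample.split(":")[gt_index]) for sample in fields[9:]]


if __name__ == "__main__":
    # Hypothetical split record with four samples.
    example = "1\t104159\t.\tCA\tC\t.\t.\t.\tGT:DP\t0/0:12\t0|1:9\t1/1:15\t./.:0"
    print(vcf_line_to_dosages(example))
```

Running vcf_line_to_dosages over every data line of MyVariantsSplit.vcf (skipping lines that start with #) gives one row of the 0,1,2 matrix per split variant.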

