Question: How to Turn Multiple ALT alleles to Genotype Calls
jimkozubek wrote, 21 months ago:

I have a VCF file and many lines have multiple ALT alleles such as:

1       104159  .       CA      C,TA,TTT

I have an algorithm that takes genotype data in a 0,1,2 matrix form. I am wondering if there are any standards or best practices in how to turn VCF lines with multiple ALT alleles (0/0,0/1,0/2,1/2,2/3) into 0,1,2 genotype form.


Thanks for the pro tip!

(comment by jimkozubek, 21 months ago)

This did not work for me. Could you share the solution if you have one?

If I use the command below to split them into biallelic records, it does not work correctly.

Another option might be to discard them, but I would rather not. I would appreciate it if you could share the solution. Thanks!

(comment by cannilay, 7 weeks ago)
Kevin Blighe wrote, 21 months ago:

Yes, you can split multi-allelic calls into separate biallelic records with:

bcftools norm -Ov -m-any MyVariants.vcf > MyVariantsSplit.vcf ;

A useful addition is to also left-align and normalise indels against a chosen reference genome, which also checks that each REF allele matches that reference, for example:

bcftools norm -Ov -m-any MyVariants.vcf | bcftools norm -Ov -f human_g1k_v37.fasta > MyVariantsSplitRefChecked.vcf ;

Kevin
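
Once the records are split, each line has a single ALT allele, so one possible way to get the 0,1,2 matrix is to count the copies of that ALT allele in each sample's GT field. The following is only a minimal sketch of that idea, not part of the answer above: the function names (gt_to_dosage, vcf_to_dosage_matrix) and the example records are hypothetical, and it assumes an uncompressed, tab-separated VCF whose FORMAT column contains GT.

```python
import re

def gt_to_dosage(gt, alt_index=1):
    """Count copies of ALT allele `alt_index` in a GT string like '0/1' or '1|1'.

    Returns None when any allele is missing ('.').
    """
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if int(allele) == alt_index)

def vcf_to_dosage_matrix(lines):
    """Build a variants x samples list of dosages from VCF text lines.

    Assumes the lines come from a file that was already split into
    biallelic records, so ALT allele 1 is the only ALT on each line.
    """
    matrix = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        gt_pos = fields[8].split(":").index("GT")  # position of GT in FORMAT
        matrix.append([gt_to_dosage(s.split(":")[gt_pos]) for s in fields[9:]])
    return matrix

# Hypothetical split records for three samples (illustration only).
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t104159\t.\tCA\tC\t.\t.\t.\tGT\t0/1\t1/1\t0/0",
    "1\t104159\t.\tCA\tTA\t.\t.\t.\tGT\t0/0\t0/0\t./.",
]
matrix = vcf_to_dosage_matrix(example)
print(matrix)  # [[1, 2, 0], [0, 0, None]]
```

Missing genotypes come out as None here, so they would still need to be imputed or filtered before going into an algorithm that expects only 0, 1 and 2. How genotypes that carry a different ALT allele are written after splitting can depend on the bcftools version and options, so it is worth checking a few split records by hand.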
