Randomly select one variant from heterozygous sites in VCF file
5.3 years ago
Dave Carlson ★ 1.7k

Hi Biostars,

Like the title suggests, I have a VCF file and I would like to take every heterozygous position in it and randomly select one of the two alleles to save to a new file. Is there an existing tool that will do this? So far in my searching, I have not found one. Or would I be better off writing my own script to do this?

Thanks for any suggestions!

vcf snp • 1.7k views

Hello,

Could you please provide an example of your desired output?

fin swimmer


Hi fin swimmer, sorry for the delayed reply. After seeing your question, I realized I wasn't sure what exactly the best option for the output would be. I think maybe turning the heterozygous variant into a homozygous site (either ref or variant) would be good.

So in other words, I would like to take each heterozygous site in my VCF:

20     14370   rs6054257 G      A       29   PASS   NS=3;DP=14;AF=0.5;DB;H2           GT 0/1

And then randomly set the genotype to be homozygous reference or homozygous variant. The result would be either:

20     14370   rs6054257 G      A       29   PASS   NS=3;DP=14;AF=0.5;DB;H2           GT 1/1

Or:

20     14370   rs6054257 G      A       29   PASS   NS=3;DP=14;AF=0.5;DB;H2           GT 0/0

Does that make sense? Thanks! Dave


Whether this makes sense is for you to judge. You should tell us more about why you want to do this.

bcftools provides a plugin called setGT, which can set genotypes based on criteria. However, it does not randomly set them to hom ref or hom alt. What should happen to your existing homozygous variants? Should they be included in the new output file?
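If no existing tool fits, a short script may be the simplest route. Below is a minimal sketch in Python of the behaviour described above: each diploid heterozygous genotype is collapsed to a homozygote of one of its two alleles, chosen at random. It rests on several assumptions that are not from the thread: the input is an uncompressed, tab-delimited VCF; GT is the first FORMAT key; homozygous, missing, and non-diploid genotypes are written through unchanged; and the function names (`randomize_het_gt`, `randomize_het_line`, `randomize_vcf`) are made up for illustration. Other fields such as AF in INFO are not recomputed.

```python
import random
import re

# Splits a diploid GT such as "0/1" or "1|2" into [allele, separator, allele].
GT_SPLIT = re.compile(r"([/|])")


def randomize_het_gt(sample_field, rng):
    """Collapse a diploid heterozygous GT to a random homozygote.

    Homozygous, missing, and non-diploid genotypes are returned unchanged
    (an assumption; adjust if those sites should be dropped instead).
    """
    gt, colon, rest = sample_field.partition(":")
    parts = GT_SPLIT.split(gt)
    if len(parts) != 3:
        return sample_field  # haploid or polyploid call
    first, phase, second = parts
    if first == second or "." in (first, second):
        return sample_field  # already homozygous, or (partly) missing
    chosen = rng.choice((first, second))
    return f"{chosen}{phase}{chosen}{colon}{rest}"


def randomize_het_line(line, rng):
    """Apply randomize_het_gt to every sample column of one VCF line."""
    if line.startswith("#"):
        return line  # header lines pass through untouched
    fields = line.rstrip("\n").split("\t")
    # Assumes GT is the first FORMAT key; lines without GT are left alone.
    if len(fields) < 10 or fields[8].split(":")[0] != "GT":
        return line
    fields[9:] = [randomize_het_gt(s, rng) for s in fields[9:]]
    return "\t".join(fields) + "\n"


def randomize_vcf(in_handle, out_handle, seed=None):
    """Stream a VCF from in_handle to out_handle, randomizing het sites."""
    rng = random.Random(seed)  # pass a seed for reproducible output
    for line in in_handle:
        out_handle.write(randomize_het_line(line, rng))
```

To run it on files, open the input and output and call `randomize_vcf(infile, outfile, seed=42)`; passing `sys.stdin` and `sys.stdout` lets it sit in a pipe. Fixing the seed makes the random choices repeatable between runs.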
