Question: Script that converts VCF file to reference allele format?
dktldkrh13 (United States) wrote 4.6 years ago:

Hi guys

I recently received a VCF file containing SNP information, but its alleles appear to have been assigned purely by frequency in the population: whichever allele was observed more often in my samples was placed in REF. As a result, the file has the major allele in REF and the minor alleles in ALT. Is there a shared script that can convert this file to the correct REF/ALT format using the reference genome?

RamRS (Houston, TX) wrote 4.6 years ago:

Unless I'm mistaken, the REF allele is determined by the reference sequence used at the alignment stage of the analysis. You might have to go back a couple of steps and re-align to the new reference sequence. This could, of course, change the pool of variants you're looking at right now.

I have not heard of changing references in VCF files. A hacky way of doing that could be to take every position in the VCF file, look up the corresponding base in the new reference sequence, and then use a script to add that base as a column in the VCF.

It's all plain text, so a combination of Bash and Perl/Python should help you.
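
The position-lookup idea above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a tested tool: the function names (`parse_fasta`, `flip_genotype`, `fix_ref_alt`) are made up for illustration, the whole reference is held in memory, and only biallelic SNPs are handled. Rather than adding a column, it swaps REF and ALT (and flips the 0/1 genotype indices) wherever the ALT allele matches the reference base. Sites where neither allele matches are left unchanged and reported, since those may be strand flips or a different genome build. INFO fields such as AC/AF are not recomputed.

```python
def parse_fasta(lines):
    """Read FASTA lines into a dict: sequence name -> uppercase sequence."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line.upper())
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


# Translation table that exchanges allele indices 0 and 1.
_SWAP_01 = str.maketrans("01", "10")


def flip_genotype(sample):
    """Swap allele indices 0 and 1 in the leading GT subfield of a sample."""
    gt, sep, rest = sample.partition(":")
    return gt.translate(_SWAP_01) + sep + rest


def fix_ref_alt(vcf_lines, ref_seqs):
    """Return (fixed_lines, mismatches) for an iterable of VCF lines.

    Assumption: ref_seqs maps the CHROM names used in the VCF to sequences.
    Records that are not biallelic SNPs are passed through untouched.
    """
    out, mismatches = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            out.append(line)  # header and blank lines pass through
            continue
        f = line.split("\t")
        chrom, pos = f[0], int(f[1])
        ref, alt = f[3].upper(), f[4].upper()
        seq = ref_seqs.get(chrom)
        if seq is None or pos > len(seq) or len(ref) != 1 or len(alt) != 1:
            out.append(line)  # unknown contig, out of range, or not a simple SNP
            continue
        base = seq[pos - 1]  # VCF positions are 1-based
        if base == ref:
            out.append(line)  # already consistent with the reference
        elif base == alt:
            f[3], f[4] = alt, ref
            # Flip genotypes only when GT is the first FORMAT key.
            if len(f) > 9 and f[8].split(":")[0] == "GT":
                f[9:] = [flip_genotype(s) for s in f[9:]]
            out.append("\t".join(f))  # note: INFO (AC/AF etc.) is now stale
        else:
            mismatches.append((chrom, pos))  # neither allele matches
            out.append(line)
    return out, mismatches
```

To run it on real, uncompressed files, something like `ref = parse_fasta(open("ref.fa"))` followed by `fixed, bad = fix_ref_alt(open("in.vcf"), ref)` should work; the file names here are placeholders.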


+1 "REF part is chosen based on the reference sequence"

written 4.6 years ago by Pierre Lindenbaum

My brain kinda goes bonkers after midnight, I guess. Did I phrase it weirdly? I do hope the phrasing did not make the underlying fact ambiguous.

written 4.6 years ago by RamRS