Set ancestral alleles to upper case in vcf file
3.7 years ago
spiral01 ▴ 100

I am trying to set the reference allele to the ancestral allele in 1000 Genomes VCF files. I can do this with the --derived option in vcftools. However, most of the ancestral alleles are in lowercase, so vcftools does not recognise them and cannot make the correction.

I am currently looking at extracting the ancestral alleles and converting them to upper case like this:

bcftools view -G -H file.vcf.gz | awk -F'[;=|]' '{for(i=1;i<=NF;i++)if($i=="AA"){print toupper($(i+1));next}}'

And then reinserting them.

This is quite a convoluted way of doing things. Does anyone have a tidier method?

EDIT:

Here is a single entry from the vcf file (with genotype info hidden):

11  128196  rs576393503 A   G   100 PASS    AC=453;AF=0.0904553;AN=5008;NS=2504;DP=5057;EAS_AF=0.0159;AMR_AF=0.0259;AFR_AF=0.3071;EUR_AF=0.006;SAS_AF=0.0072;AA=g|||;VT=SNP

So here the ancestral allele is g (AA=g) and I need it to be in uppercase so that vcftools recognises it when running the --derived option.

SNP • 1.9k views

I don't understand what this "AA" is. Please show us one line of this VCF.


I have edited my question. Thanks.

3.7 years ago

Using vcffilterjdk (http://lindenb.github.io/jvarkit/VcfFilterJdk.html):

java -jar dist/vcffilterjdk.jar -e 'if(!variant.hasAttribute("AA")) return variant; String AA= variant.getAttributeAsString("AA",""); int pipe=AA.indexOf("|"); if(pipe<0) pipe=AA.length(); AA= AA.substring(0,pipe).toUpperCase()+AA.substring(pipe); return new VariantContextBuilder(variant).attribute("AA",AA).make();'
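
If installing jvarkit is not an option, the same idea can be sketched in plain Python 3 using only the standard library. This is a minimal sketch, not a tested pipeline: the function name uppercase_aa is hypothetical, and it assumes tab-separated records with INFO in the eighth column and an AA value whose allele comes before the first "|", as in the record shown in the question.

```python
import re

# Matches the AA key in the INFO column: "AA=" at the start of INFO or right
# after a ";", followed by the allele (everything up to the first "|" or ";").
# Requiring "^" or ";" in front avoids touching keys that merely end in "AA".
AA_PATTERN = re.compile(r"(^|;)AA=([^|;]*)")


def uppercase_aa(line):
    """Return a VCF line with the allele part of INFO/AA upper-cased.

    Header lines (starting with "#") and lines with fewer than eight
    tab-separated columns are returned unchanged.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line
    # Column 8 (index 7) is INFO; rewrite only the first AA entry.
    fields[7] = AA_PATTERN.sub(
        lambda m: m.group(1) + "AA=" + m.group(2).upper(),
        fields[7],
        count=1,
    )
    return "\t".join(fields) + newline
```

Applied line by line to the decompressed VCF text, this turns AA=g||| into AA=G||| and leaves the rest of each record, including the genotype columns, untouched. The output is plain text and would still need to be compressed and indexed again before use with tools that expect a .vcf.gz file.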
5 weeks ago
qing • 0

Hi spiral01: I want to identify the ancestral allele from my vcf file, but I don't know how to do it. How did you do it?
