Question: Set ancestral alleles to upper case in vcf file
2.9 years ago, spiral01 wrote:

I am trying to set the reference allele to the ancestral allele in 1000 Genomes VCF files. I can do this with the --derived option in vcftools. However, most of the ancestral alleles are in lowercase, and vcftools does not recognise them in that form.

My current approach is to extract the ancestral alleles and convert them to upper case like this:

bcftools view -G -H file.vcf.gz | awk -F'[;=|]' '{for(i=1;i<=NF;i++)if($i=="AA"){print toupper($(i+1));next}}'

I would then have to reinsert them into the VCF file.

This is quite a convoluted way of doing things. Does anyone have a tidier method?


Here is a single entry from the vcf file (with genotype info hidden):

11  128196  rs576393503 A   G   100 PASS    AC=453;AF=0.0904553;AN=5008;NS=2504;DP=5057;EAS_AF=0.0159;AMR_AF=0.0259;AFR_AF=0.3071;EUR_AF=0.006;SAS_AF=0.0072;AA=g|||;VT=SNP

So here the ancestral allele is g (AA=g) and I need it to be in uppercase so that vcftools recognises it when running the --derived option.


I don't understand what this "AA" is. Please show us one line of this VCF.

written 2.9 years ago by Pierre Lindenbaum

I have edited my question. Thanks.

written 2.9 years ago by spiral01
2.9 years ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Using vcffilterjdk, you can upper-case the part of AA before the first "|" in place:

java -jar dist/vcffilterjdk.jar -e 'if(!variant.hasAttribute("AA")) return variant; String AA= variant.getAttributeAsString("AA",""); int pipe=AA.indexOf("|"); if(pipe<0) pipe=AA.length(); AA= AA.substring(0,pipe).toUpperCase()+AA.substring(pipe); return new VariantContextBuilder(variant).attribute("AA",AA).make();'
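
For comparison, the same in-place rewrite can probably be done with awk alone, which avoids the extract-and-reinsert step from the question. This is only a minimal sketch: it assumes a tab-separated VCF with INFO in column 8 and an AA value shaped like the example above (allele first, then optional "|" fields). The function name uppercase_aa and the file names in the usage comment are made up for illustration.

```shell
# Upper-case the allele part of INFO/AA (e.g. AA=g||| -> AA=G|||) in a VCF stream.
# Assumption: tab-separated VCF, INFO in column 8. Header lines pass through untouched.
uppercase_aa() {
  awk 'BEGIN { FS = OFS = "\t" }
  /^#/ { print; next }                      # keep header lines as they are
  {
    n = split($8, info, ";")                # split INFO into KEY=VALUE entries
    for (i = 1; i <= n; i++) {
      if (info[i] ~ /^AA=/) {
        value = substr(info[i], 4)          # text after "AA="
        pipe = index(value, "|")            # 0 if there is no "|"
        if (pipe == 0) pipe = length(value) + 1
        info[i] = "AA=" toupper(substr(value, 1, pipe - 1)) substr(value, pipe)
      }
    }
    out = info[1]                           # re-join the INFO entries with ";"
    for (i = 2; i <= n; i++) out = out ";" info[i]
    $8 = out
    print
  }'
}

# Usage (file names are placeholders):
# bcftools view file.vcf.gz | uppercase_aa | bgzip -c > file.AA_upper.vcf.gz
```

Only the text before the first "|" is upper-cased, mirroring the vcffilterjdk expression; the rest of the AA value and all other INFO entries are left unchanged.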