Updating allele frequency (AF) and minor allele frequency (MAF) INFO fields in .vcf
3
0
Entering edit mode
6.8 years ago
JSEM ▴ 30

Hi everyone,

I'm processing .vcf files for the first time, and am hoping for some advice with something. I recently filtered a set of .VCFs for a dataset to exclude around half of the subjects, but now want to update the INFO columns in me output files to reflect the stats of the new filtered dataset.

I've tried using fill-an-ac from vcftools, which has successfully updated the AN/AC fields, but this doesn't updated the allele frequency/minor allele frequency fields. I know that I could use the AN/AC values to derive AF/MAF, but I was just wondering if there is an option in one of the existing tools to update these fields automatically, similar to the function of fill-an-ac?

I'd really appreciate any suggestions you might have.

Thanks in advance!

vcf vcftools imputation QC SNP • 4.6k views
ADD COMMENT
1
Entering edit mode
6.8 years ago

using vcffilterjdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

use awk to insert the VCF headers AF and MAF if they're missing.

in vcffilterjdk: collect AN and AC for each alt allele. Map each AC to AF by dividing AC by AN. set the Attribute "AF", get the min (AF) and the set attribute "MAF".

$ gunzip -c  src/test/resources/rotavirus_rf.vcf.gz |\
awk '/^#CHROM/ {printf("##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Min Allele Frequency\">\n##INFO=<ID=AF,Number=A,Type=Float,Description=\"Allele Frequency\">\n");} {print}' |\
java -jar dist/vcffilterjdk.jar -e 'VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<Float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();'
ADD COMMENT
0
Entering edit mode

Thanks for the quick reply. This approach seems like it should be quite straightforward, difficulty is that I'm running this on an HPC cluster, where running java programs is not very straightforward given permissions...etc and modules available on HPC.

Are there any other approaches that are non-java based, or workarounds for HPCs? Thanks again!

ADD REPLY
0
Entering edit mode

, where running java programs is not very straightforward given permissions.

why java and not python or whatever ? This program use streaming = doesn't require much memory.

ADD REPLY
0
Entering edit mode
5.8 years ago
GK1610 ▴ 120

when I run this , I get the following error

VCF=unzipped_vcf_file awk '/^#CHROM/ {printf("##INFO=<id=maf,number=1,type=float,description=\"min allele="" frequency\"="">\n##INFO=<id=af,number=a,type=float,description=\"allele frequency\"="">\n");} {print}' $VCF |\ java -jar /sc/orga/projects/psychencode/kiran/jvarkit/dist/vcffilterjdk.jar -e 'VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make();' > temp.vcf

[DEBUG][VcfFilterJdk] Compiling : 1 import java.util.; 2 import java.util.stream.; 3 import java.util.function.; 4 import htsjdk.samtools.util.; 5 import htsjdk.variant.variantcontext.; 6 import htsjdk.variant.vcf.; 7 @javax.annotation.Generated(value="VcfFilterJdk",date="2019-02-21T17:30:17-0500") 8 public class VcfFilterJdkCustom1164653960 extends com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.AbstractFilter { 9 public VcfFilterJdkCustom1164653960(final VCFHeader header) { 10 super(header); 11 } 12 @Override 13 public Object apply(final VariantContext variant) { 14 /* user's code starts here */ 15 VariantContextBuilder vcb = new VariantContextBuilder(variant); float ac = variant.getAttributeAsInt("AN",0); if(ac>0) { List<float> af = variant.getAttributeAsIntList("AC",0).stream().map(N->N/ac).collect(Collectors.toList());vcb.attribute("AF",af);vcb.attribute("MAF",af.stream().mapToDouble(X->X.floatValue()).min().orElse(-1.0) );} return vcb.make(); 16 /* user's code ends here */ 17 } 18 } [WARN][SnpEffPredictionParser]no INFO[EFF] or no description. This VCF was probably NOT annotated with SnpEff (old version). But it's not a problem if this tool doesn't need to access SnpEff Annotations. [WARN][VepPredictionParser]NO INFO[CSQ] found in header. This VCF was probably NOT annotated with VEP. But it's not a problem if this tool doesn't need to access VEP Annotations. [WARN][AnnPredictionParser]no INFO[ANN] or no description This VCF was probably NOT annotated with SnpEff(ANN version) . But it's not a problem if this tool doesn't need to access SnpEff Annotations. [SEVERE][Launcher]Contig 22 does not have a length field. htsjdk.tribble.TribbleException: Contig 22 does not have a length field. at htsjdk.variant.vcf.VCFContigHeaderLine.getSAMSequenceRecord(VCFContigHeaderLine.java:80) at htsjdk.variant.vcf.VCFHeader.getSequenceDictionary(VCFHeader.java:206) at com.github.lindenb.jvarkit.util.picard.SAMSequenceDictionaryProgress.<init>(SAMSequenceDictionaryProgress.java:259) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.doVcfToVcf(VcfFilterJdk.java:800) at com.github.lindenb.jvarkit.util.jcommander.Launcher.doVcfToVcf(Launcher.java:563) at com.github.lindenb.jvarkit.util.jcommander.Launcher.doVcfToVcf(Launcher.java:583) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.doWork(VcfFilterJdk.java:826) at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:736) at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:894) at com.github.lindenb.jvarkit.tools.vcffilterjs.VcfFilterJdk.main(VcfFilterJdk.java:841) [INFO][Launcher]vcffilterjdk Exited with failure (-1)

ADD COMMENT
0
Entering edit mode

Contig 22 does not have a length field.

there is an error in your header ##contig=<ID=22...> ...

ADD REPLY
0
Entering edit mode
5.6 years ago
seta ★ 1.9k

hi all I have a problem similar JSEM, which AF doesn't updated , is there any other command.

thanks in advance

ADD COMMENT

Login before adding your answer.

Traffic: 1858 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6