Splitting info field in Annovar's multianno.txt file
3.2 years ago

I'm having trouble separating the gnomAD annotation from the original VCF INFO field in my ANNOVAR multianno.txt file. I used bcftools to merge the database annotation into the ANNOVAR input VCF, because otherwise ANNOVAR only outputs the frequency data.

Here are some examples of the entries in the column I'm having trouble with. Columns are tab-separated, so I am essentially trying to insert a tab at a specific point in each of these entries.


CONTQ=93;DP=555;ECNT=4;MBQ=30,20,30;MFRL=181,172,212;MMQ=60,60,60;MPOS=6,21;OCM=0;POPAF=2.4,2.4;SEQQ=93;STRANDQ=93;TLOD=19.94,1805.91;qual=-10;filters=artifact_prone_site;*(etc etc etc etc)*

CONTQ=93;DP=801;ECNT=5;MBQ=30,10;MFRL=190,230;MMQ=60,60;MPOS=18;OCM=0;POPAF=2.4;SEQQ=2;STRANDQ=1;TLOD=3.34;qual=-10;filters=npg;*(etc etc etc)*

CONTQ=93;DP=812;ECNT=5;MBQ=30,20;MFRL=191,310;MMQ=60,60;MPOS=13;OCM=0;POPAF=2.4;SEQQ=1;STRANDQ=1;TLOD=0.024

Everything to the right of the TLOD= entry is gnomAD data. As you can see, sometimes there is no gnomAD entry, and sometimes TLOD= has multiple values, so I'm struggling to craft an effective regex in sed/awk.

Is there a simple programmatic way to do this? Or better yet, is there a way to get bcftools to keep the gnomAD data separate from the original INFO tags before the file goes through ANNOVAR?

This is my bcftools command:

bcftools annotate --force -a ./db.vcf.gz -c INFO ./input.vcf.gz > ./output.vcf

You could try to standardize the INFO fields in your VCF file before annotating it with ANNOVAR, so that every record lists the same tags in the same order. Maybe something like

bcftools query -f '%CHROM\t%POS\t%ID\t%REF\t%ALT\t%QUAL\t%FILTER\tCONTQ=%CONTQ;DP=%DP;ECNT=%ECNT;MBQ=%MBQ;MFRL=%MFRL;MMQ=%MMQ;MPOS=%MPOS;OCM=%OCM;POPAF=%POPAF;SEQQ=%SEQQ;STRANDQ=%STRANDQ;TLOD=%TLOD;qual=%qual;filters=%filters;\n' input.vcf >> output.vcf
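
If the goal is only to split the existing multianno.txt column, a small awk filter may be enough. The following is a minimal sketch, not a tested pipeline: it assumes the TLOD tag is always the last tag before the gnomAD data (as in the examples in the question), and `split_tlod` is a hypothetical helper name. It looks for TLOD= in each tab-separated column, keeps the whole TLOD value (including comma-separated values), and replaces the following ";" with a tab. Rows without gnomAD data get an empty extra column, so the column count stays consistent.

```sh
# Sketch: insert a tab after the TLOD value in whichever column holds it.
# split_tlod is a hypothetical helper name; it reads stdin and writes stdout.
split_tlod() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    for (i = 1; i <= NF; i++) {
      # Prefix ";" so a TLOD tag at the very start of the field is found too.
      p = index(";" $i, ";TLOD=")
      if (p == 0) continue
      rest = substr($i, p)          # text from "TLOD=" onward
      q = index(rest, ";")          # first ";" after the TLOD value, 0 if none
      if (q == 0) {
        $i = $i OFS                 # no gnomAD data: add an empty column
      } else {
        $i = substr($i, 1, p + q - 2) OFS substr($i, p + q)
      }
      break
    }
    print
  }'
}

# Demo on two made-up rows, one with trailing gnomAD tags and one without.
printf 'chr1\tDP=555;TLOD=19.94,1805.91;qual=-10;filters=artifact_prone_site\n' | split_tlod
printf 'chr1\tDP=812;TLOD=0.024\n' | split_tlod
```

On a real file this would be run as something like `split_tlod < sample.multianno.txt > sample.split.txt` (both file names are placeholders). Lines that contain no TLOD= tag, such as the header line, pass through unchanged, so the header will have one column fewer than the data rows unless it is adjusted by hand.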
