How to separate the VEP INFO column into separate columns
14 months ago
minoo ▴ 10

I have a VCF file like the one below:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  treatmentSample
chr1    857100  .   C   T   1756.06 PASS    AC=2;AF=1;AN=2;DP=60;ExcessHet=3.0103;FS=0;MLEAC=2;MLEAF=1;MQ=60;QD=29.27;SOR=1.812;CSQ=chr1:857100|T|SNV|ENSG00000228794|ENST00000445118|LINC01128||1|MODIFIER|non_coding_transcript_exon_variant||||5/5|||||||||||||||||| GT:AD:DP:GQ:PL  1/1:0,60:60:99:1770,180,0

Does anyone know how to separate the INFO column into different columns, and how to split the treatmentSample column following the FORMAT order? I tried to use bcftools +split-vep and awk, but they didn't work.

bcftools +split-vep treatmentSample.vcf.gz -f '\t%CHROM\t%POS\t%ID\t%REF\t%ALT\t%QUAL\t%FILTER\t%FORMAT\t%treatmentSample\t%INFO\t% AC\t%AF\t%AN\t% MLEAC\t% MLEAF\t% MQ\t% QD\t%SOR\t%CSQ[\t%GT][\t%GQ][\t%DP][\t%MIN_DP][\t%AD][\t%VAF][\t%PL][\t%MED_DP]\n' -d -A tab > output.vcf

The output was as follows:

Warning: duplicate INFO/CSQ key "CADD_phred"
Note: ambiguous key %AC; using the AC subfield of CSQ, not the INFO/AC tag
Note: ambiguous key %AN; using the AN subfield of CSQ, not the INFO/AN tag
Note: ambiguous key %Diag_Germline_Gene; using the Diag_Germline_Gene subfield of CSQ, not the INFO/Diag_Germline_Gene tag
Could not parse format string: \t%CHROM\t%POS\t%ID\t%REF\t%ALT\t%QUAL\t%FILTER\t%FORMAT\t%treatmentSample\t%INFO\t% AC\t%AF\t%AN\t% MLEAC\t% MLEAF\t% MQ\t% QD\t%SOR\t%Location %Allele %VARIANT_CLASS  %Gene   %Feature    %SYMBOL %CCDS   %STRAND %IMPACT %Consequence    %SIFT   %PolyPhen   %CADD_phred %EXON   %DISTANCE   %COSMIC_ID  %COSMIC_CNT %AC %AN %CADD_phred %gnomAD_exomes_AF   %gnomAD_exomes_NFE_AF   %ExAC_nonTCGA_AF    %ExAC_nonTCGA_NFE_AF    %gnomAD_genomes_AF  %gnomAD_genomes_NFE_AF  %phyloP100way_vertebrate    %clinvar_rs %clinvar_clnsig %clinvar_trait  %clinvar_golden_stars   %Diag_Germline_Gene[\t%GT][\t%GQ][\t%DP][\t%MIN_DP][\t%AD][\t%VAF][\t%PL][\t%MED_DP]\n

I need the output to look something like the table below:

#CHROM  POS ID  REF ALT QUAL    FILTER  AC  AF  AN  DP  ExcessHet   FS  MLEAC   MLEAF   MQ  QD  SOR Location    Allele  VARIANT_CLASS   Gene    Feature SYMBOL  CCDS    STRAND  IMPACT  Consequence SIFT    PolyPhen    CADD_phred  EXON    FORMAT  treatmentSample
chr1    857100  .   C   T   1756.06 PASS    2   1   2   60  30.103  0   2   1   60  29.27   1.812   chr1:857100 T   SNV ENSG00000228794 ENST00000445118 LINC01128       1   MODIFIER    non_coding_transcript_exon_variant              45051   GT:AD:DP:GQ:PL  0/1:39,11:50:99:172,0,1122
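
For illustration, one way to get this layout without bcftools is to split the INFO string on `;` and `=`, split the CSQ value on `|`, and pair the FORMAT keys with the sample values. This is only a minimal Python sketch under stated assumptions: the function names `split_info` and `split_sample` are hypothetical, the `CSQ_FIELDS` order is assumed from the field names listed in the error message above (in a real file it should be read from the `##INFO=<ID=CSQ ...>` header line), and only the first comma-separated CSQ annotation is kept.

```python
# Assumed CSQ subfield order, taken from the first 14 names in the error message.
# In practice, read it from the "Format:" part of the ##INFO=<ID=CSQ ...> header.
CSQ_FIELDS = [
    "Location", "Allele", "VARIANT_CLASS", "Gene", "Feature", "SYMBOL", "CCDS",
    "STRAND", "IMPACT", "Consequence", "SIFT", "PolyPhen", "CADD_phred", "EXON",
]


def split_info(info, csq_fields=CSQ_FIELDS):
    """Turn an INFO string into a dict, expanding CSQ into named subfields."""
    row = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if key == "CSQ":
            # Keep only the first transcript annotation (others are comma-separated).
            first = value.split(",")[0]
            # zip() stops at the shorter list, so extra CSQ subfields are ignored.
            for name, part in zip(csq_fields, first.split("|")):
                row[name] = part
        else:
            # Flag-style tags have no "=value"; record them as "1".
            row[key] = value if sep else "1"
    return row


def split_sample(fmt, sample):
    """Pair FORMAT keys (e.g. GT:AD:DP) with the values of one sample column."""
    return dict(zip(fmt.split(":"), sample.split(":")))


if __name__ == "__main__":
    info = ("AC=2;AF=1;AN=2;DP=60;ExcessHet=3.0103;FS=0;MLEAC=2;MLEAF=1;MQ=60;"
            "QD=29.27;SOR=1.812;CSQ=chr1:857100|T|SNV|ENSG00000228794|"
            "ENST00000445118|LINC01128||1|MODIFIER|"
            "non_coding_transcript_exon_variant||||5/5")
    row = split_info(info)
    row.update(split_sample("GT:AD:DP:GQ:PL", "1/1:0,60:60:99:1770,180,0"))
    # Print a header line and a value line, tab-separated.
    print("\t".join(row))
    print("\t".join(row.values()))
```

Note that the sample-level DP overwrites INFO/DP in the merged dict here because both use the same key; prefixing the sample keys would avoid that, much like the INFO/AC versus CSQ AC ambiguity reported in the notes above.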
vcf bcftools vep • 580 views

I TRIED TO USE bcftools +split-vep AND AWK BUT THEY DIDN'T WORK.

Please read https://meta.stackexchange.com/questions/147616/


Right, sorry. I have added the error.
