VEP annotated VCF conversion into TSV/Text
16 months ago
Nai ▴ 50

A VEP-annotated VCF has the columns CHROM, POS, REF, ALT, INFO, and FORMAT (genotype). I would like to convert this VCF into a tab-delimited file in which the annotations packed into the INFO column are split into separate tab-delimited columns.

For example: Chr Pos ID Ref Alt Filter INFO ALLELE Consequence IMPACT SYMBOL Gene Feature_type Feature BIOTYPE EXON INTRON HGVS HGVSp cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation ALLELE_NUM DISTANCE STRAND FLAGS VARIANT CADD etc. Every column value should end up in its own tab-separated column.

R bcftools VEP VCF • 2.2k views

You have tagged this with bcftools already. Have you read the documentation for the split-vep plugin?


When I run bcftools, it shows only the columns CHROM, REF, ALT, etc. I cannot get it to split out the other columns with headers such as ALLELE, VARIANT, FEATURE_type, PROTEIN, gnomADg_AF, ...


I tried this: bcftools +split-vep test/split-vep.vcf -f '%CHROM:%POS %CSQ\n' -d -A tab

But this command does not print any header information. I also need a header that shows which column each value belongs to.


The columns are in the same order in which you specified them. Multiple values will be delimited by a comma unless you specify --duplicate, which generates one line per transcript.
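
As a sketch of the reply above: because the output columns follow the -f expression, naming the CSQ subfields explicitly makes it easy to write a matching header row by hand. The function name split_vep_named is hypothetical, and the subfields Consequence, IMPACT and SYMBOL are assumed to be present in the file's CSQ annotation (bcftools +split-vep -l lists what is actually available). Note that -d is the short form of --duplicate.

```shell
# Hypothetical wrapper, not part of bcftools: prints a hand-written header,
# then one tab-delimited line per transcript consequence.
split_vep_named() {
  # The header must mirror the -f expression below, column for column.
  printf 'CHROM\tPOS\tREF\tALT\tConsequence\tIMPACT\tSYMBOL\n'
  # -d (--duplicate) emits one line per consequence instead of comma-joined values.
  # Consequence, IMPACT and SYMBOL are assumed subfield names; check with -l.
  bcftools +split-vep "$1" -d \
    -f '%CHROM\t%POS\t%REF\t%ALT\t%Consequence\t%IMPACT\t%SYMBOL\n'
}
```

It could then be called as: split_vep_named test/split-vep.vcf > out.tsv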


Should I apply it like this? bcftools +split-vep test/split-vep.vcf -f -duplicate '%CHROM:%POS %CSQ\n' -d -A tab
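
If every CSQ subfield is wanted together with a header row, one alternative that does not depend on split-vep is to read the subfield names from the ##INFO=<ID=CSQ,...> header line, where VEP lists them after "Format: ". This is a minimal awk sketch, not a replacement for the plugin. It assumes an uncompressed VCF and that no subfield value contains a literal "," or "|". The function name vep_csq_to_tsv is hypothetical.

```shell
# Hypothetical helper, not part of bcftools or VEP: flatten the CSQ INFO tag
# of an uncompressed VEP-annotated VCF into a TSV with a header row.
vep_csq_to_tsv() {
  awk -F'\t' -v OFS='\t' '
    # CSQ definition line: keep only the field names listed after "Format: ".
    /^##INFO=<ID=CSQ,/ {
      fmt = $0
      sub(/.*Format: /, "", fmt)   # drop everything up to the field list
      sub(/".*/, "", fmt)          # drop the closing quote and bracket
      nfields = split(fmt, names, "|")
      next
    }
    /^##/ { next }                 # skip the remaining meta lines
    /^#CHROM/ {
      # Header row: fixed VCF columns followed by the CSQ subfield names.
      hdr = "CHROM" OFS "POS" OFS "ID" OFS "REF" OFS "ALT" OFS "FILTER"
      for (i = 1; i <= nfields; i++) hdr = hdr OFS names[i]
      print hdr
      next
    }
    {
      # Find the CSQ entry among the semicolon-separated INFO keys.
      csq = ""
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^CSQ=/) csq = substr(info[i], 5)
      if (csq == "") next          # variant without annotation
      # One output row per comma-separated consequence record.
      m = split(csq, recs, ",")
      for (j = 1; j <= m; j++) {
        k = split(recs[j], vals, "|")
        row = $1 OFS $2 OFS $3 OFS $4 OFS $5 OFS $7
        for (i = 1; i <= nfields; i++) row = row OFS (i <= k ? vals[i] : "")
        print row
      }
    }
  ' "$1"
}
```

It could then be called as: vep_csq_to_tsv annotated.vcf > annotated.tsv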

16 months ago
yussab ▴ 90

Hi Nai,

There are several ways to achieve this. If you're familiar with R, you can use the script below. Please let me know if you find it useful, and remember to mark the question as answered if it works :) Best regards, Youssef

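#START

library(vcfR)
library(VariantAnnotation)  # provides readVcf()
library(ensemblVEP)         # provides parseCSQToGRanges()
library(dplyr)

# Input VEP-annotated VCF and working directory
vep_file <- "VCF_vep.vcf"
vep_workdir <- "~/results/02_vep/"
setwd(vep_workdir)
# Placeholder output name (not given in the original script); adjust as needed
output_path <- "VCF_vep_annotated.csv"

# Read the VCF with vcfR: header lines, fixed columns, and genotypes
vep_vcfr <- read.vcfR(vep_file, verbose = FALSE)
vep_header <- data.frame(vep_vcfr@meta)
vep_variants <- data.frame(vep_vcfr@fix)
vep_gt <- data.frame(vep_vcfr@gt)

# Parse CSQ into a GRanges and include the 'VCFRowID' column,
# which maps each consequence back to its variant row.
vep_ens <- readVcf(vep_file, "hg19")
csq_vep <- parseCSQToGRanges(vep_ens, VCFRowID = rownames(vep_ens))
csq_vep <- data.frame(csq_vep)

# CSQ has one row per consequence, so repeat the variant and
# genotype rows to match before binding the columns together.
idx <- csq_vep$VCFRowID
VEP <- cbind.data.frame(vep_variants[idx, ], csq_vep, vep_gt[idx, , drop = FALSE])
write.csv(VEP, file = output_path, row.names = FALSE)

#END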

I have Bioconductor version 3.16 (BiocManager 1.30.19) and R 4.2.2 Patched (2022-11-10 r83330). When I try to install ensemblVEP, it is reported as not supported for the latest version. Could you please suggest an alternative?


Use conda to handle the whole installation.


Hi Nai, I suggest you do the installation with conda and then try this script; it'll save you a lot of time ;) I've already tried most of the other methods, including a script that I wrote myself.
