VEP annotation on millions of variants
1
0
Entering edit mode
21 months ago
Pac314 ▴ 10

I am trying to run VEP annotation on around 8 million small variants, but the process takes around 10 hours using 14 threads and 64 GB of RAM. Is there a feasible way to reduce the run time? I have read that skipping HGVS nomenclature annotation speeds VEP up, but I am reluctant to drop it because HGVS notation is considered a standard requirement for clinical reporting of variants.

variants VCF annotation VEP
21 months ago
  • Convert your VCF into a BED file of regions (e.g. "Mince a VCF into n bins of a given range").
  • Annotate each region with bcftools view --regions-file the.bed | vep, running the regions in parallel with your favourite workflow manager (Snakemake, Nextflow...).
  • Merge all the annotated VCFs with bcftools concat.
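
The three steps above can be sketched roughly as follows. This is only an illustration, not a tested pipeline: the contig lengths, file names (input.vcf.gz, ref.fa, bin0.bed...) and the VEP options chosen (--cache, --offline, --fasta) are assumptions about your setup, and the helper functions are hypothetical names. It only builds the BED bins and the command lines; running them is left to the workflow manager. Note that bcftools view --regions-file needs a bgzipped and indexed VCF, and that a record spanning a bin boundary may be picked up by both neighbouring bins, so it is worth checking for duplicates after merging.

```python
from typing import Dict, List, Tuple

# (contig, 0-based start, end-exclusive), i.e. BED coordinates
Region = Tuple[str, int, int]


def make_bins(contig_lengths: Dict[str, int], bin_size: int) -> List[Region]:
    """Cut every contig into consecutive, non-overlapping intervals."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    bins: List[Region] = []
    for contig, length in contig_lengths.items():
        for start in range(0, length, bin_size):
            bins.append((contig, start, min(start + bin_size, length)))
    return bins


def bed_line(region: Region) -> str:
    """Format one region as a tab-separated BED line."""
    contig, start, end = region
    return f"{contig}\t{start}\t{end}\n"


def annotate_command(vcf: str, bed: str, out_vcf: str) -> str:
    """Build the 'bcftools view | vep' pipe for one bin.

    VEP reads standard input when no input file is given. The reference
    path and cache options are placeholders (assumptions), not advice.
    """
    return (
        f"bcftools view --regions-file {bed} {vcf} | "
        f"vep --format vcf --vcf --cache --offline --hgvs --fasta ref.fa "
        f"--output_file {out_vcf}"
    )


def concat_command(parts: List[str], out_vcf: str) -> str:
    """Build the final merge; parts must be listed in genomic order."""
    return f"bcftools concat --output {out_vcf} " + " ".join(parts)


if __name__ == "__main__":
    # Toy contig lengths, purely for illustration.
    bins = make_bins({"chr1": 250, "chr2": 100}, bin_size=100)
    parts = []
    for i, region in enumerate(bins):
        print(bed_line(region), end="")  # content of bin{i}.bed
        print(annotate_command("input.vcf.gz", f"bin{i}.bed", f"part{i}.vcf"))
        parts.append(f"part{i}.vcf")
    print(concat_command(parts, "annotated.vcf"))
```

Each annotate command is independent of the others, which is what lets Snakemake or Nextflow schedule them side by side.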

Thanks for your reply. I will try to mince the VCF and annotate the regions in parallel. How many threads and how much memory do you think would be suitable for parallel annotation of these variants?

"How many threads and memory do you think would be suitable for parallel annotation of these variants?"

42 ...

Otherwise, it depends on your infrastructure (cluster, cores, memory...).
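
As a back-of-the-envelope illustration of "it depends", here is a minimal sketch of how one might work out how many VEP jobs fit on a machine at once. The 14 cores and 64 GB come from the question; the per-job figures (2 forks via VEP's --fork option, 8 GB of RAM) are made-up placeholders that would need to be measured on a real bin, and concurrent_jobs is a hypothetical helper.

```python
def concurrent_jobs(total_cores: int, total_ram_gb: float,
                    forks_per_job: int, ram_per_job_gb: float) -> int:
    """Number of jobs that fit at once, limited by cores or by RAM."""
    if forks_per_job <= 0 or ram_per_job_gb <= 0:
        raise ValueError("per-job requirements must be positive")
    by_cores = total_cores // forks_per_job
    by_ram = int(total_ram_gb // ram_per_job_gb)
    # Whichever resource runs out first sets the limit.
    return max(0, min(by_cores, by_ram))


if __name__ == "__main__":
    # 14 cores / 64 GB are from the question; 2 forks / 8 GB are assumptions.
    print(concurrent_jobs(14, 64, forks_per_job=2, ram_per_job_gb=8))
```

Measuring the peak memory of a single bin first and then plugging that number in is likely more reliable than guessing.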
