Strategies with annotated VCF files to filter very big VCF from WGS
12 weeks ago
randyOrlando ▴ 20

Greetings,

I have two types of large VCF files from WGS, and I am trying to work out how to approach variant prioritization:

  • Large VCFs without annotations but containing all samples, around 50 GB each;
  • Much smaller VCFs with annotations from VEP, 500 MB at most.

For variant prioritization, should I select variants according to their annotations, then use the selected variants with tools like bcftools/vcftools to filter the big VCF, and finally check for the presence of those variants in the sample data?
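
As a rough sketch of the strategy described above, the selection step could look like the following. It assumes the small VCF carries VEP annotations in the standard `CSQ` INFO field and that selection is by the `IMPACT` sub-field; the impact levels chosen, the function names, and the demo records are all hypothetical.

```python
import re

def csq_fields(header_lines):
    """Return the CSQ sub-field names declared in the VEP header line."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ,"):
            match = re.search(r'Format: ([^"]+)"', line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no CSQ header found; is this a VEP-annotated VCF?")

def select_sites(vcf_lines, wanted_impacts=("HIGH", "MODERATE")):
    """Yield (chrom, pos) for records with at least one wanted IMPACT value."""
    wanted = set(wanted_impacts)
    header, impact_idx = [], None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            header.append(line)
            continue
        if impact_idx is None:
            impact_idx = csq_fields(header).index("IMPACT")
        cols = line.split("\t")
        chrom, pos, info = cols[0], cols[1], cols[7]
        # CSQ holds one comma-separated entry per transcript/allele pair.
        csq = next((kv[4:] for kv in info.split(";") if kv.startswith("CSQ=")), "")
        for entry in csq.split(","):
            parts = entry.split("|")
            if len(parts) > impact_idx and parts[impact_idx] in wanted:
                yield chrom, int(pos)
                break

def to_regions_text(sites):
    """Format sites as two-column CHROM<TAB>POS text (1-based positions)."""
    return "".join(f"{chrom}\t{pos}\n" for chrom, pos in sites)

# Hypothetical demo records, not real data.
DEMO_ANNOTATED_VCF = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t.\tPASS\tCSQ=G|missense_variant|MODERATE|GENE1",
    "1\t2000\t.\tC\tT\t.\tPASS\tCSQ=T|intron_variant|MODIFIER|GENE1",
    "2\t3000\t.\tG\tA\t.\tPASS\tCSQ=A|stop_gained|HIGH|GENE2,A|intron_variant|MODIFIER|GENE3",
]

if __name__ == "__main__":
    print(to_regions_text(select_sites(DEMO_ANNOTATED_VCF)), end="")
```

If the output is saved as, say, `sites.tsv`, and the big VCF is bgzipped and indexed, something like `bcftools view -R sites.tsv big.vcf.gz -Oz -o subset.vcf.gz` should pull out the matching records with their genotypes. Note that this matches on position only, so alleles would still need to be checked afterwards.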

Are there better strategies or tools for the task at hand?

Thank you for your help/interest

WGS VCF whole-genome-sequencing • 450 views

Your VCF files are not comparable if they don't both contain the same set of samples. Annotations are locus-specific, so the presence of VEP annotations won't help much in deciding which VCF to use. What would help is knowing the end goal of your prioritization task: what are you trying to achieve or optimize?


Sorry, I should have specified it further. The small VCF files with annotations come from the big VCF files. I guess samples were dropped from the small annotated VCF to reduce space. I want to perform variant prioritization to calculate polygenic risk scores. Thanks!


If you want to perform PRS calculations, you need the genotypes for the variants found to be significant for your trait of interest. Their annotation should not be a concern; what matters is whether you retain the variants listed in the scoring files.

Take a look at the PGS Catalog's workflow: https://pgsc-calc.readthedocs.io/en/latest/
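
To make the point about genotypes concrete, here is a toy sketch of what a score boils down to: for each variant in a scoring file, count the copies of the effect allele in each sample and add weight × dosage. This is not how the workflow linked above is implemented. The column names follow the PGS Catalog scoring file layout as I understand it (`chr_name`, `chr_position`, `effect_allele`, `effect_weight`), and the function names and demo data are hypothetical.

```python
def parse_scoring(lines):
    """Map (chrom, pos) -> (effect_allele, weight) from a tab-delimited scoring file."""
    rows = [l.rstrip("\n") for l in lines if l.strip() and not l.startswith("#")]
    header = rows[0].split("\t")
    weights = {}
    for row in rows[1:]:
        rec = dict(zip(header, row.split("\t")))
        key = (rec["chr_name"], int(rec["chr_position"]))
        weights[key] = (rec["effect_allele"], float(rec["effect_weight"]))
    return weights

def score_samples(vcf_lines, weights):
    """Return ({sample: score}, scoring variants that were not matched in the VCF)."""
    samples, scores, seen = [], {}, set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]
            scores = {s: 0.0 for s in samples}
            continue
        key = (cols[0], int(cols[1]))
        if key not in weights:
            continue
        effect_allele, weight = weights[key]
        alleles = [cols[3]] + cols[4].split(",")  # index 0 = REF, 1.. = ALT
        if effect_allele not in alleles:
            continue  # allele mismatch: treated as unmatched in this sketch
        seen.add(key)
        effect_idx = str(alleles.index(effect_allele))
        gt_pos = cols[8].split(":").index("GT")
        for sample, call in zip(samples, cols[9:]):
            gt = call.split(":")[gt_pos].replace("|", "/").split("/")
            # Missing calls ('.') never equal the allele index, so they add 0.
            scores[sample] += weight * gt.count(effect_idx)
    return scores, set(weights) - seen

# Hypothetical demo data, not a real score.
DEMO_SCORING = [
    "#pgs_id=HYPOTHETICAL",
    "chr_name\tchr_position\teffect_allele\tother_allele\teffect_weight",
    "1\t1000\tG\tA\t0.5",
    "2\t3000\tC\tT\t-0.25",
    "3\t500\tT\tC\t0.1",
]
DEMO_GENOTYPE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1",
    "2\t3000\t.\tC\tT\t.\tPASS\t.\tGT:DP\t0|0:30\t0|1:28",
]

if __name__ == "__main__":
    scores, unmatched = score_samples(DEMO_GENOTYPE_VCF, parse_scoring(DEMO_SCORING))
    print(scores)
    print("unmatched scoring variants:", sorted(unmatched))
```

The sketch ignores genome build, strand flips, and how missing genotypes should be handled, which is why a dedicated workflow is the safer choice for real data. It does show why the annotated file with samples dropped cannot be used on its own: without the sample columns there is nothing to score.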


I don't know about PRS calculations, but you should really look into how the smaller and larger files are related (ideally, be able to reproduce the extraction process) before you move on to working with these files.
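
One way to start that check is to compare the sample lists and the sets of sites in the two files. The sketch below does this on in-memory lines; the function names and demo records are hypothetical, and it assumes both files represent variants the same way (for example, multi-allelic sites are either split in both or in neither).

```python
def summarize(vcf_lines):
    """Return (sample names, set of (chrom, pos, ref, alt) site keys)."""
    samples, sites = [], set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # empty if the file has no genotype columns
        else:
            sites.add((cols[0], int(cols[1]), cols[3], cols[4]))
    return samples, sites

def compare(small_lines, big_lines):
    """Report how the samples and sites of two VCFs overlap."""
    small_samples, small_sites = summarize(small_lines)
    big_samples, big_sites = summarize(big_lines)
    return {
        "samples_only_in_big": sorted(set(big_samples) - set(small_samples)),
        "samples_only_in_small": sorted(set(small_samples) - set(big_samples)),
        "shared_sites": len(small_sites & big_sites),
        "sites_only_in_big": len(big_sites - small_sites),
        # Sites here would mean the small file is NOT a plain subset of the big one.
        "sites_only_in_small": sorted(small_sites - big_sites),
    }

# Hypothetical demo records, not real data.
DEMO_SMALL = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0/1",
    "2\t3000\t.\tC\tT\t.\tPASS\t.\tGT\t0/0",
    "4\t42\t.\tG\tA\t.\tPASS\t.\tGT\t1/1",
]
DEMO_BIG = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\t0/0",
    "1\t2000\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\t0/0",
    "2\t3000\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1",
]

if __name__ == "__main__":
    for key, value in compare(DEMO_SMALL, DEMO_BIG).items():
        print(f"{key}: {value}")
```

For real files you could pass `gzip.open(path, "rt")` instead of a list, but holding every site of a 50 GB VCF in a Python set will likely be too memory-hungry. In that case `bcftools query -l` (sample names) and `bcftools isec` (site intersections) are probably the more practical route to the same answers.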
