I am working through the GATK Best Practices pipeline and have reached the point of joint calling the VCFs across all samples in my cohort. I intend to do a germline analysis.
I may be overthinking this, but I have these questions:
- I only want to screen variants in certain genes. Do I still need to joint call VCFs that contain variants from all other regions as well as my genes of interest? In other words, can I extract just the variants from the chromosomal locations that span my genes of interest and joint call those VCFs? That would probably be quicker.
- Or do I need to joint call the entire VCF across my cohort, take it through VQSR, and only then extract the variants in my regions of interest? This seems like overkill, but I am not sure whether I would miss anything if I strictly do the selective extraction described above.
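To make the first option concrete, here is a rough sketch of what I have in mind. It pads each gene's coordinates, merges overlapping regions, writes them as BED lines, and builds a GenotypeGVCFs command restricted to those regions with GATK's `-L` interval argument. The gene names, coordinates, padding value, and file names are all placeholders I made up for illustration, and I have not confirmed that restricting the call this way is appropriate before VQSR.

```python
from collections import defaultdict

# Hypothetical gene coordinates (chrom, 0-based start, end, name).
# These are placeholders, not real annotations.
GENES = [
    ("chr1", 1000, 2000, "GENE_A"),
    ("chr1", 2050, 3000, "GENE_B"),
    ("chr2", 500, 900, "GENE_C"),
]


def padded_merged_intervals(genes, padding=100):
    """Pad each gene by `padding` bases and merge overlapping regions per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end, _name in genes:
        by_chrom[chrom].append((max(0, start - padding), end + padding))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Overlaps the running interval, so extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def to_bed(intervals):
    """Render intervals as tab-separated BED text."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in intervals)


def genotype_command(bed_path,
                     reference="reference.fasta",   # placeholder file name
                     variants="cohort.g.vcf.gz",    # placeholder file name
                     output="genes_only.vcf.gz"):   # placeholder file name
    """Build (but do not run) a GenotypeGVCFs command limited to the BED regions."""
    return ["gatk", "GenotypeGVCFs",
            "-R", reference,
            "-V", variants,
            "-L", bed_path,
            "-O", output]
```

The idea would be to write `to_bed(padded_merged_intervals(GENES))` to a file such as `genes.bed` and pass that path to `genotype_command`.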
I would rather not spend computation time and resources if it is not necessary. Any guidance is greatly appreciated!
Thank you in advance for your time and your help!