Question

intersection of multiple VCF files

5.0 years ago

JoeDoasi ▴ 10

Hello,

While doing variant assessment on patient exomes (~16 VCF files, 16 patients), I found that some variants are predicted to be pathogenic even though the phenotype is not associated with these variants.

I also found that some variants predicted to be pathogenic occur in more than one patient, so I'm thinking of intersecting all the VCF files and using the file containing the common variants in my variant assessment workflow.

Is this approach OK?
I want to build up a database for the common variants in our population, will this strategy help?
What are the recommended tools?

I appreciate your help!

next-gen sequencing SNP genome gene • 2.4k views

updated 5.0 years ago by emeline.a.favreau ▴ 30 • written 5.0 years ago by JoeDoasi ▴ 10

score 1 · Answer 1 · 2019-05-01

5.0 years ago

emeline.a.favreau ▴ 30

As I understand it, your current question is: is there any variant predicted to be pathogenic?

Looking at variants that are common to all patients is a good way of investigating your dataset at the population level. You will filter out variants that are found in only a few patients, which will remove some noise from your data. You will (hopefully) be left with enough common variants to draw some conclusions.

I would use bcftools isec; see also: Intersect multiple VCF files.

I would test the tool with only 2 VCFs just to make sure that the output is what I wanted, before running the command for the 16 VCFs.
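To make it easier to check that the output is what you wanted, here is a minimal Python sketch of the idea behind the intersection. It is not how bcftools isec is implemented. It assumes plain-text, tab-separated VCF lines and treats two records as the same variant only when CHROM, POS, REF and ALT match exactly. The names `read_variants` and `shared_variants` are made up for this illustration.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical key for this sketch: (CHROM, POS, REF, ALT), ALT kept as written.
Variant = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from VCF text lines, skipping header lines."""
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        variants.add((chrom, int(pos), ref, alt))
    return variants


def shared_variants(variant_sets: List[Set[Variant]], min_files: int) -> Set[Variant]:
    """Return the variants that occur in at least `min_files` of the sets."""
    counts: Counter = Counter()
    for variant_set in variant_sets:
        counts.update(variant_set)
    return {variant for variant, n in counts.items() if n >= min_files}


# Toy data standing in for three patient VCFs (invented records).
header = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT"]
vcf_a = header + ["chr1\t100\t.\tA\tG", "chr1\t200\t.\tC\tT"]
vcf_b = header + ["chr1\t100\t.\tA\tG", "chr2\t50\t.\tG\tA"]
vcf_c = header + ["chr1\t100\t.\tA\tG", "chr1\t200\t.\tC\tT"]

sets = [read_variants(v) for v in (vcf_a, vcf_b, vcf_c)]
in_all = shared_variants(sets, min_files=3)      # present in every file
in_most = shared_variants(sets, min_files=2)     # present in at least two files
print(sorted(in_all))
print(sorted(in_most))
```

Starting with two small files, as suggested above, lets you compare the tool's output against a hand count like this one before scaling up to all 16 VCFs.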

Good luck with your filtering!


A general word of caution before embarking on a bcftools journey: run bcftools norm -m -any to split multi-allelic sites and, if possible, left-align and normalize indels (this needs the reference FASTA, passed with -f). It saves you a lot of headaches downstream when dealing with partial overlaps at multi-allelic variants and "indels" at tandem repeat loci.
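A toy Python illustration of the partial-overlap problem, under the assumption that records are compared by exact CHROM, POS, REF and ALT. The helper `split_alleles` is hypothetical and only splits the ALT field on commas. It does not left-align or trim alleles, so it is no substitute for bcftools norm.

```python
from typing import List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def split_alleles(record: Variant) -> List[Variant]:
    """Split one record with a comma-separated ALT into one key per ALT allele."""
    chrom, pos, ref, alt = record
    return [(chrom, pos, ref, allele) for allele in alt.split(",")]


# Invented example: patient 1 carries a multi-allelic record, patient 2 only A>G.
patient1 = [("chr1", 100, "A", "G,T")]
patient2 = [("chr1", 100, "A", "G")]

# Comparing whole records: "G,T" and "G" differ, so the shared A>G is missed.
whole_overlap: Set[Variant] = set(patient1) & set(patient2)

# Comparing after splitting: the shared A>G allele is recovered.
split1 = {key for record in patient1 for key in split_alleles(record)}
split2 = {key for record in patient2 for key in split_alleles(record)}
split_overlap = split1 & split2

print(whole_overlap)
print(split_overlap)
```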

5.0 years ago by Ram 43k

Thank you RamRS. I will consider it in the coming analyses.

5.0 years ago by JoeDoasi ▴ 10

Thank you Emeline for your reply. I used vcftools for that, but now I will use bcftools and compare both outputs.

5.0 years ago by JoeDoasi ▴ 10