Downstream analysis on multi-sample or single-sample VCF files?

3.4 years ago

NGSCanBioinf ▴ 10

Hello,

I use the GATK Best Practices in my analysis (mainly the DNA-seq pipeline). As suggested, the pipeline genotypes all the samples together and produces a single "allSamples.vcf.gz" file at the end.

At this stage, one approach would be to perform filtering (e.g. removing low-read-depth variants) and annotation (e.g. gnomAD, CADD) directly on this multi-sample VCF. Or is it better to first split this VCF into single-sample VCF files and do the downstream analysis on those?

One issue I see with the first approach is that, for any given variant, some samples may have sufficient read depth while others do not, so it comes down to choosing between "variant-specific" filters (applied to the whole site) and "sample-specific" filters (applied to each sample's genotype). I would appreciate your feedback or suggestions on this.
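
To make the distinction concrete, here is a minimal Python sketch of the two filter types on a single multi-sample record. It is only an illustration, not part of the GATK pipeline: the function names, the threshold of 10 reads, and the example record are all made up, and it assumes every record lists GT and DP in its FORMAT column and that a missing DP counts as low depth.

```python
MIN_DP = 10  # hypothetical threshold, chosen only for illustration


def sample_depth(sample_field, dp_idx):
    """Return the DP value of one sample column, or None if it is missing."""
    values = sample_field.split(":")
    if dp_idx >= len(values) or values[dp_idx] == ".":
        return None
    return int(values[dp_idx])


def site_passes(record, min_dp=MIN_DP):
    """Variant-specific filter: keep the site only if every sample has DP >= min_dp."""
    fields = record.rstrip("\n").split("\t")
    dp_idx = fields[8].split(":").index("DP")
    depths = [sample_depth(s, dp_idx) for s in fields[9:]]
    return all(d is not None and d >= min_dp for d in depths)


def mask_low_depth_genotypes(record, min_dp=MIN_DP):
    """Sample-specific filter: keep the site, but set low-depth genotypes to missing."""
    fields = record.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    gt_idx, dp_idx = keys.index("GT"), keys.index("DP")
    for i in range(9, len(fields)):
        depth = sample_depth(fields[i], dp_idx)
        if depth is None or depth < min_dp:
            values = fields[i].split(":")
            values[gt_idx] = "./."  # genotype becomes a no-call, other fields are kept
            fields[i] = ":".join(values)
    return "\t".join(fields)


# Made-up record with three samples at depths 25, 4 and 12.
record = "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:25\t0/1:4\t1/1:12"

print(site_passes(record))                            # False
print(mask_low_depth_genotypes(record).split("\t")[9:])  # ['0/1:25', './.:4', '1/1:12']
```

With the variant-specific filter the whole site is dropped because one sample has only 4 reads, whereas the sample-specific filter keeps the site and only turns that one genotype into a no-call.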

next-gen VCF • 1.2k views

updated 7 months ago by Ram 43k • written 3.4 years ago by NGSCanBioinf ▴ 10
