Hard filtering VCF files
3.0 years ago
Kash ▴ 110

Hi all,

I am trying to find a set of variants for a non-model organism that can be used as a known set (well-curated training/truth resources) in VQSR.

I have whole-genome sequencing BAM files from 96 animals. I am thinking of doing hard filtering as advised on the Broad Institute website: https://software.broadinstitute.org/gatk/documentation/article.php?id=3225. I plan to use the bootstrapping method. I have generated a GVCF file for each of the 96 BAM files.
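To make the hard-filtering step concrete, here is a minimal Python sketch of what such a filter does to each SNP record. The thresholds are an assumption: they are the generic starting values commonly quoted in GATK's hard-filtering recommendations for SNPs, not values tuned for this dataset, and the function names are hypothetical. In practice this would be done with GATK's own filtering tools; the sketch only illustrates the logic.

```python
# Assumed starting thresholds for SNPs (generic GATK hard-filter suggestions,
# not taken from this post). Each entry maps an INFO annotation to a test
# that returns True when the variant FAILS on that annotation.
SNP_FILTERS = {
    "QD": lambda v: v < 2.0,               # low quality by depth
    "FS": lambda v: v > 60.0,              # strong strand bias
    "MQ": lambda v: v < 40.0,              # low mapping quality
    "MQRankSum": lambda v: v < -12.5,      # alt reads map worse than ref reads
    "ReadPosRankSum": lambda v: v < -8.0,  # alt alleles cluster at read ends
}


def parse_info(info):
    """Turn a VCF INFO string such as 'QD=1.5;FS=10.0' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


def failed_filters(info):
    """Return the names of the annotations on which this record fails."""
    fields = parse_info(info)
    failed = []
    for name, fails in SNP_FILTERS.items():
        if name not in fields:
            continue  # in this sketch, a missing annotation does not fail the record
        try:
            value = float(fields[name])
        except ValueError:
            continue  # skip non-numeric values such as '.'
        if fails(value):
            failed.append(name)
    return failed


def passes_hard_filter(vcf_line):
    """True if a tab-separated VCF data line fails none of the filters."""
    columns = vcf_line.rstrip("\n").split("\t")
    return not failed_filters(columns[7])  # INFO is the 8th VCF column
```

Variants that pass would form the provisional "known" set for the first bootstrapping round.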

Now I am not sure which of two approaches to take. The first is to generate a separate VCF file for each BAM file (96 in total), do the bootstrapping on each one separately, and combine the 96 final trained VCF files into a single file to come up with the known truth set of variants. The second is to run CombineGVCFs on the 96 GVCF files to generate a single GVCF file and a single VCF file from it, and then use that single VCF file to bootstrap the 96 BAM files separately.

Any help is much appreciated.

SNP genome variant calling hard filtering GATK • 919 views
