Hard filtering vcf files
2.1 years ago
kmkdesilva ▴ 90

Hi all,

I am trying to build a set of variants for a non-model organism that can serve as the known set (the well-curated training/truth resource) for VQSR.

I have whole-genome sequencing BAM files from 96 animals. I am thinking of doing hard filtering as advised on the Broad Institute website: https://software.broadinstitute.org/gatk/documentation/article.php?id=3225. I plan to use the bootstrapping method. I have generated a GVCF file for each of the 96 BAM files.
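To make the hard-filtering step concrete, here is a minimal Python sketch that checks a VCF INFO string against SNP thresholds. The cutoffs (QD < 2.0, FS > 60.0, MQ < 40.0, MQRankSum < -12.5, ReadPosRankSum < -8.0) are the generic GATK starting points and are an assumption here, as are the helper names `parse_info` and `failed_filters`; they would likely need tuning for a non-model organism.

```python
# Generic GATK SNP hard-filter thresholds (assumed starting points, not tuned).
# Each entry maps an INFO annotation to a predicate that is True when the site FAILS.
SNP_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def parse_info(info):
    """Split a VCF INFO string ('QD=1.5;FS=10.0') into a dict of raw strings."""
    annotations = {}
    for field in info.split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            annotations[key] = value
    return annotations


def failed_filters(info):
    """Return the names of the annotations that fail their threshold.

    Annotations that are absent or non-numeric are skipped rather than failed.
    """
    annotations = parse_info(info)
    failed = []
    for name, fails in SNP_FILTERS.items():
        if name not in annotations:
            continue
        try:
            value = float(annotations[name])
        except ValueError:
            continue
        if fails(value):
            failed.append(name)
    return failed


if __name__ == "__main__":
    print(failed_filters("QD=1.5;FS=10.0;MQ=60.0"))
```

A site with an empty result passes every filter and would be a candidate for the bootstrapped known set.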

Now I am not sure which of two approaches to take. The first is to generate a separate VCF file for each BAM file (96 in total), do the bootstrapping on each one separately, and then combine the 96 final trained VCF files into a single file to come up with the known truth set of variants. The second is to run CombineGVCFs on the 96 GVCF files to produce a single GVCF file, generate a single VCF file from it, and then use that single VCF file to bootstrap the 96 BAM files separately.
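For reference, the route that goes through CombineGVCFs could be sketched as below. This only builds the two command lines as argument lists and does not run them. It assumes GATK4-style syntax (`gatk CombineGVCFs` and `gatk GenotypeGVCFs` with `-R`, `-V`, `-O`); the function name `joint_calling_commands` and all file names are hypothetical placeholders.

```python
def joint_calling_commands(gvcfs, reference,
                           combined="cohort.g.vcf.gz",
                           output="cohort.vcf.gz"):
    """Build GATK4-style commands: merge per-sample GVCFs, then genotype jointly.

    Returns two argument lists (CombineGVCFs, GenotypeGVCFs) suitable for
    subprocess.run. File names are placeholders, not real paths.
    """
    if not gvcfs:
        raise ValueError("at least one GVCF is required")

    # Step 1: merge all per-sample GVCFs into one cohort GVCF.
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        combine += ["-V", gvcf]
    combine += ["-O", combined]

    # Step 2: genotype the cohort GVCF to get a single multi-sample VCF.
    genotype = ["gatk", "GenotypeGVCFs", "-R", reference,
                "-V", combined, "-O", output]
    return combine, genotype


if __name__ == "__main__":
    # Hypothetical sample names for the 96 animals.
    samples = [f"animal_{i:02d}.g.vcf.gz" for i in range(1, 97)]
    combine_cmd, genotype_cmd = joint_calling_commands(samples, "reference.fasta")
    print(" ".join(combine_cmd[:8]), "...")
    print(" ".join(genotype_cmd))
```

The resulting multi-sample VCF is the file that hard filtering would then be applied to.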

Any help is much appreciated.

SNP genome variant calling hard filtering GATK • 761 views