I'm trying to identify variants in two groups composed of 25 samples (12 samples for the control and 12 samples for the treatment).
My reference genome is a bacterial genome and all my samples are haploid. I'm running HaplotypeCaller as follows:
gatk HaplotypeCaller --native-pair-hmm-threads $cpus --sample-ploidy 1 --reference $REF -I 1.mapped_sorted_RG.bam -I 2.mapped_sorted_RG.bam.... -I 25.mapped_sorted_RG.bam --output variants/all.vcf
All BAM files have an RG tag (see example below).
... MC:Z:150M MD:Z:30A68A48 RG:Z:Sample1 NM:i:2 AS:i:138 XS:i:0
I'm getting this warning and am not sure whether it is normal:
18:19:47.882 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
The final goal is to compare the samples from the two groups and identify all variants that are found in at least 30% of the samples in each group.
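To make the criterion concrete, here is a minimal Python sketch of the filter I have in mind, not something I have run on the real data. It assumes a multi-sample VCF in which every record has a GT field, counts a sample as a carrier when its GT contains any non-reference allele, and divides by all samples in the group (so uncalled genotypes count as non-carriers). The function names and the sample lists are hypothetical. With 12 samples per group, 30% means at least 4 carriers.

```python
import re


def carrier_fraction(genotypes):
    """Fraction of samples whose GT contains at least one non-reference allele."""
    carriers = 0
    for gt in genotypes:
        # Haploid calls look like "1"; splitting also covers "0/1" or "0|1".
        alleles = re.split(r"[/|]", gt)
        if any(a not in ("0", ".") for a in alleles):
            carriers += 1
    return carriers / len(genotypes)


def group_genotypes(fields, columns, group, gt_index):
    """Extract the GT string of each sample in `group` from one VCF record."""
    return [fields[columns[name]].split(":")[gt_index] for name in group]


def variants_in_both_groups(vcf_lines, control, treatment, min_frac=0.3):
    """Return (chrom, pos, ref, alt) for records carried by >= min_frac of each group.

    `control` and `treatment` are lists of sample names (assumed to match
    the sample columns of the #CHROM header line exactly).
    """
    columns = {}
    kept = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns start at index 9 in a VCF header.
            columns = {name: i for i, name in enumerate(fields[9:], start=9)}
            continue
        gt_index = fields[8].split(":").index("GT")
        control_frac = carrier_fraction(
            group_genotypes(fields, columns, control, gt_index))
        treatment_frac = carrier_fraction(
            group_genotypes(fields, columns, treatment, gt_index))
        if control_frac >= min_frac and treatment_frac >= min_frac:
            kept.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return kept
```

On the real data this would be called with something like `variants_in_both_groups(open("variants/all.vcf"), control_names, treatment_names)`, where the two name lists are whatever sample names appear in the VCF header.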
How would you do that?