20 months ago
gubrins ▴ 260
I want to calculate genome-wide heterozygosity for a couple of samples I have. I have been looking on the internet, but I only find theoretical definitions of how you should do it rather than practical examples. Is there any software or custom batch script already optimized for this? I have BAM or VCF files!
Thanks in advance!
snippy can assist in reference-based genome-wide heterozygosity analysis, but it needs FASTQ files (raw reads).
Hi @gubrins, did you ever get anywhere with this? Did you use snippy, as suggested by @MSRS, or another tool? I'm looking at doing the same thing and wondering how you went about it. Cheers!
Hey clinnaeus, I didn't receive the notification from @MSRS so I didn't try it! However, I would prefer to find a way to calculate the heterozygosity from my complete VCF file: since I did the SNP calling with GATK, I imagine it is going to be more reliable, and it is also the file I will use for other analyses. Let's keep in touch and try to solve this together if you want. From which type of file do you want to calculate the heterozygosity?
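For anyone landing here wanting something practical from a VCF, below is a minimal sketch in plain Python (no external libraries). It counts, per sample, the fraction of called diploid genotypes that are heterozygous. The function name `heterozygosity_per_sample` is made up for illustration, and the sketch assumes an uncompressed, tab-delimited VCF with a `GT` field. Note the big caveat: if the VCF contains only variant sites, this ratio is heterozygosity among variant sites, not genome-wide heterozygosity. For a genome-wide figure the denominator should be the number of callable sites, for example from an all-sites VCF. No filtering on depth or quality is done here.

```python
from collections import defaultdict


def heterozygosity_per_sample(lines):
    """Return {sample: (n_het, n_called, het_fraction)} from VCF text lines.

    Assumption: diploid genotypes; missing or non-diploid calls are skipped.
    """
    samples = []
    het = defaultdict(int)
    called = defaultdict(int)

    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at column 10
            continue
        if len(fields) < 10:
            continue  # no genotype columns on this record
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")

        for sample, entry in zip(samples, fields[9:]):
            parts = entry.split(":")
            if gt_idx >= len(parts):
                continue
            # Treat phased (|) and unphased (/) genotypes the same way
            alleles = parts[gt_idx].replace("|", "/").split("/")
            if len(alleles) != 2 or "." in alleles:
                continue  # missing or non-diploid call
            called[sample] += 1
            if alleles[0] != alleles[1]:
                het[sample] += 1

    return {
        s: (het[s], called[s], het[s] / called[s] if called[s] else float("nan"))
        for s in samples
    }
```

To use it on a file, pass an open handle, e.g. `with open("my_samples.vcf") as fh: print(heterozygosity_per_sample(fh))`, where `my_samples.vcf` is a placeholder path. For a bgzipped file, `gzip.open(path, "rt")` from the standard library should work as the handle.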
Check this paper!
I'm on a course, so I'll try next week and I will let you know if it works with the software they mention. Good luck!
Hi gubrins - thanks so much for replying! I'm definitely keen to try to work together. Sorry it took so long for me to reply, I didn't get a notification for your messages. I'm also interested in calculating the hetz from the vcf if possible - I've been thinking of using ROHAN to try to estimate global rates of hetz but haven't had time to implement yet. I'll have a go this week and let you know how it goes, and we can compare. Shall we connect via Twitter?
Sounds good to me, I'm also gubrins on Twitter!