Hello all,
I have two genomes of the same species, both around 1.5 Gb, that I sequenced with Nanopore. I would like to infer runs of homozygosity (ROH) in the genome. However, the assembly is still in scaffolds (around 400).
I aligned one genome to the other and called SNPs with Clair3 (https://github.com/HKU-BAL/Clair3). Then I ran:
bcftools roh -G30 outputclair3.vcf.gz --AF-dflt 0.4 > ROOH.vcf
However, I only get ROH with state 0. Am I doing something wrong? And how would you suggest approaching the problem?
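To be concrete about what "only state 0" means, here is a minimal Python sketch of how the per-site states in the output could be tallied per scaffold. It assumes the default text output of bcftools roh, where per-site records are tab-separated lines starting with ST followed by sample, chromosome, position, state and quality. The function name tally_roh_states and the example records are made up for illustration and are not real output.

```python
from collections import Counter


def tally_roh_states(lines):
    """Count per-site states per scaffold in bcftools roh text output.

    Assumption: per-site records are tab-separated lines of the form
    ST <sample> <chromosome> <position> <state> <quality>.
    Comment lines (starting with '#') and other record types are skipped.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "ST" or len(fields) < 6:
            continue
        scaffold = fields[2]
        state = int(fields[4])
        counts[(scaffold, state)] += 1
    return counts


# Made-up records for illustration only.
example = [
    "# ST\t[2]Sample\t[3]Chromosome\t[4]Position\t[5]State\t[6]Quality\n",
    "ST\tsampleA\tscaffold_1\t1200\t0\t35.2\n",
    "ST\tsampleA\tscaffold_1\t5300\t0\t40.1\n",
    "ST\tsampleA\tscaffold_2\t800\t1\t12.7\n",
    "RG\tsampleA\tscaffold_2\t800\t800\t1\t1\t12.7\n",
]

for (scaffold, state), n in sorted(tally_roh_states(example).items()):
    print(scaffold, state, n)
```

On the real file this would be called as tally_roh_states(open("ROOH.vcf")). If every key comes back with state 0, no site was assigned to the other state on any scaffold.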
Also, would it be better to call ROH using only gene regions? For instance, I could run BUSCO on the reference genome, concatenate the single-copy genes into one file, align the other genome to it, run Clair3, and then call ROH.
I also tried to calculate nucleotide diversity across all scaffolds with vcftools, but I don't think the resulting image is correct. Many regions show no diversity at all, while others show extremely high values. How can I mitigate this? (Note that this is per scaffold, not per chromosome.)
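One thing that might matter with ~400 scaffolds of different lengths is how the last (shorter) window of each scaffold is scaled. The following is a minimal Python sketch, not the vcftools calculation itself: it counts heterozygous sites in fixed windows and divides by the actual window length, so truncated windows at scaffold ends are on the same per-bp scale as full ones. The function name windowed_het_density, the 10 kb window size, and the input layout (dicts keyed by scaffold name) are all assumptions for illustration.

```python
def windowed_het_density(het_positions, scaffold_lengths, window=10_000):
    """Heterozygous sites per bp in fixed windows along each scaffold.

    het_positions:    dict scaffold -> iterable of 1-based site positions
    scaffold_lengths: dict scaffold -> scaffold length in bp
    Returns a list of (scaffold, start, end, n_sites, density) tuples with
    1-based inclusive coordinates. The last window of a scaffold is
    truncated at the scaffold end and normalised by its true length.
    """
    rows = []
    for scaffold, length in scaffold_lengths.items():
        n_windows = (length + window - 1) // window  # ceiling division
        counts = [0] * n_windows
        for pos in het_positions.get(scaffold, ()):
            if 1 <= pos <= length:
                counts[(pos - 1) // window] += 1
        for i, n in enumerate(counts):
            start = i * window + 1
            end = min((i + 1) * window, length)
            rows.append((scaffold, start, end, n, n / (end - start + 1)))
    return rows


# Made-up positions and lengths for illustration only.
lengths = {"scaffold_1": 25_000, "scaffold_2": 4_000}
sites = {"scaffold_1": [100, 9_000, 10_001, 24_999], "scaffold_2": [10, 20]}

for row in windowed_het_density(sites, lengths):
    print(*row, sep="\t")
```

Windows with zero sites still appear in the output, which makes it easier to see whether the "no diversity" regions are long contiguous stretches or scattered windows.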