Hi,
I am confused about how variants (mainly SNPs and InDels) detected by variant calling compare with those captured in a de novo genome assembly. Of course, which type of data to use depends on the nature of the study; I just want a conceptual comparison of the two. For example, I am interested in the mouse strain SPRET. I found the genome sequencing data at https://www.sanger.ac.uk/science/data/mouse-genomes-project, which includes variant calls recorded in VCF files, and they also provide a genome assembly of the strain.
- I assume that they used the same sequencing data for both the variant calling and the assembly? If so, are the two based on a similar sequencing depth?
- If I used `bcftools consensus` to patch the mm10 genome with the SNP and InDel data in the VCF, how would the result compare to the genome assembly? More precisely, how many of the variants in the VCF file, and of what quality, would I expect to also be represented in the genome assembly?
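
To make clear what I mean by "patching", here is a toy Python sketch of how I understand the idea. It is only my mental model, not how bcftools consensus is actually implemented; the function name `apply_variants`, the toy sequence, and the toy variants are all made up for illustration, and I assume the variants do not overlap.

```python
def apply_variants(reference, variants):
    """Return `reference` with each (pos, ref, alt) variant substituted in.

    pos is 1-based, as in a VCF record. Variants are assumed not to overlap.
    """
    seq = reference
    # Work from the highest position down so that an indel applied earlier
    # does not shift the coordinates of variants still to be applied.
    for pos, ref, alt in sorted(variants, key=lambda v: v[0], reverse=True):
        start = pos - 1
        end = start + len(ref)
        if seq[start:end] != ref:
            raise ValueError(
                f"REF allele {ref!r} does not match the reference at position {pos}"
            )
        seq = seq[:start] + alt + seq[end:]
    return seq


# Hypothetical toy data, not taken from the SPRET VCF.
toy_reference = "ACGTACGTAC"
toy_variants = [
    (3, "G", "T"),    # SNP
    (5, "A", "ATT"),  # insertion of TT after position 5
    (8, "TA", "T"),   # deletion of the A at position 9
]

patched = apply_variants(toy_reference, toy_variants)
print(patched)  # ACTTATTCGTC
```

In this picture the patched sequence can only ever contain what is listed in the VCF, which is part of why I wonder how it compares to a real assembly.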
I am new to this area, so my description of the problem might not be accurate. I would appreciate any help.