5 hours ago
yesquokkan
Hello, I am a beginner graduate student who has just started learning bioinformatics.
I would like to perform SNP calling on a specific bacterial genome. Is it possible to do this at the assembly level?
From my understanding, using raw read data seems better, since SNP calling can then be based on quality scores, and factors such as coverage or differences in sequencing platforms could be considered—whereas using assembled genomes might be less reliable.
What is the general practice in this case?
If SNP calling is also possible using assembled genomes, which tools are typically used?
Yes, it is possible: by aligning your assembly to a reference genome you can find the differences between the two.
This approach has drawbacks, though. You no longer have the read depth needed to confirm a particular difference (support from multiple independent reads) or to assign a confidence to the call. If the assembly contains errors, you could end up with spurious SNPs.
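To make the idea concrete, here is a minimal sketch in Python of what "finding the differences" amounts to once the assembly has been aligned to the reference. It assumes you already have the two sequences as gapped, equal-length strings from some aligner. The function name `call_snps_from_alignment` and the toy sequences are made up for illustration and are not part of any real tool. Note that each reported difference is a bare mismatch with no depth or quality behind it, which is exactly the limitation described above.

```python
def call_snps_from_alignment(ref_aln, asm_aln):
    """Return (ref_position, ref_base, asm_base) for each mismatch column.

    Both inputs are hypothetical gapped alignment rows of equal length,
    with '-' marking a gap. Positions are 1-based on the ungapped reference.
    """
    if len(ref_aln) != len(asm_aln):
        raise ValueError("aligned sequences must have equal length")

    snps = []
    ref_pos = 0  # coordinate in the ungapped reference
    for ref_base, asm_base in zip(ref_aln.upper(), asm_aln.upper()):
        if ref_base != "-":
            ref_pos += 1
        # Only count columns where both sides are unambiguous bases;
        # gap columns (indels) and N's are skipped, not reported as SNPs.
        if ref_base in "ACGT" and asm_base in "ACGT" and ref_base != asm_base:
            snps.append((ref_pos, ref_base, asm_base))
    return snps


# Toy example: one substitution, one insertion and one deletion in the assembly.
ref_aln = "ACGT-ACGTAC"
asm_aln = "ACATTACG-AC"
snps = call_snps_from_alignment(ref_aln, asm_aln)
print(snps)  # [(3, 'G', 'A')]
```

Only the substitution at reference position 3 is reported. There is nothing here that could tell a true variant from an assembly error, since every site is effectively covered once.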