I am looking for good workflows, readings, or tutorials for SNP calling. I read some other posts on this topic, but I would like a more detailed explanation. Population genomics and sequence data are new to me (I have a general CS and biology background). It might just be me, but these tools are not as straightforward or as well documented as I'd like. Any links or explanations would be good!
So far, my situation is as follows:
- I have Illumina sequence reads for a highly polymorphic species
- I aligned these reads against a reference genome using BWA with default parameters, but I am not sure whether I should change any parameters (if so, which ones?) because the data are highly polymorphic
- I am unsure of the next step. I will probably be using SAMtools or GATK. I tried making an mpileup but got really confused after that.
- I should also be assessing SNP quality. What tools are used for that? I already see some sequencing errors when browsing the data.
As you can tell, I am totally new to this. It is pretty exciting, so I want to learn and be able to do some of these things! Thanks in advance.
Edit: I also get confused by some of the output, so more detailed documentation on that would be nice as well!