Starting with pool-seq data, I'm aligning reads to the chloroplast genome and, after filtering, using samtools to generate a consensus sequence for each of several populations. I would ultimately like to build a phylogeny from these sequences, but I'm not sure what the best practices are between generating the consensus sequences and feeding them into standard phylogenetics tools.
I think I need to do the following:
- Generate a multiple sequence alignment
- Visually inspect the alignment and remove obvious errors (e.g. large indels with respect to the reference?)
- Verify SNPs (e.g. by going back to the original read alignments for each population)
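
To make the last two steps concrete, here is a rough sketch of the kind of check I have in mind. It assumes the consensus sequences have already been aligned into a single aligned FASTA by some MSA tool, and it only lists which alignment columns differ between populations so I know where to look in the read alignments. The function names (`parse_fasta`, `classify_columns`) and the toy sequences are made up for illustration and are not from any existing package.

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse aligned FASTA text into {sequence name: sequence}."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def classify_columns(aln: Dict[str, str]) -> Tuple[List[int], List[int]]:
    """Return 1-based alignment columns that vary between sequences.

    First list: columns where every sequence has A/C/G/T and at least
    two different bases occur (candidate SNPs to verify in the reads).
    Second list: variable columns involving a gap, N, or other
    ambiguity code (candidates for inspection or masking).
    """
    lengths = {len(seq) for seq in aln.values()}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned to the same length")
    candidates: List[int] = []
    ambiguous: List[int] = []
    for i in range(lengths.pop()):
        column = {seq[i] for seq in aln.values()}
        if len(column) < 2:
            continue  # invariant column
        if column <= set("ACGT"):
            candidates.append(i + 1)
        else:
            ambiguous.append(i + 1)
    return candidates, ambiguous


# Toy alignment, purely illustrative (not real data).
EXAMPLE = """>pop1
ACGTACGT
>pop2
ACCTACGA
>pop3
AC-TANGT
"""

snp_columns, ambiguous_columns = classify_columns(parse_fasta(EXAMPLE))
print("candidate SNP columns:", snp_columns)
print("gap/ambiguity columns:", ambiguous_columns)
```

The reported positions are alignment columns, not reference coordinates, so they would still need to be mapped back to the reference before checking the reads for each population.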
Can anyone offer guidance on which programs are most useful for these steps and for visualizing the data? Any suggestions on what types of errors to look out for with this kind of data, and on how to decide which SNPs are reliable, would also be greatly appreciated, as I'm very new to the genomics end of things.