Filtering SNPs from a multiple sequence alignment
7.9 years ago
natasha • 0

Hi

I have used the Harvest Tools package to align the core genomes of ~50 bacteria, with the intention of inferring their phylogeny based on SNPs. What criteria and software would be best for filtering out low-quality SNPs?

Thanks

multiple-sequence-alignment SNP • 1.8k views

If I am not wrong, Parsnp, one of the Harvest components, lets you filter SNPs, right? Did you not try that? In any case, if you have a VCF file as output from Harvest after the post-processing step, you can always filter out low-quality variants with vcftools or vcflib.

7.9 years ago
Brice Sarver ★ 3.8k

If you are aligning to core genomes, apply standard variant filters suited to your dataset (quality, minimum/maximum depth, QD, etc.). Standard tools include vcflib, vcftools, and GATK, among many others.
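
To make the idea of a quality and depth filter concrete, here is a minimal sketch in plain Python. It is not what vcftools, vcflib, or GATK do internally. It assumes your VCF records carry a numeric QUAL column and a `DP` entry in the INFO column, and the thresholds (`min_qual`, `min_dp`, `max_dp`) are placeholder values that you would need to tune to your own data.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=35;QD=20.1' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
        elif item and item != ".":
            info[item] = True  # flag-style entry with no value
    return info


def passes_filters(record_line, min_qual=30.0, min_dp=10, max_dp=200):
    """Return True if a VCF record meets the (assumed) thresholds."""
    fields = record_line.rstrip("\n").split("\t")
    qual = fields[5]                # QUAL is the 6th VCF column
    info = parse_info(fields[7])    # INFO is the 8th VCF column
    if qual == "." or float(qual) < min_qual:
        return False
    depth = info.get("DP")
    if depth is None:
        return False                # no depth recorded: treat as failing
    return min_dp <= int(depth) <= max_dp


def filter_vcf_lines(lines, **thresholds):
    """Yield header lines unchanged plus the records that pass."""
    for line in lines:
        if not line.strip():
            continue
        if line.startswith("#") or passes_filters(line, **thresholds):
            yield line
```

You could run it over a file with something like `with open("in.vcf") as fh: kept = list(filter_vcf_lines(fh, min_qual=30))`, where `in.vcf` is a hypothetical filename. For real work the dedicated tools above are the safer choice, since they also handle per-sample fields and multi-allelic records.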

If you are aligning just core genomes, presumably all of the included sites are already confidently called. You don't necessarily need to filter SNPs, but you can identify them by looking for variable positions. That said, if you have the full alignment, why not build the tree from all of it? Using every site, not only the variable ones, lets you model among-site rate heterogeneity more appropriately.
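
As a rough illustration of "looking for variable positions", the sketch below reads an aligned FASTA and reports the columns where more than one unambiguous base occurs. It assumes the alignment has been exported as FASTA with all sequences the same length, and it ignores gaps and ambiguity codes such as `N`, which is one possible convention rather than the only one. The function names are made up for this example.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dict."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def variable_positions(seqs, valid="ACGT"):
    """Return 0-based alignment columns with more than one valid base."""
    sequences = list(seqs.values())
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences differ in length; is this an alignment?")
    positions = []
    for i, column in enumerate(zip(*sequences)):
        # Gaps and ambiguity codes are dropped before comparing bases.
        bases = {b for b in column if b in valid}
        if len(bases) > 1:
            positions.append(i)
    return positions
```

Calling `variable_positions(read_fasta(open("core.fasta")))` on a hypothetical `core.fasta` gives the SNP columns. The invariant columns it skips are the ones worth keeping if you want to fit a rate heterogeneity model to the whole alignment.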
