Problems with the handling of missing characters in bcftools consensus and vcf-consensus
19 months ago

I intend to construct a species-level phylogeny using an exome dataset in which multiple individuals were sampled for each species. I have generated a VCF file for each species after the mpileup step. However, when I use the script, it seems to split the sequences at the individual level yet again: the phylogeny has a separate tip for every individual sampled in every species, rather than one tip per species.

Where the sequencing has missed positions relative to the reference, these tools (bcftools consensus and vcf-consensus) do not keep those positions as "N" when I create a FASTA file; they fill them with the corresponding bases from the reference instead. This alters the distance matrix and, by default, makes each sample look more closely related to the reference than it actually is. How do I fix this?
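To make the effect on the distance matrix concrete, here is a minimal sketch in plain Python. The sequences, the uncovered interval, and the helper names `p_distance` and `mask_intervals` are all made up for illustration; they are not part of bcftools or vcf-consensus. It compares a consensus whose uncovered positions were filled from the reference with the same consensus after those positions are set back to "N".

```python
def p_distance(a, b):
    """Proportion of differing sites, skipping sites where either sequence is N."""
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x == "N" or y == "N":
            continue  # missing data carries no information about distance
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else float("nan")


def mask_intervals(seq, intervals, char="N"):
    """Replace 0-based, half-open (start, end) intervals of seq with char."""
    chars = list(seq)
    for start, end in intervals:
        for i in range(start, min(end, len(chars))):
            chars[i] = char
    return "".join(chars)


# Toy data (assumed, not from a real run).
reference = "ACGTACGTACGT"
# Consensus for one sample: positions 4-7 had no coverage and were
# copied from the reference; real variants sit at positions 2 and 11.
filled = "ACTTACGTACGA"
# Hypothetical list of uncovered intervals, e.g. derived from a depth file.
uncovered = [(4, 8)]

masked = mask_intervals(filled, uncovered)

print(masked)
print(f"{p_distance(reference, filled):.3f}")  # 2 differences over 12 sites
print(f"{p_distance(reference, masked):.3f}")  # 2 differences over 8 sites
```

With the reference-filled consensus the two real differences are spread over 12 sites, while after masking they are spread over only the 8 sites that were actually observed, so the filled version understates the distance to the reference. In a real pipeline the uncovered intervals would have to come from the alignment depth, and the masking would need to be applied per sample before the distances are computed.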

missingcharacters bcftools vcf-consensus • 699 views
