Question: From multiple VCF files to multiple sequence alignment?
4.2 years ago, Peter vH wrote:

Hi there

I have multiple VCF files generated from variant calling on sequenced bacteria (M. tuberculosis). I would like to create a multiple sequence alignment file (as a step towards computing a phylogeny of the samples) by combining the reference genome with the VCFs. Before I put time and effort into creating a script to do this, is there an existing solution? I see that workflows such as SNPhylo compute an alignment with MUSCLE before doing tree construction - I'm trying to avoid that step.

Thanks, Peter

bacterial alignment vcf • 3.1k views
modified 4.0 years ago by Biostar • written 4.2 years ago by Peter vH

Please check this post. The comment by natasha provides a good solution.

written 4.2 years ago by microfuge

I'm not quite sure how. The tools suggested in those threads, vcf-consensus in one and FastaAlternateReferenceMaker in the other, each produce a single FASTA output from a single VCF input. They don't deal with the gaps that arise when aligning sequences that differ by insertions and deletions.
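One way to sidestep the gap problem for a SNP-based phylogeny is to keep every sample in reference coordinates: substitutions are applied in place, deleted bases become gap characters, and insertions are left out, so every sequence keeps the reference length and the set is aligned by construction. The following is a minimal Python sketch of that idea, not an existing tool. The function names (parse_vcf_lines, to_reference_coordinates, build_alignment) are hypothetical, and it assumes a single-contig reference, one single-sample VCF per isolate, and that only the first ALT allele is of interest.

```python
from typing import Dict, Iterable, List, Tuple

Variant = Tuple[int, str, str]  # (1-based POS, REF, ALT)


def parse_vcf_lines(lines: Iterable[str]) -> List[Variant]:
    """Collect (POS, REF, first ALT) from VCF text, skipping header lines."""
    variants = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        variants.append((int(fields[1]), fields[3], fields[4].split(",")[0]))
    return variants


def to_reference_coordinates(reference: str, variants: Iterable[Variant]) -> str:
    """Apply variants without changing the sequence length.

    Assumption: insertions, symbolic alleles and complex alleles are skipped,
    so the result is suitable for SNP-based trees only.
    """
    seq = list(reference)
    for pos, ref, alt in variants:
        start = pos - 1
        if reference[start:start + len(ref)].upper() != ref.upper():
            raise ValueError(f"REF allele does not match reference at {pos}")
        if not set(alt.upper()) <= set("ACGTN"):
            continue  # '.', '*', '<DEL>' and similar are ignored
        if len(ref) == len(alt):
            # SNP or multi-nucleotide substitution: overwrite in place
            seq[start:start + len(ref)] = alt
        elif len(alt) < len(ref) and ref.upper().startswith(alt.upper()):
            # Deletion: keep the anchor base(s), turn deleted bases into gaps
            for i in range(start + len(alt), start + len(ref)):
                seq[i] = "-"
    return "".join(seq)


def build_alignment(reference: str,
                    sample_vcfs: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Return name -> sequence, all of equal length, reference included."""
    alignment = {"reference": reference}
    for name, lines in sample_vcfs.items():
        alignment[name] = to_reference_coordinates(
            reference, parse_vcf_lines(lines))
    return alignment
```

Because insertions are dropped, this trades completeness for a guaranteed equal-length alignment; overlapping records are applied in file order, so a later record overwrites an earlier one.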

written 4.2 years ago by Peter vH

I have not used FastaAlternateReferenceMaker, but I ran vcf-consensus -s <sample_name> once per sample to generate a FASTA file for each sample and then did the alignment. The newer version also uses IUPAC codes, so heterozygous genotypes can be encoded. Gaps are usually ignored in alignment, so they should not matter, but I don't know exactly how indels and rearrangements are handled by vcf-consensus.
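The per-sample loop could be scripted roughly as follows. This is a sketch under assumptions rather than a tested pipeline: it presumes the VCFtools vcf-consensus script is on PATH, that the multi-sample VCF is bgzip-compressed and tabix-indexed, and that the reference FASTA is supplied on standard input. The helper names and file names are hypothetical, and the -i flag for IUPAC codes should be checked against the installed version.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def consensus_command(vcf_path: str, sample: str,
                      iupac: bool = False) -> List[str]:
    """Build the argument list for one vcf-consensus run."""
    cmd = ["vcf-consensus", "-s", sample]
    if iupac:
        cmd.append("-i")  # assumption: IUPAC output is enabled with -i
    cmd.append(vcf_path)
    return cmd


def write_consensus_fastas(reference_fasta: str, vcf_path: str,
                           samples: Iterable[str],
                           out_dir: str) -> List[Path]:
    """Run vcf-consensus once per sample, writing <sample>.fa into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for sample in samples:
        target = out / f"{sample}.fa"
        # Reference goes in on stdin, consensus FASTA comes out on stdout
        with open(reference_fasta, "rb") as ref, open(target, "wb") as fasta:
            subprocess.run(consensus_command(vcf_path, sample),
                           stdin=ref, stdout=fasta, check=True)
        written.append(target)
    return written
```

The resulting per-sample FASTA files still have to be aligned afterwards, since samples carrying indels will differ in length.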

written 4.2 years ago by microfuge

So you'd do iterative vcf-consensus followed by MUSCLE? SNPhylo seems to do something like that. I'll experiment and compare it with the script I've written.

written 4.2 years ago by Peter vH

Yes, though that was for a chloroplast genome, and the results were good. An advantage there was that there were no heterozygous calls, since heteroplasmy was not detected.

written 4.2 years ago by microfuge