How to convert VCF (with possible predicted gene effects) to protein fasta/MSA
15 months ago
William ★ 5.3k

How can I convert a multi-sample VCF (possibly with predicted gene effects) to protein FASTA/MSA?

Input:

  • VCF (possibly with gene/protein effects already predicted, e.g. via SnpEff)
  • GFF3 (for the reference protein sequence and maybe to predict effects)

Output:

  • protein FASTA, with 1 or 2 sequences per sample in the VCF (2 sequences for heterozygous samples)
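
To make the desired output concrete, here is a minimal Python sketch of the "1 or 2 sequences per sample" rule. The function name `sample_fasta_records`, the `_hap1`/`_hap2` header suffixes, and the 60-column line width are illustrative assumptions, not part of any existing tool.

```python
def sample_fasta_records(sample, hap1, hap2, width=60):
    """Return FASTA text for one sample.

    One record is written if both haplotype proteins are identical,
    otherwise two records (hypothetical naming: <sample>_hap1 / _hap2).
    """
    if hap1 == hap2:
        records = [(sample, hap1)]
    else:
        records = [(f"{sample}_hap1", hap1), (f"{sample}_hap2", hap2)]

    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Wrap the sequence at a fixed line width.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(sample_fasta_records("S1", "MKT", "MKT"), end="")
    print(sample_fasta_records("S2", "MKT", "MRT"), end="")
```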

Is there any tool that can do this, either on the command line or in Python/R code?

15 months ago
Emily 23k

Use the VEP with the ProteinSeqs plugin.


This only seems to create a protein sequence per variant, not per sample (where several variant effects may need to be combined, depending on the sample's genotypes).
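
To illustrate the per-sample behaviour described here, a rough Python sketch follows. It is not how VEP or any plugin works internally. It assumes phased diploid genotypes, single-base substitutions only, positions already converted to 0-based coordinates within a forward-strand CDS, and the standard genetic code. The names `translate` and `haplotype_proteins` are made up for this example.

```python
# Standard genetic code (NCBI table 1), built from the usual TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(cds):
    """Translate a CDS up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for unknown codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def haplotype_proteins(ref_cds, variants, genotypes):
    """Apply phased SNVs to a reference CDS and translate both haplotypes.

    variants:  list of (cds_pos0, ref_base, alt_base) tuples (assumed layout)
    genotypes: one phased GT string per variant for a single sample, e.g. "0|1"
    """
    haps = [list(ref_cds), list(ref_cds)]
    for (pos, ref, alt), gt in zip(variants, genotypes):
        if ref_cds[pos] != ref:
            raise ValueError(f"reference mismatch at CDS position {pos}")
        for hap, allele in zip(haps, gt.split("|")):
            if allele == "1":
                hap[pos] = alt
    return tuple(translate("".join(h)) for h in haps)


if __name__ == "__main__":
    ref = "ATGAAAACCTAA"  # toy CDS: M K T stop
    snvs = [(4, "A", "G"), (7, "C", "T")]
    print(haplotype_proteins(ref, snvs, ["0|1", "1|0"]))
```

Both variants land in the same sample but on different haplotypes, so the two proteins each carry one change rather than one protein carrying both. Indels, multi-allelic sites, reverse-strand transcripts and splicing are deliberately left out.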


If you've got phased genotypes, then you want Haplosaurus.
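
As a quick way to check whether a sample's genotypes are phased, here is a small Python sketch based on the VCF convention that `|` separates phased alleles and `/` separates unphased ones. The helper names `classify_gt` and `haplotypes_resolved` are hypothetical, and the check ignores phase sets (the `PS` field), so treat it as a first approximation.

```python
import re


def classify_gt(gt):
    """Classify a VCF GT string such as '0|1', '0/1', '1/1' or './.'."""
    alleles = re.split(r"[|/]", gt)
    if "." in alleles:
        return "missing"
    if len(set(alleles)) == 1:
        return "hom"
    return "phased_het" if "|" in gt else "unphased_het"


def haplotypes_resolved(gts):
    """True if the GT calls for one sample (e.g. within one transcript)
    determine both haplotypes unambiguously.

    Homozygous calls never cause ambiguity. A single heterozygous call is
    fine even when unphased; with two or more, all of them must be phased.
    """
    kinds = [classify_gt(gt) for gt in gts]
    if "missing" in kinds:
        return False
    hets = [k for k in kinds if k.endswith("_het")]
    return len(hets) <= 1 or "unphased_het" not in hets


if __name__ == "__main__":
    print(haplotypes_resolved(["0|1", "1|0", "1/1"]))
    print(haplotypes_resolved(["0/1", "0|1"]))
```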
