Question: Annotate SNPs called from Trinity transcriptome assembly using annotations from Trinotate pipeline
12 weeks ago by
TrentGenomics20 wrote:


I have a VCF file of SNPs called with samtools/bcftools from an alignment file against a Trinity reference assembly.

I have annotated the Trinity reference assembly using the Trinotate pipeline (blastx of Trinity transcripts against Swiss-Prot, blastp of TransDecoder-predicted proteins from the Trinity transcripts against Swiss-Prot, and HMMER of the same TransDecoder-predicted proteins against Pfam).

Now, I would like to annotate the SNPs contained in my VCF file using my annotated Trinity reference assembly. Is this possible?

My Trinity transcripts in the reference assembly are formatted like so:

>TRINITY_DN1000|c115_g5_i1 len=247 path=[31015:0-148 23018:149-246]

So I can't use a program like snpEff, because it relies on reference genomes with chromosome/position (chr,pos) coordinates.

Has this been done before? Any info greatly appreciated as always. Thanks!

blast rna-seq assembly • 237 views
6 days ago by
colindaven390 wrote:

Nope, can't be done as far as I know.

If you have a reference genome, you can map the reads to it and call variants against it instead.

Otherwise, you could insert the SNPs into the reference transcriptome; tools are available for this, like "seqtk mutfa". Then use Biopython or similar to translate your mutated transcripts into protein.
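To make the mutate-and-translate idea concrete, here is a minimal sketch in plain Python rather than seqtk plus Biopython. It assumes you already know the ORF coordinates on each transcript (for example from the TransDecoder output), that the ORF is on the forward strand of the transcript, and that the variants are single-base substitutions. The function name `snp_effect`, its arguments, and the toy sequence are made up for illustration; the codon table is the standard genetic code, and Biopython's `Seq.translate()` could be used in its place.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def snp_effect(transcript, orf_start, orf_end, pos, ref, alt):
    """Classify one single-base substitution on a transcript.

    All coordinates are 1-based and inclusive (as in VCF POS), and the ORF
    is assumed to lie on the forward strand of the transcript.
    """
    seq = transcript.upper()
    ref, alt = ref.upper(), alt.upper()
    if seq[pos - 1] != ref:
        raise ValueError(f"REF {ref} does not match transcript base {seq[pos - 1]}")

    # SNPs outside the predicted ORF cannot change the protein.
    if not (orf_start <= pos <= orf_end):
        return {"effect": "non_coding"}

    offset = pos - orf_start                       # 0-based offset within the ORF
    codon_start = orf_start - 1 + (offset // 3) * 3  # 0-based start of the codon
    ref_codon = seq[codon_start:codon_start + 3]
    idx = offset % 3                               # position of the SNP in the codon
    alt_codon = ref_codon[:idx] + alt + ref_codon[idx + 1:]

    # Unknown or incomplete codons (e.g. containing N) translate to "X".
    ref_aa = CODON_TABLE.get(ref_codon, "X")
    alt_aa = CODON_TABLE.get(alt_codon, "X")

    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return {
        "effect": effect,
        "codon_number": offset // 3 + 1,
        "ref_codon": ref_codon,
        "alt_codon": alt_codon,
        "ref_aa": ref_aa,
        "alt_aa": alt_aa,
    }


# Toy transcript: 2 bp 5' UTR, ORF ATG GCT TGG TAA at positions 3-14, 2 bp 3' UTR.
toy = "GGATGGCTTGGTAACC"
result = snp_effect(toy, orf_start=3, orf_end=14, pos=7, ref="C", alt="A")
print(result["effect"], result["ref_aa"], result["codon_number"], result["alt_aa"])
```

In a real run you would loop over the VCF records, use the CHROM field (the Trinity transcript ID) to look up the transcript sequence and its ORF, and then join the result to the Trinotate annotation for the same ID. ORFs predicted on the reverse strand would need the transcript reverse-complemented and the coordinates converted first, which this sketch does not handle.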

By the way, I really like InterProScan for functional annotation of transcript sets.
