Question

How to get scaffold information from Trinity?

1

9.1 years ago

Paul ★ 1.5k

Dear all,

I am trying to do a de novo assembly of my RNA-seq data. The first program I used was SOAPdenovo-Trans; it works very well and provides good summary statistics for scaffolds and contigs.

I am now trying a de novo assembly with Trinity. The output is a FASTA file. Does Trinity do scaffolding? I can only find contig N10-N50 values and the number of transcripts and genes.

I would like to compare the results from SOAPdenovo-Trans and Trinity, but I am not sure how to get scaffold information from the Trinity output.

Or, if you have any other experience comparing output from different programs, please share it with me.

Many thanks for any help.

trinity scaffold de-novo transcript • 3.7k views

ADD COMMENT • link updated 9.1 years ago by Damian Kao 16k • written 9.1 years ago by Paul ★ 1.5k

score 1 · Answer 1 · 2015-03-22

1

9.1 years ago

Damian Kao 16k

As far as I know, Trinity doesn't do scaffolding. Paired-end (PE) reads in Trinity are used for bundling reads together, but they are not used for scaffolding.
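Since there are no scaffolds to report, one way to put the two assemblies on the same footing is to compute contig-level statistics from each FASTA file yourself. The following is only a minimal sketch under stated assumptions: `read_fasta_lengths` and `nx` are illustrative helper names, not part of Trinity or SOAPdenovo-Trans, and the definition of Nx used here (the length of the shortest sequence among the longest ones that together cover x% of all bases) may differ slightly from what each tool's own statistics script reports.

```python
def read_fasta_lengths(lines):
    """Yield the length of each sequence from an iterable of FASTA lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def nx(lengths, x=50):
    """Return Nx for a collection of sequence lengths (N50 when x=50)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0
    threshold = total * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return 0


# Tiny in-memory example; for a real file, pass an open file handle instead,
# e.g. open("Trinity.fasta") (hypothetical file name).
example = [">a", "ACGT", "AC", ">b", "AAA"]
example_lengths = list(read_fasta_lengths(example))
example_n50 = nx(example_lengths, 50)
```

Running the same function on both assemblies gives directly comparable numbers, although N50 on its own says little about transcriptome quality.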

ADD COMMENT • link 9.1 years ago by Damian Kao 16k

0

Thank you, Damian, for your comment. Do you have any experience with robust tools for scaffolding?

ADD REPLY • link 9.1 years ago by Paul ★ 1.5k

1

There are plenty of tools for scaffolding genomes, but I don't think there are any specifically for transcriptomes. The reason is probably that most transcriptome assemblers output multiple isoforms of the same gene via various graph traversal algorithms. If you try to use the PE information to scaffold these transcripts, you run the risk of fusing isoforms together.

ADD REPLY • link 9.1 years ago by Damian Kao 16k

0

Thank you, Damian, for the comment and explanation. So do you recommend keeping just the contigs for downstream analysis (TransDecoder, etc.)?

ADD REPLY • link 9.1 years ago by Paul ★ 1.5k

0

Trinity tends to output a lot of transcripts (I easily get over 150k). What you can do is annotate your transcripts via TransDecoder, BLAST homology, and HMM searches, and then let the people using your transcriptome decide whether they trust a given transcript. You can also set up some kind of arbitrary scoring system in which a transcript gets +1 for each annotation source that supports it, and then categorize your transcripts by score.
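The scoring idea could be sketched roughly as follows. This is an illustration only: the transcript IDs, the source names, and the helper names `score_transcripts` and `categorize` are all hypothetical, and it assumes you have already parsed each tool's output into a set of supported transcript IDs.

```python
def score_transcripts(transcript_ids, evidence):
    """Give each transcript +1 for every annotation source that supports it.

    evidence maps a source name to the set of transcript IDs with a hit
    from that source (assumed to be parsed from the tool outputs beforehand).
    """
    return {
        tid: sum(1 for hits in evidence.values() if tid in hits)
        for tid in transcript_ids
    }


def categorize(scores):
    """Group transcript IDs by their evidence score."""
    by_score = {}
    for tid, score in scores.items():
        by_score.setdefault(score, []).append(tid)
    return by_score


# Hypothetical IDs and hits, for illustration only.
all_ids = ["t1", "t2", "t3", "t4"]
evidence = {
    "transdecoder": {"t1", "t2"},
    "blast": {"t1"},
    "hmm": {"t1", "t3"},
}
scores = score_transcripts(all_ids, evidence)
categories = categorize(scores)
```

Transcripts with a score of 0 have no supporting annotation in this scheme; whether to drop them or just flag them is a judgment call that depends on the downstream analysis.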

ADD REPLY • link 9.1 years ago by Damian Kao 16k