Question: How to get scaffold information from Trinity?
Paul (European Union) wrote, 4.4 years ago:

Dear all,

I am trying to do a de novo assembly of my RNA-seq data. The first program I used was SOAPdenovo-Trans; it works very well and provides very good statistics about scaffolds and contigs.

I am now trying a de novo assembly with Trinity. My output is a FASTA file. Does Trinity do scaffolding? I can only find information about contig N10 to N50 and the number of transcripts and genes.

I would like to compare the results from SOAPdenovo-Trans and Trinity, but I am not sure how to get scaffold information from the Trinity output.

Or, if you have any other experience comparing output from different programs, please share it with me.

Many thanks for any help.

Damian Kao (USA) wrote, 4.4 years ago:

As far as I know, Trinity doesn't do scaffolding. Paired-end (PE) reads in Trinity are used for bundling reads together, but not for scaffolding.

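Since Trinity reports contig-level statistics only, one way to compare the two assemblers on equal terms is to compute the same length statistics from each program's FASTA output. The following is only a minimal sketch of that idea, not part of either tool: the function names `read_fasta_lengths`, `nx`, and `summarize` are made up for illustration, and the FASTA paths are whatever files you pass on the command line.

```python
import sys
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx(lengths: List[int], fraction: float = 0.5) -> int:
    """Nx statistic: the length L such that sequences of length >= L
    together contain at least `fraction` of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= total * fraction:
            return length
    return 0  # only reached for an empty assembly


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Collect a few statistics that can be compared across assemblers."""
    return {
        "sequences": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "N10": nx(lengths, 0.1),
        "N50": nx(lengths, 0.5),
    }


if __name__ == "__main__":
    # Usage: python assembly_stats.py first_assembly.fasta second_assembly.fasta
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, summarize(read_fasta_lengths(handle)))
```

Running it on both FASTA files gives numbers computed in exactly the same way for each assembly. Keep in mind that N50 is a weak quality measure for transcriptomes, because many isoforms of one gene inflate the sequence count and total length.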

Thank you, Damian, for your comment. Do you have any experience with robust tools for scaffolding?

written 4.4 years ago by Paul

There are plenty of tools for scaffolding genomes, but I don't think there are any specifically for transcriptomes. The reason is probably that most transcriptome assemblers output multiple isoforms of the same gene via various graph traversal algorithms. If you try to use the PE information to scaffold these transcripts, you run the risk of fusing isoforms together.

written 4.4 years ago by Damian Kao

Thank you, Damian, for the comment and explanation. So do you recommend keeping just the contigs for downstream analysis (TransDecoder, etc.)?

written 4.4 years ago by Paul

Trinity tends to output a lot of transcripts (I easily get over 150k transcripts). What you can do is annotate your transcripts via TransDecoder, BLAST homology, and HMMs, and then let the people who use your transcriptome decide whether they trust a given transcript. You can also set up a simple, arbitrary scoring system in which a transcript gets +1 for each annotation source that provides evidence for it, and then categorize your transcripts by score.

written 4.4 years ago by Damian Kao
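The scoring idea from the comment above could be sketched roughly as follows. This is only one possible reading of it: it assumes you have already reduced each annotation source to a set of supported transcript IDs, and the function names, source labels, and transcript IDs are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


def score_transcripts(
    transcript_ids: Iterable[str],
    evidence: Dict[str, Set[str]],
) -> Dict[str, int]:
    """Give each transcript +1 for every annotation source that supports it.

    `evidence` maps a source label to the set of transcript IDs with a hit
    in that source (hypothetical input; build it from your own results).
    """
    return {
        tid: sum(tid in hits for hits in evidence.values())
        for tid in transcript_ids
    }


def categorize(scores: Dict[str, int]) -> Dict[int, List[str]]:
    """Group transcript IDs by score so users can choose their own cutoff."""
    categories: Dict[int, List[str]] = {}
    for tid, score in scores.items():
        categories.setdefault(score, []).append(tid)
    return categories


if __name__ == "__main__":
    # Made-up IDs and hits, purely for illustration.
    all_ids = ["t1", "t2", "t3", "t4"]
    evidence = {
        "transdecoder_orf": {"t1", "t2"},
        "blast_hit": {"t1"},
        "hmm_domain": {"t1", "t3"},
    }
    scores = score_transcripts(all_ids, evidence)
    for score, ids in sorted(categorize(scores).items(), reverse=True):
        print(score, ids)
```

Keeping the zero-score transcripts in their own category, instead of deleting them, matches the suggestion to let downstream users decide what to trust.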