Question: Why is a de novo assembler not a transcript assembler?
3.9 years ago by
biocyberman760 wrote:

There are numerous de novo assemblers, such as ABySS, SPAdes, AMOS, etc. But there is only one transcript assembler that I know of: Inchworm, one of the three tools that make up Trinity.

I would like to hear your thoughts on why those de novo assemblers can or cannot be used as transcript assemblers. I am asking in order to understand the background of this matter.


rna-seq assembly • 1.7k views
modified 3.9 years ago by SusanRey0 • written 3.9 years ago by biocyberman760

I am aware of another transcriptome assembler: IDBA-Tran.

written 3.9 years ago by sentausa630
3.9 years ago by
thackl2.6k wrote:

Genomic and transcriptomic data differ in some fundamental aspects. Here is an incomplete list:

Coverage: typical short-read assemblers (including those you mentioned) use k-mer-based graph structures to reconstruct the underlying sequence. For genomes, the k-mer coverage profile is quite distinct: k-mers from non-repetitive regions cluster around the sequencing depth of your sample, erroneous k-mers occur at low frequencies, and k-mers from repeats occur at high frequencies. Assemblers use this spectrum during error correction, graph optimization/evaluation, etc. The underlying assumptions, however, do not hold for RNA-seq data. Here, the abundance of each transcript determines the frequency of its k-mers, so you get very different spectra.
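To make the coverage point concrete, here is a minimal Python sketch of a k-mer spectrum (how many distinct k-mers occur at each multiplicity). Everything in it is a toy assumption: the sequences are made up, each "read" is simply a full, error-free copy of its source, and only one strand is counted. It is not how any particular assembler computes its spectrum.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return dict(sorted(Counter(counts.values()).items()))

K = 5

# Toy genome sample: every position is sequenced at the same depth (20x).
genome = "ACGTTGCAAGGCTA"
genome_reads = [genome] * 20

# Toy RNA-seq sample: depth follows expression, not a single global value.
tx_high = "ACGTTGCAAG"  # hypothetical highly expressed transcript
tx_low = "GGCTATCCGA"   # hypothetical weakly expressed transcript
rna_reads = [tx_high] * 100 + [tx_low] * 2

genome_spectrum = spectrum(kmer_counts(genome_reads, K))
rna_spectrum = spectrum(kmer_counts(rna_reads, K))

print("genome:", genome_spectrum)       # one peak at the sequencing depth
print("transcriptome:", rna_spectrum)   # one peak per expression level
```

In the genome sample all k-mers sit at a single multiplicity, so "far below the peak" is a reasonable proxy for "sequencing error". In the RNA-seq sample the weakly expressed transcript sits at multiplicity 2, exactly where a genome-oriented error filter would expect noise.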

Structural variants: In a (haploid) genome data set you don't expect many structural variants, and if you find some, you often want to merge them into a single haplotype assembly. In transcriptomes, quite the opposite is the case. Alternative splicing produces a plethora of structural variants for the same region. De novo genome assemblers cannot capture this and will most likely produce individual fragments corresponding to single exons.
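The fragmentation effect can be illustrated with a small de Bruijn graph sketch in Python. The exon sequences, the two isoforms and the helper names (`build_graph`, `unitigs`) are hypothetical and chosen only for illustration. The code collapses unambiguous paths into unitigs and stops at every branch, which is a simplification of what real assemblers do (no reverse complements, no error handling, pure cycles ignored).

```python
from collections import defaultdict

def build_graph(sequences, k):
    """De Bruijn graph: nodes are (k-1)-mers, each k-mer adds one edge."""
    succ, pred, nodes = defaultdict(set), defaultdict(set), set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            left, right = seq[i:i + k - 1], seq[i + 1:i + k]
            succ[left].add(right)
            pred[right].add(left)
            nodes.update((left, right))
    return nodes, succ, pred

def unitigs(sequences, k):
    """Return the maximal non-branching paths, spelled out as sequences."""
    nodes, succ, pred = build_graph(sequences, k)

    def continues_from_pred(node):
        # True if node is the only successor of its only predecessor.
        if len(pred[node]) != 1:
            return False
        (p,) = pred[node]
        return len(succ[p]) == 1

    result = []
    for start in nodes:
        if continues_from_pred(start):
            continue  # interior of some other unitig
        path, cur = start, start
        while len(succ[cur]) == 1:
            (nxt,) = succ[cur]
            if len(pred[nxt]) != 1:
                break  # nxt is a merge point, so it starts its own unitig
            path += nxt[-1]
            cur = nxt
        result.append(path)
    return sorted(result)

# Hypothetical gene with three exons and an exon-skipping isoform.
exon1, exon2, exon3 = "ATGGCC", "TTAGAC", "CGCTGA"
isoform_a = exon1 + exon2 + exon3  # all three exons
isoform_b = exon1 + exon3          # exon 2 skipped

single = unitigs([isoform_a], 5)
mixed = unitigs([isoform_a, isoform_b], 5)
print(single)  # one contiguous sequence
print(mixed)   # broken at the splice junctions
```

With one isoform the graph is a single path and the full transcript comes back. With both isoforms the graph forms a bubble around exon 2, and a strategy that only reports unambiguous paths returns four pieces that roughly correspond to exon 1, exon 2, the skip junction and exon 3. Transcriptome assemblers instead try to enumerate the alternative paths through such bubbles.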

(There are other transcriptome assemblers, e.g. Oases.)

3.9 years ago by
Damian Kao15k wrote:

You should clarify how you are differentiating between a "de novo assembler" and a "transcript assembler". I feel there might be some confusion about the usage of these terms. Do you mean genome assembler vs. transcriptome assembler?

The biggest difference between genome and transcriptome assembly is coverage. Barring repetitive or highly conserved regions, a genome ideally has even coverage. A transcriptome, on the other hand, has different coverage for each transcript, depending on its expression. Think of each transcript as a whole "genome" and of a transcriptome assembly as trying to assemble many mini-genomes from a pool of mixed genomic reads (like a metagenomic assembly).
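As a back-of-the-envelope illustration of the "mini-genomes" analogy, mean coverage can be estimated as reads × read length / target length. The sketch below applies that formula to a genome and to two transcripts; all read counts and lengths are invented numbers for illustration, not measurements.

```python
def mean_coverage(num_reads, read_length, target_length):
    """Average depth: total sequenced bases divided by target length."""
    return num_reads * read_length / target_length

READ_LENGTH = 100  # assumed read length in bases

# Hypothetical genome sample: one target, one depth.
genome_cov = mean_coverage(1_000_000, READ_LENGTH, 5_000_000)

# Hypothetical transcripts: (reads assigned, transcript length in bases).
transcripts = {"tx_high": (10_000, 1_000), "tx_low": (40, 2_000)}
tx_cov = {
    name: mean_coverage(n_reads, READ_LENGTH, length)
    for name, (n_reads, length) in transcripts.items()
}

print("genome:", genome_cov)
print("transcripts:", tx_cov)
```

Under these assumed numbers the genome has one depth (20x), while the two transcripts differ by a factor of 500 (1000x vs. 2x) within the same sample, so each transcript behaves like its own genome with its own depth.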



@Damian Kao: You are right that a transcript(ome) assembler is technically a de novo assembler. I used the term "de novo assembler" loosely, the way it is commonly used. That said, I think I got the logic right: not all de novo assemblers are transcript assemblers. Doesn't "genome assembler" also include both "reference assembler" and "de novo assembler"? I am not trying to argue, I just want to understand the point.

Thanks for the hint about coverage and the analogy of transcripts as mini-genomes. 

written 3.9 years ago by biocyberman760

I think that a distinction between genome and transcriptome assemblers is more informative. All genome assemblers are "de novo assemblers"; whereas transcriptome assemblers can be classified into "de novo assembly" and "reference assembly". 

There are actually several de novo transcriptome assemblers (SOAPdenovo-Trans, Trans-ABySS, Trinity, Velvet-Oases), and the only reference transcriptome assemblers I can think of are Cufflinks and the recent StringTie.

modified 3.9 years ago • written 3.9 years ago by Damian Kao15k

