Question: Is There Any Reason To Do De Novo Transcript Assembly If A Reference Is Available?
7.9 years ago by
Ryan Thompson3.4k
TSRI, La Jolla, CA
Ryan Thompson3.4k wrote:

I am working on some RNA-Seq data from Cynomolgus monkey. I was originally planning to do de novo transcript assembly, but then I realized that the Cyno genome has recently been released, so I can do reference-guided transcript assembly instead. However, I am wondering whether there is any compelling reason to do de novo transcriptome assembly as well as, or instead of, reference-guided assembly.

(By the way, my data is Illumina 2x100 with a 250 bp insert size, in case that makes a difference.)

trinity cufflinks rna denovo • 5.4k views
7.9 years ago by
Kansas City
apa@stowers470 wrote:

In the paper for Trinity, which appears to be the current best-of-breed de novo transcriptome assembler, they compare their results to the two best-known reference-guided assemblers, Cufflinks and Scripture. In mouse, ref-guided was better (recovered more full-length genes + isoforms). In pombe, it was worse.

I don't remember why ref-guided was worse for pombe, or if it was even discussed, but I think it relates to the high density / low structural complexity of pombe genes -- Cufflinks and Scripture were designed with vertebrate transcriptomes in mind.

So if you're working with a vertebrate, ref-guided is probably the best option.


Hey there, Ariel, long time no chat :)

Reply written 16 months ago by summerela120
7.9 years ago by
Mikael Huss4.7k
Mikael Huss4.7k wrote:

No, I wouldn't say there is. Of course, it depends on what you mean by "having a reference". It is still necessary to try to de novo assemble highly variable regions, such as the HLA region in the human genome, or poorly covered regions. For your case, I would prefer a reference-guided assembly.

7.9 years ago by
Ahdf-Lell-Kocks1.6k wrote:

Considering the quality of your reference assembly, it may be worth doing both: ref-based alignment of the RNA-seq reads and, at the same time, de novo assembly. The de novo assembly will pick up a number of transcripts that cannot be reliably mapped to the reference because of gaps or misassembled regions, and will complement the transcripts found in the ref-based set.

7.9 years ago by
Barcelona, Spain
Darked894.2k wrote:

I guess none of the monkey genomes has achieved a level of completion comparable to the human genome. But even in the best-case scenario (human RNA-Seq plus the human genome), there was a recent article showing that you can get novel transcripts. I will provide a link later on.

Also, you may get sequences that are not in the genome at all, such as viruses infecting your sample.

So the answer would be: do both (reference-guided and de novo RNA-Seq assembly), probably always, except perhaps for some tiny genomes that have been sequenced many times.

7.9 years ago by
Geparada1.4k wrote:

If you are going to do a reference-guided transcriptome assembly, you have to keep in mind that all the assembly errors of the reference genome will be reflected in your results. So you should look at the assembly statistics of this genome (such as N50, number of scaffolds, etc.) before making a decision. Otherwise, if you have a well-assembled genome like hg19 or mm9, the guided way is the best option.
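
As a rough illustration of the assembly statistics mentioned above, here is a minimal Python sketch that computes the number of scaffolds, the total length, and the N50 from a genome FASTA. The function names (`fasta_lengths`, `n50`, `assembly_stats`) and the file name in the usage note are hypothetical, chosen only for this example. It assumes a plain, uncompressed FASTA and uses the common definition of N50: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header starts a new record; store the previous one first.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to half
        # of the assembly length defines the N50.
        if running >= half:
            return length


def assembly_stats(lengths):
    """Summarize an assembly by scaffold count, total length, and N50."""
    return {
        "scaffolds": len(lengths),
        "total_length": sum(lengths),
        "n50": n50(lengths),
    }
```

With a real reference this could be called as `assembly_stats(fasta_lengths(open("genome.fa")))`, where `genome.fa` is a placeholder path. A low N50 together with a very large number of scaffolds suggests a fragmented reference, which is the situation where reference-guided results deserve more caution.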
