Question

De Novo Transcriptome Assembly of multiple tissues: To pool or not to pool, that's the question?!

10.2 years ago

giorgiocasaburi ▴ 90

Hi all,

I have a dilemma, and opinions in the current literature are divided. The Nature Protocols paper for Trinity is also not really clear about how to treat samples derived from different tissues, although it does suggest pooling biological and technical replicates together prior to assembly. I would like to hear general opinions, based on experience, on assembling reads from different tissues.

A) Do you pool all the reads together and then run Trinity?

B) Do you run Trinity on each tissue's data set separately and then merge the outputs with another assembler (e.g. CAP3)?

I think it would be interesting to know what people think about this and what their general results have been.

Thanks in advance

~Giorgio

RNA-Seq transcriptome assembly trinity • 4.8k views

updated 2.8 years ago by Ram 45k • written 10.2 years ago by giorgiocasaburi ▴ 90

I am interested in what others have to say. I too have pooled reads from different tissues and assembled them with Velvet/Oases. I then estimated FPKM with Cufflinks, mapping the reads from each tissue separately against the pooled assembly.

10.2 years ago by apelin20 ▴ 490

score 0 · Answer 1 · 2015-08-22

Giorgio

Both approaches you describe (A and B in your question) are workable. If you have a large dataset, I suggest assembling each tissue separately first and then merging the assemblies, to save time. In the Trinity pipeline, the automated run scripts merge all the reads first and then run the assembler. You can read about it on the Trinity website.

Ram · Answer 2 · 2015-08-24

If the end goal is a differential expression (DEG) analysis of the samples, the Trinity documentation says to combine the samples, assemble them together, and then run the DEG analysis. Otherwise you will have a difficult time creating the abundance tables with RSEM and analysing them with edgeR. When you run the wrapper scripts for edgeR, you can specify the sample conditions/tissues, along with their replicates, in your 'samples_described.txt'.
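As a rough sketch of what such a sample description file might look like, here is a small Python helper that writes one tab-separated row per replicate. It assumes the two-column layout (condition, then replicate name) used by Trinity's differential expression wrappers as I understand them; the function name, tissues, and replicate names are hypothetical, so check the layout against the documentation for your Trinity version.

```python
import csv
import io


def write_samples_described(replicates_by_tissue, handle):
    """Write one 'condition<TAB>replicate' row per replicate.

    Assumed layout: column 1 is the condition (here, the tissue),
    column 2 is the replicate name. Verify against your Trinity version.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for tissue, replicates in replicates_by_tissue.items():
        for replicate in replicates:
            writer.writerow([tissue, replicate])


# Hypothetical tissues and replicate names, for illustration only.
example = {
    "liver": ["liver_rep1", "liver_rep2"],
    "brain": ["brain_rep1", "brain_rep2"],
}

buffer = io.StringIO()
write_samples_described(example, buffer)
samples_text = buffer.getvalue()
print(samples_text, end="")
```

To produce an actual file, pass an open file handle (for example `open("samples_described.txt", "w", newline="")`) instead of the in-memory buffer.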

I think you will get the best results by merging your samples. If you have a lot of samples or really deep coverage, you can use the 'in silico read normalization' option to normalize your reads, which will cut processing time.

If you haven't done any QC on your samples, Trinity also supports Trimmomatic.
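To make the pooled approach concrete, the sketch below builds a single Trinity command line from paired-end reads of several tissues. The file names and the helper function are hypothetical. The flags (`--seqType`, `--left`, `--right`, `--max_memory`, `--CPU`, `--output`, `--trimmomatic`, `--normalize_reads`) are taken from Trinity 2.x usage as I remember it. Some releases normalize reads by default, so `--normalize_reads` may be unnecessary or unavailable in yours; check `Trinity --help` before running anything.

```python
import shlex


def build_trinity_command(reads_by_tissue, output_dir="trinity_out",
                          max_memory="50G", cpu=8,
                          normalize=False, trim=False):
    """Return an argument list for one pooled Trinity run.

    reads_by_tissue maps a tissue name to a list of (left, right)
    FASTQ path pairs. All pairs are pooled into one assembly.
    """
    left = [l for pairs in reads_by_tissue.values() for l, _ in pairs]
    right = [r for pairs in reads_by_tissue.values() for _, r in pairs]
    command = [
        "Trinity",
        "--seqType", "fq",
        "--left", ",".join(left),    # comma-separated list of left reads
        "--right", ",".join(right),  # same order as the left reads
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--output", output_dir,
    ]
    if trim:
        command.append("--trimmomatic")      # assumed flag, see lead-in
    if normalize:
        command.append("--normalize_reads")  # assumed flag, see lead-in
    return command


# Hypothetical input files, for illustration only.
reads = {
    "liver": [("liver_rep1_1.fq", "liver_rep1_2.fq")],
    "brain": [("brain_rep1_1.fq", "brain_rep1_2.fq")],
}

command = build_trinity_command(reads, normalize=True, trim=True)
print(shlex.join(command))
```

The function only assembles the argument list; it does not run Trinity. Once the flags are confirmed for your installation, the list can be handed to `subprocess.run(command, check=True)`.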