Question

Strategy for transcriptomic assembly. What do you think is the best way to do it?

0

8.0 years ago

Antonio R. Franco ★ 5.1k

I want to use Trinity to assemble the transcriptome of a plant that lacks a reference genome, so that I can run an RNA-Seq experiment afterwards.

I have control plants and plants infected with a fungus for 2, 7 and 15 days. For each of these 4 conditions, I have plenty of good-quality Illumina paired-end reads, kept in separate files.

To get the transcriptome, I have two options:

- Assemble the control and infected plants separately, obtaining a total of 4 transcriptomes.
- Concatenate all of the reads into a common file and get a single common transcriptome.

Both options involve subtle considerations, and I would like to learn from your experience.

Trinity Transcriptome RNA-Seq • 1.6k views

ADD COMMENT • link updated 8.0 years ago by Biostar 20 • written 8.0 years ago by Antonio R. Franco ★ 5.1k

1

Do you have the genome sequence of the fungus (or fungi, if there are several) available? If you do, you could bin the fungal reads away from the plant reads and then assemble the pool of plant reads.
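
A minimal sketch of the binning step, under stated assumptions: the reply names no tool, so HISAT2 is used here as one possible splice-aware aligner, and the index name `fungus_index`, the file names and the helper `bin_plant_reads_command` are hypothetical. The idea is to align each sample to the fungal genome and keep the read pairs that do not align concordantly as candidate plant reads. The command is only built and printed, not run.

```python
import shlex


def bin_plant_reads_command(sample, left, right, fungus_index="fungus_index", threads=8):
    """Return an argv list that aligns one sample to the fungal genome
    and writes the pairs that fail to align concordantly to new files."""
    return [
        "hisat2",
        "-p", str(threads),        # assumed thread count
        "-x", fungus_index,        # hypothetical, pre-built fungal genome index
        "-1", left,
        "-2", right,
        # '%' is replaced by 1 or 2 to name the per-mate output files
        "--un-conc-gz", f"plant_{sample}_%.fq.gz",
        "-S", "/dev/null",         # the fungal alignments themselves are discarded
    ]


cmd = bin_plant_reads_command("control", "control_1.fq.gz", "control_2.fq.gz")
print(shlex.join(cmd))
```

Pairs that fail to align concordantly also include pairs where only one mate matched the fungus, so a stricter filter may be needed depending on how clean the plant pool has to be.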

ADD REPLY • link 8.0 years ago by GenoMax 141k

1

What is your end goal? Do you want to study the differentially expressed genes between your 4 conditions? If so, Trinity recommends building a consensus assembly, using all samples as inputs.
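
As a rough sketch of the pooled approach, the Python snippet below builds a single Trinity command that takes the reads from all four conditions at once, since `--left` and `--right` accept comma-separated lists of files. The file names, the memory and CPU values and the helper `pooled_trinity_command` are hypothetical placeholders. The command is only printed, not executed.

```python
import shlex

# Hypothetical file names: one pair of FASTQ files per condition.
SAMPLES = {
    "control": ("control_1.fq.gz", "control_2.fq.gz"),
    "day2": ("day2_1.fq.gz", "day2_2.fq.gz"),
    "day7": ("day7_1.fq.gz", "day7_2.fq.gz"),
    "day15": ("day15_1.fq.gz", "day15_2.fq.gz"),
}


def pooled_trinity_command(samples, max_memory="50G", cpu=8, output="trinity_pooled"):
    """Return the argv list for one Trinity run over all samples."""
    left = ",".join(pair[0] for pair in samples.values())
    right = ",".join(pair[1] for pair in samples.values())
    return [
        "Trinity",
        "--seqType", "fq",
        "--left", left,
        "--right", right,
        "--max_memory", max_memory,  # assumed value, adjust to your machine
        "--CPU", str(cpu),           # assumed value
        "--output", output,          # Trinity expects "trinity" in this name
    ]


cmd = pooled_trinity_command(SAMPLES)
print(shlex.join(cmd))
```

Keeping the per-condition files separate on disk and only joining them on the command line means the same files can be reused later for per-sample quantification.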

ADD REPLY • link 8.0 years ago by st.ph.n ★ 2.7k

0

This fully answers my question

However, the more sequences you have isolated under different conditions, the harder it must be for any transcriptome assembler to assemble them, since you can mix different isoforms, complicate the assembly with more sequences, etc.

ADD REPLY • link 8.0 years ago by Antonio R. Franco ★ 5.1k

0

The DE pipeline for Trinity provides an RSEM Perl wrapper that aligns the raw reads from each sample to the assembly. From this you will get both isoform and gene count matrices, which tell you how many reads from each sample went into each transcript. You can use these matrices for downstream DE analysis in the pipeline.
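
The quantification step could look roughly like the sketch below, which builds one `align_and_estimate_abundance.pl` command per sample and then one `abundance_estimates_to_matrix.pl` command to combine the results. The assembly path, the gene-to-transcript map path, the read file names, the `rsem_<sample>` output directories and both helper functions are assumptions for illustration. The actual file locations depend on the Trinity version and on where its `util` scripts are installed. Nothing is executed here, the commands are only printed.

```python
import shlex

# Hypothetical paths: adjust to where your Trinity run wrote its output.
ASSEMBLY = "trinity_pooled/Trinity.fasta"
GENE_TRANS_MAP = "trinity_pooled/Trinity.fasta.gene_trans_map"

# Hypothetical read files, one pair per condition.
SAMPLES = {
    "control": ("control_1.fq.gz", "control_2.fq.gz"),
    "day2": ("day2_1.fq.gz", "day2_2.fq.gz"),
    "day7": ("day7_1.fq.gz", "day7_2.fq.gz"),
    "day15": ("day15_1.fq.gz", "day15_2.fq.gz"),
}


def abundance_command(sample, left, right, assembly=ASSEMBLY):
    """argv for aligning one sample to the assembly and running RSEM."""
    return [
        "align_and_estimate_abundance.pl",
        "--transcripts", assembly,
        "--seqType", "fq",
        "--left", left,
        "--right", right,
        "--est_method", "RSEM",
        "--aln_method", "bowtie2",
        "--trinity_mode",            # derive gene/isoform grouping from Trinity IDs
        "--prep_reference",          # build the reference index if it is missing
        "--output_dir", f"rsem_{sample}",
    ]


def matrix_command(samples, gene_trans_map=GENE_TRANS_MAP):
    """argv for merging the per-sample RSEM results into count matrices."""
    results = [f"rsem_{name}/RSEM.isoforms.results" for name in samples]
    return [
        "abundance_estimates_to_matrix.pl",
        "--est_method", "RSEM",
        "--gene_trans_map", gene_trans_map,
        "--name_sample_by_basedir",  # label matrix columns by output directory
        "--out_prefix", "RSEM",
        *results,
    ]


per_sample = [abundance_command(name, l, r) for name, (l, r) in SAMPLES.items()]
for cmd in per_sample + [matrix_command(SAMPLES)]:
    print(shlex.join(cmd))
```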

ADD REPLY • link 8.0 years ago by st.ph.n ★ 2.7k

0

Just a heads-up: if you are looking for paralogs, it becomes difficult if you mix the 4 samples. But for a more general analysis, pooling would give better assemblies.

ADD REPLY • link 8.0 years ago by Rohit ★ 1.5k