Question: Strategy for transcriptomic assembly. What do you think is the best way to do it?
Written 4.1 years ago by Antonio R. Franco (Universidad de Córdoba, Spain):

I want to assemble a transcriptome with Trinity for a plant that lacks a reference genome, so that I can run an RNA-Seq experiment afterwards.

I have control plants and plants infected with a fungus for 2, 7 and 15 days. For each of these 4 conditions, I have plenty of good-quality Illumina paired-end reads, kept in separate files.

To obtain the transcriptome, I have two options:
- Assemble the control and infected plants separately, obtaining a total of 4 transcriptomes.
- Concatenate all of the reads into a common file and obtain a single common transcriptome.

Both options involve subtle trade-offs, and I would like to learn from your experience.

transcriptome rna-seq trinity • 900 views
modified 4.1 years ago by Biostar • written 4.1 years ago by Antonio R. Franco

Do you have the genome sequence of the fungus (or fungi, if there are several) available? If so, you could bin the fungal reads away from the plant reads and then assemble the pool of plant reads.
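
As a rough illustration of the binning idea, here is a minimal Python sketch. It assumes the reads have already been aligned to the fungal genome with some aligner that writes SAM records, and it keeps only the pairs in which neither mate aligned. The function name and the example records are made up for illustration; only the SAM flag bits (0x4 for "read unmapped", 0x8 for "mate unmapped") come from the SAM format itself.

```python
# Hypothetical sketch: find read pairs that did NOT align to the fungal genome.
# Assumes a prior alignment of all reads against the fungal reference (SAM output).

READ_UNMAPPED = 0x4   # SAM flag bit: this read is unmapped
MATE_UNMAPPED = 0x8   # SAM flag bit: the mate of this read is unmapped
BOTH_UNMAPPED = READ_UNMAPPED | MATE_UNMAPPED


def plant_candidate_pairs(sam_lines):
    """Return the names of pairs in which neither mate aligned to the fungus."""
    keep = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if (flag & BOTH_UNMAPPED) == BOTH_UNMAPPED:
            keep.add(name)
    return keep


# Made-up SAM records, for illustration only.
example_sam = [
    "@HD\tVN:1.6",
    "pair1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",     # both mates unmapped
    "pair1\t141\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "pair2\t99\tfungal_chr1\t100\t42\t4M\t=\t200\t104\tACGT\tIIII",   # both mates aligned
    "pair2\t147\tfungal_chr1\t200\t42\t4M\t=\t100\t-104\tTTGA\tIIII",
    "pair3\t73\tfungal_chr1\t100\t42\t4M\t=\t100\t0\tACGT\tIIII",     # only one mate aligned
    "pair3\t133\tfungal_chr1\t100\t0\t*\t=\t100\t0\tTTGA\tIIII",
]

print(sorted(plant_candidate_pairs(example_sam)))
```

Pairs with only one mate aligned (like pair3 here) are treated as fungal and dropped in this sketch; whether that is the right call depends on how strict you want the binning to be.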

modified 4.1 years ago • written 4.1 years ago by genomax

What is your end goal? Do you want to study the differentially expressed genes between your 4 conditions? If so, Trinity recommends building a consensus assembly, using all samples as inputs.
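
To make "using all samples as inputs" concrete, here is a minimal Python sketch that builds a single Trinity command line from the four conditions. The file names, output directory, CPU count and memory limit are placeholders I made up; the options shown (`--seqType`, `--left`, `--right`, `--CPU`, `--max_memory`, `--output`) are the ones I know from Trinity's usage message, but please check `Trinity --help` for your installed version before relying on them.

```python
# Hypothetical sketch: one pooled Trinity run over all conditions.
# All file names and resource settings below are placeholders.

def pooled_trinity_command(samples, output_dir="trinity_out", cpu=8, max_memory="50G"):
    """Build a Trinity command that takes every sample's paired reads as input.

    `samples` maps a condition name to a (left_fastq, right_fastq) tuple.
    Left and right files are joined with commas, in the same sample order.
    """
    lefts = ",".join(left for left, _ in samples.values())
    rights = ",".join(right for _, right in samples.values())
    return [
        "Trinity",
        "--seqType", "fq",
        "--left", lefts,
        "--right", rights,
        "--CPU", str(cpu),
        "--max_memory", max_memory,
        "--output", output_dir,
    ]


samples = {
    "control": ("control_1.fq", "control_2.fq"),
    "day2": ("day2_1.fq", "day2_2.fq"),
    "day7": ("day7_1.fq", "day7_2.fq"),
    "day15": ("day15_1.fq", "day15_2.fq"),
}

cmd = pooled_trinity_command(samples)
print(" ".join(cmd))
# To actually launch it, one could pass `cmd` to subprocess.run(cmd, check=True).
```

Passing the files as comma-separated lists avoids physically concatenating the reads into one file, and the per-sample files stay available for the later quantification step.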

written 4.1 years ago

This fully answers my question

However, the more sequences you pool from different conditions, the harder it must be for any transcriptome assembler to assemble them, since you may mix different isoforms, complicate the assembly with more sequences, etc.

written 4.1 years ago by Antonio R. Franco

The DE pipeline for Trinity provides an RSEM Perl wrapper that aligns the raw reads from each sample to the assembly. From this you will get both isoform and gene count matrices, which tell you how many reads from each sample went into each transcript. You can use these matrices for further downstream DE analysis in the pipeline.
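
To show what those matrices look like, here is a small Python sketch that merges per-sample isoform counts into one matrix and then sums isoforms into genes. This is not the Trinity script itself, just an illustration of the data layout. The counts are invented, and the gene ID is derived by stripping the trailing `_i<number>` from the Trinity-style transcript name, which is an assumption about the naming; the gene-to-transcript map that Trinity writes alongside the assembly is the safer source for that mapping.

```python
import re

# Hypothetical sketch of isoform- and gene-level count matrices.
# The counts below are invented for illustration.

def build_isoform_matrix(per_sample_counts):
    """Rows are isoforms, columns are samples; missing isoforms count as 0."""
    samples = list(per_sample_counts)
    isoforms = sorted({iso for counts in per_sample_counts.values() for iso in counts})
    matrix = {
        iso: [per_sample_counts[s].get(iso, 0.0) for s in samples]
        for iso in isoforms
    }
    return samples, matrix


def collapse_to_genes(isoform_matrix):
    """Sum isoform rows into gene rows (assumes IDs end in _i<number>)."""
    genes = {}
    for isoform, row in isoform_matrix.items():
        gene = re.sub(r"_i\d+$", "", isoform)
        previous = genes.get(gene, [0.0] * len(row))
        genes[gene] = [a + b for a, b in zip(previous, row)]
    return genes


per_sample_counts = {
    "control": {"TRINITY_DN1_c0_g1_i1": 10.0, "TRINITY_DN1_c0_g1_i2": 5.0},
    "day2": {"TRINITY_DN1_c0_g1_i1": 20.0, "TRINITY_DN2_c0_g1_i1": 7.0},
}

samples, isoform_matrix = build_isoform_matrix(per_sample_counts)
gene_matrix = collapse_to_genes(isoform_matrix)

print("isoform", *samples, sep="\t")
for isoform, row in isoform_matrix.items():
    print(isoform, *row, sep="\t")
print("gene", *samples, sep="\t")
for gene, row in gene_matrix.items():
    print(gene, *row, sep="\t")
```

Because every sample is quantified against the same pooled assembly, the rows line up across conditions, which is what makes the matrices directly usable for DE analysis.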

written 4.1 years ago

Just a heads-up: if you are looking for paralogs, that becomes difficult when you mix the 4 samples. But for a more general analysis, pooling would give better assemblies.

written 4.1 years ago by Rohit