Question: How to select the best isoform for a differentially expressed gene in Trinity?
upendrakumar.devisetty (United States) wrote, 4 months ago:

Hi,

I followed the Trinity guidelines to build a de novo assembly, used that assembly as the reference to quantify reads with RSEM, and then ran a differential expression analysis with DESeq2. For the differential expression analysis I used RSEM.genes.results rather than RSEM.isoforms.results, since I was not sure whether isoform-level expression is as accurate as gene-level expression.

The problem now is how to select the best transcript/isoform for each differentially expressed gene. Without that I cannot extract the sequences from the Trinity assembly, because the assembly contains sequences for isoforms, not genes.

I have thought of several approaches: selecting the longest isoform, or clustering all the isoforms and then selecting the longest one. But I was wondering if I can get a consensus of all the isoforms for the gene of interest.

deseq2 trinity • 235 views
h.mon (Brazil) wrote, 4 months ago:

Using isoforms instead of genes is not incorrect per se, but it is noisier (particularly so for a de novo assembled transcriptome) and needs larger sample sizes and deeper sequencing per sample. So, as ATpoint already said, I would indeed recommend gene-level expression analysis.

There is no need to use tximport, as its method has been implemented in Trinity. The RSEM.genes.results should be identical to importing the counts with tximport.

Which isoform is the "best" one is open to debate, but I would argue the longest is not the best. I remember seeing the Trinity authors recommend selecting the most highly expressed isoform, but I can't find the link.
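To make the "most expressed isoform" idea concrete, here is a minimal sketch, not an official Trinity utility. It assumes a single-sample RSEM isoforms.results table with the usual `transcript_id`, `gene_id`, and `TPM` columns; the function name and the example rows are made up for illustration.

```python
import csv
import io


def top_isoform_per_gene(handle):
    """Return {gene_id: (transcript_id, tpm)} for the highest-TPM isoform of each gene."""
    best = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        gene, tx, tpm = row["gene_id"], row["transcript_id"], float(row["TPM"])
        # On ties, the isoform seen first is kept.
        if gene not in best or tpm > best[gene][1]:
            best[gene] = (tx, tpm)
    return best


# Made-up rows in the RSEM isoforms.results column layout (hypothetical values).
EXAMPLE = (
    "transcript_id\tgene_id\tlength\teffective_length\texpected_count\tTPM\tFPKM\tIsoPct\n"
    "TRINITY_DN10_c0_g1_i1\tTRINITY_DN10_c0_g1\t900\t750.0\t40.0\t12.5\t10.1\t25.0\n"
    "TRINITY_DN10_c0_g1_i2\tTRINITY_DN10_c0_g1\t1200\t1050.0\t120.0\t37.5\t30.3\t75.0\n"
    "TRINITY_DN22_c1_g1_i1\tTRINITY_DN22_c1_g1\t500\t350.0\t8.0\t5.0\t4.0\t100.0\n"
)

top = top_isoform_per_gene(io.StringIO(EXAMPLE))
```

With several samples, one would probably average or sum TPM per isoform across samples before picking the top one; that choice is left open here.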

"but I was wondering if I can get a consensus of all the isoforms for the gene of interest."

It seems you want a "super-transcripts" representation of the transcripts. Trinity has implemented this as well; see the SuperTranscripts wiki page.


Thank you so much. You saved my day!!! Super-transcripts are what I was looking for :)

Reply written 4 months ago by upendrakumar.devisetty
ATpoint (Germany) wrote, 4 months ago:

You should aggregate the transcript-level abundance estimates to the gene level, e.g. with the tximport package from Bioconductor. Gene-level differential analysis is much more robust than differential transcript analysis, and in fact DESeq2 is not intended for the latter. There is no "best" isoform. tximport summarizes the transcript-level estimates into gene-level counts, which are then analyzed with DESeq2.
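As a rough illustration of what "summarizing to the gene level" means, the sketch below simply sums estimated counts over the isoforms of each gene. It is not tximport itself (an R package that can also account for transcript length); the function name, IDs, and counts are hypothetical.

```python
from collections import defaultdict


def sum_counts_by_gene(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into one value per gene."""
    gene_counts = defaultdict(float)
    for tx, count in tx_counts.items():
        gene_counts[tx2gene[tx]] += count
    return dict(gene_counts)


# Hypothetical transcript-to-gene map and estimated counts for one sample.
tx2gene = {
    "TRINITY_DN10_c0_g1_i1": "TRINITY_DN10_c0_g1",
    "TRINITY_DN10_c0_g1_i2": "TRINITY_DN10_c0_g1",
    "TRINITY_DN22_c1_g1_i1": "TRINITY_DN22_c1_g1",
}
tx_counts = {
    "TRINITY_DN10_c0_g1_i1": 40.0,
    "TRINITY_DN10_c0_g1_i2": 120.0,
    "TRINITY_DN22_c1_g1_i1": 8.0,
}

gene_counts = sum_counts_by_gene(tx_counts, tx2gene)
```

For a real analysis, the tximport route (or the gene-level output that RSEM already writes) is the safer choice.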


Thanks for your comment. I did that already. But the question is: once I have the list of differentially expressed genes (not isoforms), how do I go back and extract the sequences for those genes from the Trinity assembly? The Trinity assembly consists of isoforms, not genes.

Reply written 4 months ago by upendrakumar.devisetty
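One possible way to go from a list of gene IDs back to sequences is sketched below. It assumes the usual Trinity naming, where a transcript ID is the gene ID plus an `_i<N>` suffix (e.g. `TRINITY_DN10_c0_g1_i2` belongs to gene `TRINITY_DN10_c0_g1`); the helper names and the toy FASTA are made up for illustration.

```python
import io
import re


def gene_of(transcript_id):
    """Strip the trailing _i<N> isoform suffix to get the Trinity gene ID."""
    return re.sub(r"_i\d+$", "", transcript_id)


def extract_gene_isoforms(fasta_handle, gene_ids):
    """Yield (header, sequence) for every isoform belonging to the given genes."""
    wanted = set(gene_ids)
    header, seq, keep = None, [], False
    for line in fasta_handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if keep:
                yield header, "".join(seq)
            header = line[1:]
            # The transcript ID is the first whitespace-delimited token.
            keep = gene_of(header.split()[0]) in wanted
            seq = []
        elif keep:
            seq.append(line.strip())
    if keep:
        yield header, "".join(seq)


# Toy FASTA with Trinity-style headers (hypothetical sequences).
FASTA = (
    ">TRINITY_DN10_c0_g1_i1 len=8\nACGTACGT\n"
    ">TRINITY_DN10_c0_g1_i2 len=12\nACGTAC\nGTACGT\n"
    ">TRINITY_DN22_c1_g1_i1 len=4\nTTTT\n"
)

hits = list(extract_gene_isoforms(io.StringIO(FASTA), ["TRINITY_DN10_c0_g1"]))
```

This returns all isoforms of each selected gene; combining it with a per-gene choice (most expressed isoform, or a SuperTranscript) would narrow that down to one sequence per gene.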