Question: RNA-seq analysis between three conditions
18 months ago by
pagetbailly.philippe0 wrote:

Hi everyone,

The problem I have isn't really a bioinformatics problem, but I came across a hard nut to crack when analyzing my RNA-seq results. I had 3 conditions in triplicate:
- 1 = control
- 2 = expressing 2 isoforms of a protein
- 3 = expressing 1 isoform of the protein

Sequencing generated 50 million 75 bp single-end reads per sample.

The DE analysis I performed using cuffdiff gave me the following results:
- 1 vs. 2 = 400 DEG
- 1 vs. 3 = 50 DEG
- 2 vs. 3 = 800 DEG

The same .BAM files analyzed using DESeq2 gave me more "marked" results:
- 1 vs. 2 = 700 DEG
- 1 vs. 3 = 10 DEG
- 2 vs. 3 = 2300 DEG

Here both analyses show a negligible number of DEGs between conditions 1 and 3. I know that the purpose of DE analysis is to highlight significantly deregulated genes, but having so many DEGs in 1 vs. 2 and 2 vs. 3 means there is something going on in my third condition, right? Even though there are not many DEGs between the control and the third condition? I don't know how to put these results into words.
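One way to start putting the three pairwise results into words is to ask which comparisons each gene is significant in. The sketch below is only an illustration in Python: the gene IDs are made up, and `classify_degs` is a hypothetical helper, not part of cuffdiff or DESeq2. It assumes you have exported the significant gene IDs of each comparison as a list or set.

```python
def classify_degs(deg_1v2, deg_1v3, deg_2v3):
    """Group DEGs by the pairwise comparisons in which they are significant."""
    a, b, c = set(deg_1v2), set(deg_1v3), set(deg_2v3)
    return {
        # Significant in 1 vs. 2 and 2 vs. 3 but not 1 vs. 3:
        # condition 2 stands apart while conditions 1 and 3 agree.
        "specific_to_condition_2": (a & c) - b,
        # Significant in 1 vs. 3 and 2 vs. 3 but not 1 vs. 2.
        "specific_to_condition_3": (b & c) - a,
        # Significant in 1 vs. 2 and 1 vs. 3 but not 2 vs. 3.
        "specific_to_control": (a & b) - c,
        # Significant in every comparison.
        "all_three": a & b & c,
    }


# Hypothetical gene IDs, for illustration only.
deg_1v2 = {"geneA", "geneB", "geneC", "geneD"}
deg_1v3 = {"geneD"}
deg_2v3 = {"geneA", "geneB", "geneC", "geneE"}

groups = classify_degs(deg_1v2, deg_1v3, deg_2v3)
for name, genes in groups.items():
    print(name, sorted(genes))
```

If most genes fall into the first group, the pattern says that condition 2 differs from both the control and condition 3, which would fit a small 1 vs. 3 list sitting next to large 1 vs. 2 and 2 vs. 3 lists.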

Has anyone encountered a similar situation?

Thanks for any help you can provide!

rna-seq cuffdiff de • 529 views
written 18 months ago by pagetbailly.philippe0

Hi,

Is the isoform in your third condition the same as one of the isoforms in your second condition?

And regardless of my first question, why are two isoforms clubbed in your second condition? 1 vs. 2 = 400 DEG might have been different if the isoforms were kept as two different conditions.

written 18 months ago by vinayjrao140

Hi, indeed the isoform in the third condition is the same as one of the isoforms in the second condition. Long story short, the two isoforms come from alternative splicing. Isoform2, from the third condition, was shown to have elusive effects in the literature, yet it accounts for 90% of the mRNA from the gene. In contrast, isoform1 and its effects are very well characterized. We were unable to generate cellular clones expressing only the first, unspliced isoform because inhibiting splicing increases expression of the unspliced isoform 10-fold, which is lethal for the cells, unfortunately...

As you suggest, the best experimental setup would be one condition for each isoform. But we had to choose this one, so we have:
- control
- 10% isoform1 / 90% isoform2 (closest to in vivo)
- isoform2 (our interest)

written 18 months ago by pagetbailly.philippe0

Hi,

I am not entirely convinced by the method employed, but I am an amateur in the field myself. Still, looking at the number of DEGs across your conditions, I would like to know what quality filter (Phred score) was applied when selecting the reads, and whether the number of reads obtained from all three sets is similar.

Edit: Another option would have been to add a 4th condition - isoform 1

modified 18 months ago • written 18 months ago by vinayjrao140

Hi, the average Phred score was 33 (40 million reads above 32). We obtained about 50 million reads for each replicate (46 million minimum, 57 million maximum).
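As a quick check on how far apart the sequencing depths are, the ratio of the largest to the smallest library can be computed. This is only a sketch: just the minimum and maximum (46 and 57 million) come from the numbers above, the other replicates are not listed here, and `depth_spread` is a hypothetical helper.

```python
def depth_spread(library_sizes_millions):
    """Return the ratio of the largest to the smallest library size."""
    smallest = min(library_sizes_millions)
    largest = max(library_sizes_millions)
    return largest / smallest


# Only the reported extremes are used; the full per-replicate list is not given.
ratio = depth_spread([46, 57])
print(f"largest / smallest library: {ratio:.2f}")
```

A spread of this size is the kind of difference that DESeq2's size-factor normalization is meant to absorb.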

This fourth condition would have been nice indeed.

modified 17 months ago • written 17 months ago by pagetbailly.philippe0

Hi,

The number of reads in the three sets is similar, with a good Phred score cut-off. I'm sorry, but I can't think of a conclusive reason for your results, although I would suggest repeating your analysis using another pipeline.

written 17 months ago by vinayjrao140

I'm starting to think that DEG analysis can't answer the question I'm asking.

Thank you very much for your time!

written 17 months ago by pagetbailly.philippe0

Maybe not, but now that you have already invested time in it, you should try the analysis with another pipeline. You would at least know if there was an error in the pipeline, or in the analysis.

written 17 months ago by vinayjrao140

Can you give me any recommendations? I'm fairly new to bioinformatics :)

written 17 months ago by pagetbailly.philippe0

Sure. You could try the HISAT2 protocol (the new Tuxedo protocol), and also other aligners and mergers.

https://www.nature.com/articles/nprot.2016.095

https://www.nature.com/articles/nprot.2013.099#procedure

These are two established pipelines, and you could try analyzing the example data to be sure you understand what's going on.

Good luck :)

written 17 months ago by vinayjrao140

I will try these. Thanks again!

written 17 months ago by pagetbailly.philippe0