Question: RNA-Seq Differential gene expression
9 weeks ago by umeshtanwar2 (10) wrote:

Hi, I am working on the plant Arabidopsis. I have samples for 4 plant types (WT, KO1, KO2 and OE1), each in three replicates, both mock-treated and after stress treatment. That makes 24 samples in total (4 plant types x 2 conditions x 3 replicates). I would like to analyse differential gene expression in each plant type before treatment (Mock) and after treatment. I am using STAR to align the raw reads to the reference genome and to output alignments translated into transcript coordinates in the Aligned.toTranscriptome.out.bam file. I am then using these files for transcript quantification with RSEM. But I am getting this error:

Cannot open /Documents/Umesh/Analysis123/star/GENOME/genome.grp! It may not exist.

So I seek help with: 1. Is this the right approach? 2. Can I use other programs for DGE analysis? 3. What is the proper pipeline for this experiment? I am confused by the many tools available. Please guide me, as I am new to this kind of analysis.

Thanks, Umesh


One straightforward approach would be to use STAR to generate a count table that you can use with edgeR. Look up the edgeR userguide - it has a rich description of how to quantify differential gene expression given various experimental designs.
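
As a rough illustration of the count-table step, here is a minimal Python sketch. It assumes STAR was run with `--quantMode GeneCounts`, which writes one `ReadsPerGene.out.tab` file per sample (gene ID, then unstranded, forward-strand and reverse-strand counts, preceded by four `N_` summary rows). The function names and the choice of the unstranded column are assumptions for illustration; the right column depends on your library prep.

```python
import csv

# Column index in ReadsPerGene.out.tab: 1 = unstranded, 2 = read 1 on the
# same strand as the gene, 3 = read 1 on the opposite strand.
# Assumption: an unstranded library, so column 1 is used by default.
COUNT_COLUMN = 1


def read_star_counts(lines, column=COUNT_COLUMN):
    """Parse one ReadsPerGene.out.tab file into {gene_id: count}."""
    counts = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("N_"):
            continue  # skip summary rows (N_unmapped, N_multimapping, ...)
        counts[row[0]] = int(row[column])
    return counts


def build_count_table(per_sample):
    """Combine {sample: {gene: count}} into (sample names, {gene: [counts]})."""
    samples = sorted(per_sample)
    genes = sorted(set().union(*(per_sample[s] for s in samples)))
    table = {g: [per_sample[s].get(g, 0) for s in samples] for g in genes}
    return samples, table


def write_count_table(path, samples, table):
    """Write a tab-separated genes x samples matrix with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["gene_id", *samples])
        for gene, values in table.items():
            writer.writerow([gene, *values])
```

You would call `read_star_counts(open(path))` once per sample, keyed by a sample name such as `WT_Mock_1` (a hypothetical naming scheme), and the resulting tab-separated file can then be read into R for edgeR.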

written 9 weeks ago by seidel (6.8k)

Thank you, seidel, for your kind suggestion. I will go through the edgeR user guide.

written 9 weeks ago by umeshtanwar2 (10)

But feeding the transcriptome BAM file to RSEM and using the rounded expected counts from RSEM works fine too. It might work a bit better, because RSEM is smarter about reads that align to multiple features.
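
A minimal Python sketch of the "rounded expected counts" idea, assuming the usual tab-separated `*.genes.results` output from RSEM with `gene_id` and `expected_count` columns. The function name is made up for illustration.

```python
import csv


def read_rsem_expected_counts(lines):
    """Return {gene_id: rounded expected_count} from one *.genes.results file."""
    counts = {}
    for row in csv.DictReader(lines, delimiter="\t"):
        # expected_count is fractional because multi-mapping reads are
        # shared between features. Python's round() rounds exact halves
        # to the nearest even integer.
        counts[row["gene_id"]] = int(round(float(row["expected_count"])))
    return counts
```

Calling this once per sample gives integer counts per gene, which is the kind of input count-based tools such as edgeR or DESeq2 expect.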

written 9 weeks ago by swbarnes2 (4.9k)

To be honest, I've used Salmon (or eXpress) more than RSEM, but I thought I should mention that my own experience with RSEM was different from what I expected after reading benchmark papers. Namely, I believe the issue was that either i) the results seemed a little strange when I performed the Bowtie alignment first (with separate RSEM quantification afterwards), or ii) the alignment run from within RSEM (with the default parameters, I believe) took a prohibitively long time (so I didn't wait to test quantification with multiple samples).

In general, I would recommend budgeting a substantial amount of time for each project (including testing at least a few different methods). Most commonly, I would perform a genome alignment with STAR or TopHat (I think visually inspecting the alignment can be a useful troubleshooting strategy).

In other words, I think starting with STAR is OK (and closer to what I would do for "initial" analysis), but there isn't really one "best" strategy that you can use without taking the time to critically assess your data.

modified 9 weeks ago • written 9 weeks ago by Charles Warden (6.4k)

My limited experience with doing STAR alignment inside of RSEM was that the baked-in STAR parameters were really stringent.

written 9 weeks ago by swbarnes2 (4.9k)

There are many choices for DGE analysis of RNA-seq data, now supported by many papers; one of the most popular methods is DESeq2. If you want a fast alternative and have a transcriptome ready, use Salmon + tximport + DESeq2. A more detailed tutorial here
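
To give a feel for the transcript-to-gene step in that pipeline, here is a small Python sketch that sums Salmon's `NumReads` column from a `quant.sf` file over the transcripts of each gene. This only mimics the simplest count-summing part of what tximport does (tximport also handles transcript-length information, which is left out here), and the function name and the `tx2gene` dictionary are assumptions for illustration.

```python
import csv
from collections import defaultdict


def gene_level_counts(quant_lines, tx2gene):
    """Sum Salmon NumReads over the transcripts belonging to each gene."""
    totals = defaultdict(float)
    for row in csv.DictReader(quant_lines, delimiter="\t"):
        gene = tx2gene.get(row["Name"])
        if gene is None:
            continue  # transcript missing from the mapping: skipped here
        totals[gene] += float(row["NumReads"])
    return dict(totals)
```

For Arabidopsis annotations where transcript IDs look like `AT1G01010.1`, the mapping could be built by stripping the suffix after the dot, but check that against the annotation you actually used. For a real analysis, tximport in R remains the recommended route into DESeq2.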

written 9 weeks ago by hiraksarkar.cs (0)