Question

RNA-Seq Differential gene expression

0

5.3 years ago

umeshtanwar2 ▴ 30

Hi, I am working with Arabidopsis. I have samples from 4 plant lines (WT, KO1, KO2 and OE1), each mock-treated and stress-treated, with three replicates per condition. That gives 24 samples in total (4 lines × 2 treatments × 3 replicates). I would like to analyse differential gene expression in each plant line, comparing mock with stress treatment. I am using STAR to align the raw reads to the reference genome and to output alignments translated into transcript coordinates in the Aligned.toTranscriptome.out.bam file. I then use these files for transcript quantification with RSEM, but I am getting this error:

Cannot open /Documents/Umesh/Analysis123/star/GENOME/genome.grp! It may not exist.
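For context on this message: RSEM opens `<reference_name>.grp`, which is one of the files written by `rsem-prepare-reference`, so the error commonly means the prefix passed to `rsem-calculate-expression` does not point at a prepared RSEM reference. A STAR index built on its own does not contain that file. The sketch below is only a quick check for the `.grp` file; the function name is hypothetical, and the prefix in the usage note is copied from the error message above.

```python
from pathlib import Path


def rsem_reference_ready(reference_prefix: str) -> bool:
    """Return True if `<reference_prefix>.grp` exists.

    Assumption: the prefix is the same reference_name that was (or should
    have been) given to rsem-prepare-reference. Only the .grp file named in
    the error message is checked, not the other reference files.
    """
    return Path(reference_prefix + ".grp").is_file()
```

Calling `rsem_reference_ready("/Documents/Umesh/Analysis123/star/GENOME/genome")` and getting `False` would suggest that the RSEM reference still has to be built with that prefix, or that the prefix is misspelled.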

So I am seeking help with: 1. Is this the right approach? 2. Can I use other programs for DGE analysis? 3. What is the proper pipeline for this experiment? I am confused by the number of tools available. Please guide me, as I am new to this kind of analysis.

Thanks, Umesh

RNA-Seq • 1.5k views

ADD COMMENT • link 5.3 years ago by umeshtanwar2 ▴ 30

0

One straightforward approach would be to use STAR to generate a count table that you can use with edgeR. Look up the edgeR userguide - it has a rich description of how to quantify differential gene expression given various experimental designs.
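As a rough illustration of the count-table step, assuming STAR was run with `--quantMode GeneCounts`: each sample then gets a `ReadsPerGene.out.tab` file with four tab-separated columns (gene ID, unstranded counts, and two stranded counts), preceded by four summary rows whose names start with `N_`. The sketch below merges such files into one matrix. The function names and sample labels are hypothetical, and which column to use depends on the library's strandedness.

```python
import csv
import io


def read_star_gene_counts(handle, column=1):
    """Parse one STAR ReadsPerGene.out.tab file into {gene_id: count}.

    column=1 is the unstranded count; 2 and 3 are the two stranded counts.
    """
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the summary rows (N_unmapped, N_multimapping, ...).
        if not row or row[0].startswith("N_"):
            continue
        counts[row[0]] = int(row[column])
    return counts


def build_count_matrix(per_sample):
    """Combine {sample: {gene: count}} into (genes, samples, rows).

    rows[i][j] is the count for genes[i] in samples[j]; missing genes get 0.
    """
    samples = list(per_sample)
    genes = sorted(set().union(*per_sample.values()))
    rows = [[per_sample[s].get(g, 0) for s in samples] for g in genes]
    return genes, samples, rows


# Made-up two-gene file contents standing in for real STAR output.
example = io.StringIO(
    "N_unmapped\t5\t5\t5\n"
    "N_multimapping\t2\t2\t2\n"
    "N_noFeature\t1\t9\t1\n"
    "N_ambiguous\t0\t0\t0\n"
    "AT1G01010\t12\t0\t12\n"
    "AT1G01020\t7\t1\t6\n"
)
wt_mock_1 = read_star_gene_counts(example)
genes, samples, rows = build_count_matrix({"WT_mock_1": wt_mock_1})
```

With all 24 samples loaded this way, the resulting matrix (genes as rows, samples as columns) is the kind of raw count table that edgeR takes as input.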

ADD REPLY • link 5.3 years ago by seidel 11k

0

Thank you, seidel, for your kind suggestion. I will go through the edgeR userguide.

ADD REPLY • link 5.3 years ago by umeshtanwar2 ▴ 30

0

But feeding the transcriptome file to RSEM and using the rounded expected counts from RSEM works fine too. It might work a bit better, because RSEM is smarter about reads which align to multiple features.
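A minimal sketch of the rounding step, assuming the standard RSEM `*.genes.results` layout, a tab-separated table whose header includes `gene_id` and `expected_count`. The function name is hypothetical and the values in the example are made up.

```python
import csv
import io


def rounded_expected_counts(handle):
    """Read an RSEM genes.results table and return {gene_id: rounded count}.

    expected_count is fractional because multi-mapping reads are split
    probabilistically between features; count-based DGE tools expect
    integers, hence the rounding.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["gene_id"]: int(round(float(row["expected_count"])))
        for row in reader
    }


# Made-up values in the genes.results column layout.
example = io.StringIO(
    "gene_id\ttranscript_id(s)\tlength\teffective_length\texpected_count\tTPM\tFPKM\n"
    "AT1G01010\tAT1G01010.1\t1688.00\t1500.00\t10.40\t3.10\t2.90\n"
    "AT1G01020\tAT1G01020.1,AT1G01020.2\t1200.00\t1010.00\t3.70\t1.20\t1.10\n"
)
counts = rounded_expected_counts(example)
```

Running this per sample and joining the dictionaries on gene ID gives an integer count matrix for the DGE step.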

ADD REPLY • link 5.3 years ago by swbarnes2 14k

0

To be honest, I've used Salmon (or eXpress) more than RSEM, but I thought I should mention that my own experience with RSEM was different from what I expected after reading benchmark papers. Namely, I believe the issue with RSEM was that either i) the results seemed a little strange when I performed the Bowtie alignment first (with separate RSEM quantification afterwards), or ii) the alignment from within RSEM (with the default parameters, I believe) took a prohibitively long time (so I didn't wait to test quantification with multiple samples).

In general, I would recommend budgeting a substantial amount of time for each project (including testing at least a few different methods). Most commonly, I would perform a genome alignment with STAR or TopHat (I think visually inspecting the alignment can be a useful troubleshooting strategy).

In other words, I think starting with STAR is OK (and closer to what I would do for "initial" analysis), but there isn't really one "best" strategy that you can use without taking the time to critically assess your data.

ADD REPLY • link 5.3 years ago by Charles Warden 8.2k

1

My limited experience with doing STAR alignment inside of RSEM was that the baked-in STAR parameters were really stringent.

ADD REPLY • link 5.3 years ago by swbarnes2 14k

0

There are many choices for DGE analysis of RNA-seq data, now supported by many papers; one of the most popular methods is DESeq2. If you want a fast alternative and have a transcriptome ready, use Salmon + tximport + DESeq2. A more detailed tutorial is here: http://www.sthda.com/english/wiki/rna-seq-differential-expression-work-flow-using-deseq2
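To give a feel for what the transcript-to-gene step in such a workflow does, here is a simplified sketch that sums Salmon's per-transcript `NumReads` (a column of `quant.sf`) to gene level. It is not tximport itself, which is an R package and additionally handles things like transcript-length offsets. The function names are hypothetical, and deriving the gene ID by stripping the `.1`-style suffix assumes TAIR-style transcript IDs; a real analysis would use an explicit transcript-to-gene table.

```python
import csv
import io
from collections import defaultdict


def gene_from_transcript(transcript_id):
    """Assumption: TAIR-style IDs, e.g. 'AT1G01020.2' belongs to 'AT1G01020'."""
    return transcript_id.rsplit(".", 1)[0]


def gene_level_reads(handle):
    """Sum the NumReads column of a Salmon quant.sf file per gene."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle, delimiter="\t"):
        totals[gene_from_transcript(row["Name"])] += float(row["NumReads"])
    return dict(totals)


# Made-up values in the quant.sf column layout.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "AT1G01010.1\t1688\t1500.0\t3.1\t20.0\n"
    "AT1G01020.1\t1200\t1010.0\t1.2\t10.5\n"
    "AT1G01020.2\t1100\t910.0\t0.6\t4.5\n"
)
totals = gene_level_reads(example)
```

The gene-level totals are still fractional, which is one reason the usual route into DESeq2 goes through tximport rather than a hand-rolled sum like this one.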

ADD REPLY • link 5.3 years ago by hiraksarkar.cs • 0