Question

RNA-Seq for DE analysis

3

8.0 years ago

debitboro ▴ 260

Hi all,

I have paired-end RNA-Seq data obtained from Illumina sequencing of 40 tumor tissues and their corresponding normal tissues (so I have 2x2x40 = 160 fastq.gz files). I want to perform a DE analysis to detect differences in expression between the normal and tumor tissues. Could you suggest a suitable pipeline for this situation?
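For a layout like this (40 patients x 2 tissues x 2 mates), it can help to build a sample sheet first, so that every library has both mates and every patient has both tissues. The sketch below is only an illustration: the file naming scheme `<patient>_<tissue>_<R1|R2>.fastq.gz` and the function names are assumptions, not something given in the question.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme (an assumption): <patient>_<tissue>_<R1|R2>.fastq.gz
PATTERN = re.compile(
    r"^(?P<patient>[^_]+)_(?P<tissue>tumor|normal)_(?P<mate>R[12])\.fastq\.gz$"
)

def build_sample_sheet(filenames):
    """Group FASTQ files into one row per (patient, tissue) library."""
    libs = defaultdict(dict)
    for name in filenames:
        m = PATTERN.match(name)
        if m is None:
            raise ValueError(f"unexpected file name: {name}")
        key = (m["patient"], m["tissue"])
        if m["mate"] in libs[key]:
            raise ValueError(f"duplicate mate file: {name}")
        libs[key][m["mate"]] = name
    rows = []
    for (patient, tissue), mates in sorted(libs.items()):
        # A paired-end library needs both mates.
        if set(mates) != {"R1", "R2"}:
            raise ValueError(f"incomplete pair for {patient} {tissue}")
        rows.append({"patient": patient, "tissue": tissue,
                     "r1": mates["R1"], "r2": mates["R2"]})
    return rows

def is_paired_design(rows):
    """True if every patient has exactly one tumor and one normal library."""
    tissues = defaultdict(list)
    for row in rows:
        tissues[row["patient"]].append(row["tissue"])
    return all(sorted(t) == ["normal", "tumor"] for t in tissues.values())

# Simulated file list for 40 patients (names are made up for the example).
files = [f"P{i:02d}_{tissue}_{mate}.fastq.gz"
         for i in range(1, 41)
         for tissue in ("tumor", "normal")
         for mate in ("R1", "R2")]
sheet = build_sample_sheet(files)
print(len(files), len(sheet), is_paired_design(sheet))
```

Keeping the patient identifier as its own column matters later: with matched tumor/normal pairs, the DE model would normally include the patient as a blocking factor next to the tissue type.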

Thanks to all

RNA-Seq Differential Expression Analysis • 4.7k views

ADD COMMENT • link updated 8.0 years ago by CandiceChuDVM ★ 2.4k • written 8.0 years ago by debitboro ▴ 260

0

Maybe this paper will help you understand the whole picture.

http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004393

ADD REPLY • link 8.0 years ago by GouthamAtla 12k

score 1 · Answer 1 · 2016-04-09

1

8.0 years ago

fernardo ▴ 170

I expect you already know how to get from alignment to gene counts.

Then use the following R packages to do DE analysis:

1- DESeq2

or

2- edgeR

This pipeline/paper would definitely be a great help. Check it out.

Hope it helps.

ADD COMMENT • link 8.0 years ago by fernardo ▴ 170

score 1 · Answer 2 · 2016-04-09

1

8.0 years ago

Pam ▴ 30

Hi debitboro,

I assume that you just have the data and need help analyzing them right from the beginning?

OK, first you need to quantify them. I personally use Sailfish, which is very fast.

The count data in the Sailfish output file can be used in some DE pipelines/packages (like DESeq2), as fernardo suggested.

ADD COMMENT • link 8.0 years ago by Pam ▴ 30

2

I would propose Salmon here, which is a more up-to-date tool made by the same lab; it gives you estimated raw counts for all your samples. Within a few hours you will have both the expression values (TPM) and the raw counts for each sample. Make one matrix for TPM and one for raw counts, round the raw counts to the nearest integer in R, and then use your tool of choice for DE analysis, be it edgeR or DESeq2. The choice is entirely up to the user.
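The matrix-building step described above can be sketched roughly as follows. This is a minimal illustration in Python rather than R, assuming Salmon's `quant.sf` layout (tab-separated columns `Name`, `Length`, `EffectiveLength`, `TPM`, `NumReads`); the function names and the tiny inline inputs are made up for the example.

```python
import csv
import io

def read_quant(text):
    """Parse quant.sf-style text into {transcript: (tpm, num_reads)}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Name"]: (float(row["TPM"]), float(row["NumReads"]))
            for row in reader}

def build_matrices(quants):
    """quants maps sample name -> parsed quant dict.

    Returns (transcripts, samples, tpm_matrix, count_matrix) with one row
    per transcript and one column per sample. Counts are rounded to the
    nearest integer; transcripts missing from a sample are set to 0.
    """
    samples = sorted(quants)
    transcripts = sorted(set().union(*(q.keys() for q in quants.values())))
    tpm, counts = [], []
    for tx in transcripts:
        values = [quants[s].get(tx, (0.0, 0.0)) for s in samples]
        tpm.append([v[0] for v in values])
        # Note: Python's round() sends exact .5 ties to the even integer.
        counts.append([int(round(v[1])) for v in values])
    return transcripts, samples, tpm, counts

# Toy inputs standing in for two samples' quant.sf files.
HEADER = "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
sample_a = HEADER + "tx1\t1000\t800.0\t12.5\t103.7\ntx2\t500\t300.0\t3.0\t7.2\n"
sample_b = HEADER + "tx1\t1000\t800.0\t11.0\t98.4\ntx2\t500\t300.0\t0.0\t0.0\n"

quants = {"A": read_quant(sample_a), "B": read_quant(sample_b)}
transcripts, samples, tpm, counts = build_matrices(quants)
for tx, row in zip(transcripts, counts):
    print(tx, row)
```

The resulting matrices are at the transcript level. For a gene-level analysis the estimates still have to be summarized per gene before testing.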

ADD REPLY • link 8.0 years ago by ivivek_ngs ★ 5.2k

2

Thanks for the mention, vchris_ngs! I should note here that, though Salmon includes features (and models certain types of bias like non-uniform read start distributions) that are not available in Sailfish, I still actively maintain Sailfish and backport the most relevant improvements from Salmon. This means that both Sailfish and Salmon should give highly accurate estimates very quickly. I intend to support and update both pieces of software as long as there is a user-base interested in me doing so, though I generally expect fancy new features to hit Salmon before Sailfish ;P.

ADD REPLY • link 8.0 years ago by Rob 6.5k

1

I am always interested in lightning-fast methods that help me get the DE analysis done so that I can focus on the downstream analysis of the DE genes, and your methods serve that purpose by giving me both expression values and raw read counts. If the OP needs a helper script for creating a matrix file from all samples, they can write to me here and I can provide one.

ADD REPLY • link 8.0 years ago by ivivek_ngs ★ 5.2k

0

Actually, you should not use Sailfish output directly for DESeq2 (see discussion here: https://support.bioconductor.org/p/63103/ ). You need to do some additional processing with something like tximport ( http://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html ).
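To give a rough idea of what that additional processing involves, the sketch below sums transcript-level estimates into gene-level totals using a transcript-to-gene map. It is a deliberately simplified illustration and not the tximport implementation: tximport also carries average transcript lengths along so they can be used as offsets, which is omitted here. The function name, the `tx2gene` dictionary and the values are hypothetical.

```python
from collections import defaultdict

def summarize_to_gene(tx_values, tx2gene):
    """Sum transcript-level values (estimated counts or TPM) per gene.

    tx_values: {transcript_id: value} for one sample.
    tx2gene:   {transcript_id: gene_id} (hypothetical mapping).
    Returns (gene_totals, unmapped), where unmapped lists transcripts
    that have no entry in tx2gene and were therefore skipped.
    """
    gene_totals = defaultdict(float)
    unmapped = []
    for tx, value in tx_values.items():
        gene = tx2gene.get(tx)
        if gene is None:
            unmapped.append(tx)
            continue
        gene_totals[gene] += value
    return dict(gene_totals), unmapped

# Toy example: two isoforms of geneA, one of geneB, one unannotated.
tx_counts = {"tx1": 103.7, "tx2": 7.2, "tx3": 50.0, "txX": 1.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts, unmapped = summarize_to_gene(tx_counts, tx2gene)
print(sorted(gene_counts), unmapped)
```

Summing before rounding, rather than rounding each transcript and then summing, avoids accumulating rounding error across isoforms of the same gene.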

ADD REPLY • link 8.0 years ago by igor 13k

score 1 · Answer 3 · 2016-04-11

1

8.0 years ago

CandiceChuDVM ★ 2.4k

A conventional start would be to try the Tuxedo suite, following the instructions in the paper "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks". It will guide you from mapping the reads through to visualizing the data.
The Tuxedo suite workflow is shown below: [image not preserved]

However, please be aware of the updated Tuxedo suite tools (e.g. Bowtie2, HISAT2, StringTie, Ballgown).

If you don't mind, I have collected some online courses and papers in my regularly updated post:
Up-to-date RNA Sequence Analysis Training/Courses/Papers?

ADD COMMENT • link 8.0 years ago by CandiceChuDVM ★ 2.4k

2

As an author of Cufflinks, I strongly recommend that you switch to kallisto ( http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3519.html ) and sleuth for differential analysis. You can start here: http://pachterlab.github.io/sleuth/ and there is an introduction with an example here: https://rawgit.com/pachterlab/sleuth/master/inst/doc/intro.html

ADD REPLY • link 8.0 years ago by Lior Pachter ▴ 700

0

Isn't kallisto/sleuth doing everything at the transcript level? Most biologists expect results at the gene level. That's the big caveat.

ADD REPLY • link 8.0 years ago by igor 13k

0

Using sleuth, it is straightforward to examine quantification at the gene level. In an imminent release, it will also be possible to perform differential analysis directly at the gene level.

ADD REPLY • link 8.0 years ago by Lior Pachter ▴ 700

0

Great! Do you have an estimate of when that will be available?

ADD REPLY • link 8.0 years ago by igor 13k

score 0 · Answer 4 · 2016-04-10

0

8.0 years ago

BioRyder ▴ 220

Hello,

The Biostars post below will help you learn about RNA-seq and DE: A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?

ADD COMMENT • link 8.0 years ago by BioRyder ▴ 220

score 0 · Answer 5 · 2016-04-10

0

8.0 years ago

phil.chapman ▴ 90

I would recommend reading the F1000Research article below by Mike Love (author of DESeq2) and Simon Anders (DESeq). It is a detailed workflow for analysing RNA-seq data, written by two of the leaders in the field:

http://f1000research.com/articles/4-1070/v1

ADD COMMENT • link 8.0 years ago by phil.chapman ▴ 90