Question: RNA-Seq for DE analysis
2
3.4 years ago by
debitboro 110
Belgium
debitboro 110 wrote:

Hi all,

I have paired-end (PE) RNA-Seq data obtained from Illumina sequencing of 40 tumor tissues and their corresponding normal tissues (so I have 2x2x40 = 160 fastq.gz files). I want to perform a DE analysis to detect differences in expression between the normal and tumor tissues, so I am asking for your help in suggesting a suitable pipeline for this situation.

Thanks to all

modified 3.4 years ago by CandiceChuDVM 1.9k • written 3.4 years ago by debitboro 110
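
As a starting point for any of the pipelines discussed in this thread, the 160 files in the question can be organized into one row per library (40 patients x 2 tissues), keeping the patient ID so that a paired tumor/normal design remains possible later. This is only a minimal sketch: the file naming scheme `<patient>_<tissue>_R<1|2>.fastq.gz` and the helper names are assumptions made for illustration, not something stated in the question.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme (an assumption): <patient>_<tissue>_R<1|2>.fastq.gz
FASTQ_PATTERN = re.compile(
    r"^(?P<patient>[^_]+)_(?P<tissue>tumor|normal)_R(?P<mate>[12])\.fastq\.gz$"
)


def build_sample_sheet(filenames):
    """Group paired-end FASTQ files into one row per (patient, tissue) library."""
    mates = defaultdict(dict)
    for name in filenames:
        match = FASTQ_PATTERN.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        key = (match["patient"], match["tissue"])
        mates[key][match["mate"]] = name

    rows = []
    for (patient, tissue), pair in sorted(mates.items()):
        # Every library must have both mates of the pair.
        if set(pair) != {"1", "2"}:
            raise ValueError(f"incomplete read pair for {patient} {tissue}")
        rows.append(
            {
                "sample": f"{patient}_{tissue}",
                "patient": patient,      # pairing factor for the design
                "condition": tissue,     # tumor vs. normal
                "fastq_1": pair["1"],
                "fastq_2": pair["2"],
            }
        )
    return rows


def example_filenames(n_patients=40):
    """Generate made-up file names that follow the assumed naming scheme."""
    return [
        f"P{p:02d}_{tissue}_R{mate}.fastq.gz"
        for p in range(1, n_patients + 1)
        for tissue in ("normal", "tumor")
        for mate in (1, 2)
    ]


files = example_filenames()
sheet = build_sample_sheet(files)
print(len(files), "files ->", len(sheet), "libraries")
```

The `patient` column is what lets a downstream tool treat each tumor and its matched normal as a pair rather than as independent samples.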

Maybe this paper will help you understand the whole picture.

http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004393

written 3.4 years ago by geek_y 9.8k
1
3.4 years ago by
fernardo 130
Italy
fernardo 130 wrote:

I assume you already know how to get from alignment to gene counts.

Then use the following R packages to do DE analysis:

1- DESeq2

or

2- edgeR

This pipeline/paper would definitely be a great help. Check it out.

Hope it helps.

written 3.4 years ago by fernardo 130
1
3.4 years ago by
Pam 30
Pam 30 wrote:

Hi debitboro,

I assume that you just have the data and need help analyzing them right from the beginning?

OK, first you need to quantify them. I personally use Sailfish, which is very fast.

The count data from the Sailfish output file can be used in DE pipelines/packages (like DESeq2), as fernardo suggested.

written 3.4 years ago by Pam 30
2

I would propose Salmon here, which is a more recently updated tool made by the same lab and which gives you estimated raw counts for all your samples. Within a few hours you will have both the expression values (TPM) and the raw counts for each sample. Make one matrix for TPM and one for raw counts, round the raw counts to the nearest integer in R, and then use your preferred tool for DE analysis, be it edgeR or DESeq2. The choice depends entirely on the user.

written 3.4 years ago by ivivek_ngs 4.8k
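
The matrix-building step described in the reply above can be sketched as follows. The reply suggests doing the rounding in R; this sketch uses Python with only the standard library, and assumes each sample has a tab-separated Salmon `quant.sf` table with `Name`, `TPM` and `NumReads` columns. The function names and the toy values are made up for illustration. Note that a later reply in this thread advises against feeding such counts directly to DESeq2 and recommends tximport instead.

```python
import csv
import io


def read_quant(handle):
    """Parse one quant.sf-style table into {transcript: (TPM, NumReads)}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["Name"]: (float(row["TPM"]), float(row["NumReads"]))
        for row in reader
    }


def build_matrices(quant_by_sample):
    """Combine per-sample tables into a TPM matrix and a rounded count matrix.

    quant_by_sample maps a sample name to an open text handle.
    Returns (samples, tpm, counts); tpm and counts map each transcript
    to a list of values ordered like `samples`.
    """
    samples = sorted(quant_by_sample)
    parsed = {s: read_quant(quant_by_sample[s]) for s in samples}

    transcripts = sorted(parsed[samples[0]])
    for s in samples:
        # All samples must be quantified against the same transcript set.
        if sorted(parsed[s]) != transcripts:
            raise ValueError(f"transcript set differs in sample {s}")

    tpm, counts = {}, {}
    for tx in transcripts:
        tpm[tx] = [parsed[s][tx][0] for s in samples]
        # Estimated counts are fractional; round to the nearest integer.
        counts[tx] = [round(parsed[s][tx][1]) for s in samples]
    return samples, tpm, counts


# Toy in-memory tables standing in for two real quant.sf files.
HEADER = "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
normal = HEADER + "tx1\t1000\t800.0\t10.5\t120.7\n" + "tx2\t500\t300.0\t3.2\t14.2\n"
tumor = HEADER + "tx1\t1000\t800.0\t8.0\t99.4\n" + "tx2\t500\t300.0\t0.0\t0.0\n"

samples, tpm, counts = build_matrices(
    {"P01_normal": io.StringIO(normal), "P01_tumor": io.StringIO(tumor)}
)
print(samples)
print(counts)
```

With real data, each `io.StringIO(...)` would be replaced by an opened `quant.sf` file for the corresponding sample.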
2

Thanks for the mention, vchris_ngs! I should note here that, though Salmon includes features (and models certain types of bias like non-uniform read start distributions) that are not available in Sailfish, I still actively maintain Sailfish and backport the most relevant improvements from Salmon. This means that both Sailfish and Salmon should give highly accurate estimates very quickly. I intend to support and update both pieces of software as long as there is a user-base interested in me doing so, though I generally expect fancy new features to hit Salmon before Sailfish ;P.

written 3.4 years ago by Rob 3.4k
1

I am always interested in lightning-fast methods that let me do my DE analysis and then focus largely on the downstream analysis of the DE genes, and your method serves that purpose by giving me both expression values and raw read counts. If the OP needs a helper script for creating a matrix file from all samples, they can write to me here and I can provide one.

written 3.4 years ago by ivivek_ngs 4.8k

Actually, you should not use Sailfish output directly for DESeq2 (see discussion here: https://support.bioconductor.org/p/63103/ ). You need to do some additional processing with something like tximport ( http://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html ).

modified 3.4 years ago • written 3.4 years ago by igor 8.1k
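
To make the "additional processing" mentioned above more concrete, here is a rough sketch of one part of it: collapsing transcript-level estimates to gene level with a transcript-to-gene map. This is not tximport itself, which is an R/Bioconductor package and also accounts for transcript length when preparing input for count-based tools; the sketch below only sums estimated counts per gene. The function name, the mapping and the values are invented for illustration.

```python
from collections import defaultdict


def summarize_to_genes(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level totals.

    tx_counts: {transcript_id: estimated count} for one sample
    tx2gene:   {transcript_id: gene_id}
    """
    gene_counts = defaultdict(float)
    for tx, count in tx_counts.items():
        if tx not in tx2gene:
            # Fail loudly rather than silently dropping a transcript.
            raise KeyError(f"no gene annotation for transcript {tx}")
        gene_counts[tx2gene[tx]] += count
    return dict(gene_counts)


# Toy example: two isoforms of geneA and a single-isoform geneB.
tx_counts = {"tx1": 120.7, "tx2": 14.2, "tx3": 5.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = summarize_to_genes(tx_counts, tx2gene)
print(gene_counts)
```

For an actual analysis with DESeq2 or edgeR, the tximport vignette linked in the reply above is the place to look; this sketch is only meant to show what "gene-level summarization" means.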
1
3.4 years ago by
CandiceChuDVM 1.9k
United States/College Station/Texas A&M University
CandiceChuDVM 1.9k wrote:

A conventional start would be to try the Tuxedo suite, following the instructions in the paper "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks". It will guide you from the initial mapping through to visualizing the data.
The Tuxedo suite is shown below: [image not preserved]

However, please be aware of the updated Tuxedo suite tools (e.g. Bowtie2, Hisat2, StringTie, Ballgown).

If you don't mind, I have collected some online courses and papers in my regularly updated post:
Up-to-date RNA Sequence Analysis Training/Courses/Papers?

written 3.4 years ago by CandiceChuDVM 1.9k
2

As an author of Cufflinks I strongly recommend that you switch to kallisto http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3519.html and sleuth for differential analysis. You can start here http://pachterlab.github.io/sleuth/ with an introduction and example here https://rawgit.com/pachterlab/sleuth/master/inst/doc/intro.html

written 3.4 years ago by Lior Pachter 330

Isn't kallisto/sleuth doing everything on the transcript level? Most biologists expect the results on gene level. That's the big caveat.

written 3.4 years ago by igor 8.1k

Using sleuth, it is straightforward to examine quantification at the gene level. An imminent release will also add the ability to perform differential analysis directly at the gene level.

written 3.4 years ago by Lior Pachter 330

Great! Do you have an estimate of when that will be available?

written 3.4 years ago by igor 8.1k
0
3.4 years ago by
BioRyder 160
India
BioRyder 160 wrote:

Hello,

The Biostars post below will help you learn about RNA-seq and DE: "A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?"

written 3.4 years ago by BioRyder 160
0
3.4 years ago by
phil.chapman 70
United Kingdom
phil.chapman 70 wrote:

I would recommend reading the F1000Research article below by Mike Love (author of DESeq2) and Simon Anders (DESeq). It is a detailed workflow for analysing RNA-seq data, written by two of the leaders in the field:

http://f1000research.com/articles/4-1070/v1

written 3.4 years ago by phil.chapman 70