Question

rsem matrix to DESeq2

1

5.9 years ago

Folder40g ▴ 190

I know that I can pass a gene-level RSEM matrix to limma and work with it.

But can I use it in DESeq2?

I have a gene-level RSEM matrix from FireBrowse (~20,000 rows). DESeq2 requires raw counts, and since I only have the genes-by-samples matrix rather than the per-sample RSEM output files, I can't use tximport().

Thanks

rsem • 9.7k views

ADD COMMENT • link updated 3.0 years ago by Kevin Blighe 87k • written 5.9 years ago by Folder40g ▴ 190

score 3 · Answer 1 · 2018-06-13

3

5.9 years ago

Kevin Blighe 87k

You can use these with DESeq2 if you round the values to integers and then pass them to DESeq2 with DESeqDataSetFromMatrix(). This is not ideal; using tximport would be preferred, as it makes some adjustments for transcript length and transcript isoform abundances.

If you don't take my word for it, then take that of the DESeq2 developer: DESeq2 Following RSEM
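The rounding approach described above can be sketched roughly as follows. This is only a minimal illustration on a made-up matrix: `rsem_counts`, `col_data`, the `condition` column and all values are assumptions for the example, not the FireBrowse data. The DESeq2 step is skipped if the package is not installed.

```r
# Toy stand-in for an RSEM gene-level expected-count matrix (genes x samples).
# All names and values here are made up for illustration.
set.seed(1)
rsem_counts <- matrix(
  rexp(6 * 4, rate = 1 / 200),  # non-integer "expected counts"
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4))
)

# Sample table; row names match the column names of the count matrix.
col_data <- data.frame(
  condition = factor(c("control", "control", "treated", "treated")),
  row.names = colnames(rsem_counts)
)

# Round to whole numbers and store as integers, as DESeq2 expects.
rounded_counts <- round(rsem_counts)
storage.mode(rounded_counts) <- "integer"

# Build the DESeq2 object only if the package is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = rounded_counts,
    colData   = col_data,
    design    = ~ condition
  )
}
```

If the per-sample RSEM `.genes.results` files were available, tximport with `type = "rsem"` followed by DESeqDataSetFromTximport() would be the preferred route instead of rounding.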

Kevin

ADD COMMENT • link 3.0 years ago by Kevin Blighe 87k

0

Hello, I used an RSEM gene-level count-estimate matrix and simply rounded the values with the round() function in R before feeding them to DESeq2, but the results are very odd (a VERY low number of differentially expressed genes). Is this really a solid way to go about differential expression analysis, or should I start again from the raw data?

NOTE: The study I got the data from provided it as an RSEM gene-level count matrix and an FPKM-normalized matrix. I used the RSEM matrix because DESeq2 only takes non-normalized counts. The study used an independent t-test on the FPKM file to do the analysis, but I read somewhere that this is highly discouraged.

ADD REPLY • link 3.1 years ago by mehdim • 0

1

Using a t-test on FPKM data is an improper analysis. In fact, no statistical inferences in the realm of differential expression analysis can be performed using FPKM data.

It may be the genuine result that there are no differentially expressed genes in your data. How do the dispersion plot, the MA plot(s), and the volcano plot(s) look? Are the sample groups imbalanced (e.g. 3 normal versus 20 disease)?
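For anyone wanting to run these checks, here is a rough sketch. The group labels are hypothetical and the DESeq2 part uses simulated data from makeExampleDESeqDataSet(), not the data discussed in this thread; it only runs if DESeq2 is installed.

```r
# Hypothetical sample annotation: 3 normal versus 20 disease samples.
group <- factor(c(rep("normal", 3), rep("disease", 20)))

# Check how balanced the groups are.
group_sizes <- table(group)
imbalance_ratio <- max(group_sizes) / min(group_sizes)
print(group_sizes)

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)
  # Simulated dataset with a two-level 'condition' factor (not real data).
  dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
  dds <- DESeq(dds)
  res <- results(dds)

  plotDispEsts(dds)  # dispersion estimates against mean normalized counts
  plotMA(res)        # log2 fold change against mean normalized counts
}
```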

ADD REPLY • link 3.1 years ago by Kevin Blighe 87k

0

Hello! Thank you for your timely reply. The differential expression is between 3 grades of glioma: 2, 3 and 4. Basically pairwise comparisons using DESeq2. The plots are the following: [Volcano plot] [Dispersion plot]

The distribution of the grades is fairly equal between the conditions (grades)
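As a sketch of what pairwise comparisons between three grades can look like in DESeq2: the level names `G2`, `G3`, `G4`, the column name `grade`, and the simulated counts below are all assumptions for illustration, not the poster's actual setup. The DESeq2 part is skipped if the package is not installed.

```r
# Hypothetical grade labels: 4 samples per grade.
grade <- factor(rep(c("G2", "G3", "G4"), each = 4))

# Enumerate all pairwise comparisons (one row per pair).
grade_pairs <- t(combn(levels(grade), 2))
colnames(grade_pairs) <- c("reference", "test")

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)
  # Simulated counts only; no real grade effect is built in.
  sim <- makeExampleDESeqDataSet(n = 500, m = 12)
  col_data <- data.frame(grade = grade, row.names = colnames(sim))
  dds <- DESeqDataSetFromMatrix(counts(sim), col_data, design = ~ grade)
  dds <- DESeq(dds)

  # One results table per pair; contrast = c(factor, numerator, denominator).
  res_list <- lapply(seq_len(nrow(grade_pairs)), function(i) {
    results(dds, contrast = c("grade",
                              unname(grade_pairs[i, "test"]),
                              unname(grade_pairs[i, "reference"])))
  })
  names(res_list) <- paste(grade_pairs[, "test"], "vs",
                           grade_pairs[, "reference"], sep = "_")

  # Number of genes with adjusted p-value below 0.05 in each comparison.
  print(sapply(res_list, function(r) sum(r$padj < 0.05, na.rm = TRUE)))
}
```

Fitting one model on all samples and extracting each contrast, rather than subsetting to two grades at a time, lets every comparison share the dispersion estimates from the full dataset.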

ADD REPLY • link 3.1 years ago by mehdim • 0

1

These plots look okay... the dispersion trend seems a little strange (the large 'belly' at the bottom), but perhaps that's due to RSEM. For the volcano, we usually plot the unadjusted p-values.
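A minimal base R sketch of a volcano plot that puts the unadjusted p-values on the y-axis, while still using the adjusted p-values to decide which points to highlight. The `res` data frame here is random toy data that only borrows the column names of a DESeq2 results table; the thresholds (0.05 and 1) are arbitrary choices for the example.

```r
# Toy results table with DESeq2-style column names (random values, not real data).
set.seed(1)
n_genes <- 2000
res <- data.frame(
  log2FoldChange = rnorm(n_genes, sd = 1.5),
  pvalue         = runif(n_genes)
)
res$padj <- p.adjust(res$pvalue, method = "BH")

# y-axis uses the unadjusted p-values.
neg_log10_p <- -log10(res$pvalue)

# Highlighting uses the adjusted p-values plus a fold-change cutoff.
is_hit <- !is.na(res$padj) & res$padj < 0.05 & abs(res$log2FoldChange) > 1

plot(res$log2FoldChange, neg_log10_p,
     pch = 20, col = ifelse(is_hit, "red", "grey50"),
     xlab = "log2 fold change", ylab = "-log10(unadjusted p-value)")
abline(v = c(-1, 1), lty = 2)
```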

For all intents and purposes, the result that you have may be genuine. There may be another confounding factor for which you need to control, but I do not know your experiment well enough to say what that factor may be.
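If such a factor were known, one common way to control for it is to add it to the design formula ahead of the variable of interest. The sketch below assumes a hypothetical `batch` column alongside a hypothetical `grade` column, with simulated counts; it is not based on the actual experiment. The DESeq2 part is skipped if the package is not installed.

```r
# Hypothetical sample table: a 'batch' covariate plus the 'grade' of interest.
col_data <- data.frame(
  batch = factor(rep(c("A", "B"), times = 6)),
  grade = factor(rep(c("G2", "G3", "G4"), each = 4)),
  row.names = paste0("sample", 1:12)
)

# Inspect the design matrix and confirm it is full rank before fitting.
design_matrix <- model.matrix(~ batch + grade, data = col_data)
design_rank <- qr(design_matrix)$rank
print(colnames(design_matrix))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)
  # Simulated counts only; no real batch or grade effect is built in.
  sim <- makeExampleDESeqDataSet(n = 500, m = 12)
  dds <- DESeqDataSetFromMatrix(counts(sim), col_data,
                                design = ~ batch + grade)
  dds <- DESeq(dds)
  print(resultsNames(dds))  # coefficients available for results()
}
```

A design matrix that is not full rank (for example, when batch and grade are completely confounded) cannot be fitted, which is why the rank check comes first.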

ADD REPLY • link 3.1 years ago by Kevin Blighe 87k