Question: How to perform differential expression analysis with normalized data?
6 weeks ago, LuisNagano20 (University of Campinas) wrote:

Hello, could anyone help me out? Is there an R package that runs differential expression analysis or statistical tests, i.e. generates log2FC and adjusted p-values, from normalized RNA-seq and array expression values? The data I need to analyze are available only as FPKM, in a table with ~50,000 genes; I don't have access to the raw data.

Thank you very much!

rna-seq deg fpkm • 155 views
modified 6 weeks ago by ATpoint24k • written 6 weeks ago by LuisNagano20

Which tool was used for quantification: RSEM, StringTie, Cufflinks? Just try the tximport package from the DESeq2 team.

written 6 weeks ago by boaty90

The authors don't cite the tool used for normalization. DESeq2 only works with raw gene counts, doesn't it? I have normalized data in FPKM. I want a package that can analyse any normalized expression data, like MAS5, RSEM, FPKM, TPM...

written 6 weeks ago by LuisNagano20

This has been discussed before, extensively and repeatedly; please use the search function. Start from this one: https://support.bioconductor.org/p/102551/ and from there please google around. You'll find pretty much the same answer everywhere: the limma-based strategy suggested there is probably the best possible, but still bad, solution to what you aim to do, as FPKM is not suited for differential analysis. Further details on why that is can be found in numerous threads here, on BioC and on the web.

written 6 weeks ago by ATpoint24k
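
One commonly cited reason why FPKM values are hard to compare between samples is that they are scaled by each sample's total fragment count, so a change in a single highly expressed gene shifts the FPKM of every other gene. The following is only a minimal Python sketch of that arithmetic on made-up numbers; the gene names, counts and lengths are hypothetical, and it is not the limma-based strategy from the linked thread, which would be run in R.

```python
def fpkm(counts, lengths_bp):
    """Return FPKM per gene for a single sample.

    counts:     dict mapping gene -> fragment count
    lengths_bp: dict mapping gene -> gene length in base pairs
    """
    total = sum(counts.values())  # library size: all counted fragments in the sample
    # fragments per kilobase of gene per million fragments in the library
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total)
        for gene, count in counts.items()
    }


# Hypothetical toy data: three genes, each 1000 bp long.
lengths = {"geneA": 1000, "geneB": 1000, "geneC": 1000}
sample1 = {"geneA": 500, "geneB": 500, "geneC": 1000}
sample2 = {"geneA": 500, "geneB": 500, "geneC": 9000}  # only geneC differs

fpkm1 = fpkm(sample1, lengths)
fpkm2 = fpkm(sample2, lengths)

# geneA has the same count in both samples, yet its FPKM is not the same,
# because the library total is dominated by geneC in sample2.
ratio = fpkm1["geneA"] / fpkm2["geneA"]
print(fpkm1["geneA"], fpkm2["geneA"], ratio)
```

In this toy case geneA looks several-fold "down" in the second sample although its count did not change. Count-based methods estimate normalization factors from the raw counts to handle this kind of composition effect, which is not possible once only FPKM is left.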

Look at tximport first. tximport will take normalised counts and length information to recompute raw counts. You can use tximport to get pseudo-raw counts and then use DESeq2 for DGE. It works for almost all the modern quantification tools, like kallisto and StringTie. But you need to know which tool was used for gene quantification.

modified 6 weeks ago • written 6 weeks ago by boaty90

No, that is not true and not recommended. tximport aggregates transcript abundance estimates to the gene level and corrects for average transcript length; it does not do any magic to save you from inferior normalization techniques like FPKM. The transcript information is already lost in FPKM, as in most cases this is already the gene-level count, so tximport would be meaningless. If possible, download the raw data from NCBI or ENA and obtain raw counts. Everything else is inferior. Relying on prenormalized counts where (as OP states) the methods section lacks details about the pipeline is not reproducible and therefore IMHO not recommended, quite apart from the issue that FPKM is a poor choice for normalization.

modified 6 weeks ago • written 6 weeks ago by ATpoint24k
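
To make the point about tximport concrete, here is a rough Python sketch of the kind of gene-level summarization described above: transcript abundances are summed per gene, and an abundance-weighted average transcript length is computed alongside. It is a simplified illustration, not tximport's actual code; the function name `summarize_to_gene`, the input layout and the toy values are all hypothetical, and the handling of zero-abundance genes is a placeholder assumption.

```python
from collections import defaultdict


def summarize_to_gene(transcripts):
    """Aggregate transcript-level estimates to the gene level.

    transcripts: iterable of (transcript_id, gene_id, tpm, length_bp)
    Returns a dict mapping gene_id -> {"tpm": ..., "avg_length": ...}.
    """
    tpm_sum = defaultdict(float)       # summed abundance per gene
    weighted_len = defaultdict(float)  # sum of abundance * length per gene
    for _tx_id, gene, tpm, length in transcripts:
        tpm_sum[gene] += tpm
        weighted_len[gene] += tpm * length

    result = {}
    for gene, total in tpm_sum.items():
        # Abundance-weighted mean transcript length; undefined when nothing
        # is expressed (NaN is used here purely as a placeholder).
        avg_len = weighted_len[gene] / total if total > 0 else float("nan")
        result[gene] = {"tpm": total, "avg_length": avg_len}
    return result


# Hypothetical transcript-level input for two genes.
toy = [
    ("tx1", "G1", 30.0, 1000),
    ("tx2", "G1", 10.0, 2000),
    ("tx3", "G2", 5.0, 500),
]
genes = summarize_to_gene(toy)
print(genes)
```

Every input row is a transcript with its own abundance and length. A gene-level FPKM table has only one value per gene, so there is nothing left to aggregate or to compute an average transcript length from.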