Question: normalized FPKM matrix file EdgeR and GO Mapping
6.2 years ago by
Biogeek400 wrote:

Can someone provide me with a clear cut answer to the below:

Can I use the normalised FPKM matrix file, which feeds into edgeR from RSEM abundance counts, to annotate the top X sequences upregulated/down-regulated between 2 conditions in Blast2GO Pro?

I have read that FPKM values are unreliable. What I am currently doing is working with the above-mentioned normalised FPKM matrix file in Excel to filter out sequences present between conditions, to try to group the sequences being up-regulated/down-regulated with respect to their GO mapping and annotations in Blast2GO Pro. I work out the fold change, then its log2 value. Would this be bullshit? While I have wonderful heat maps generated by edgeR to show differential expression, can I be confident in using the same FPKM values that generated those heat maps for annotation analysis? Could someone with some knowledge of this please help me?
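For reference, the fold-change step described above can be sketched in a few lines of Python. This is only an illustration: the contig names and FPKM values are made up, and the pseudocount of 1 (added so that a zero FPKM does not cause a division by zero) is an assumption, not something the pipeline prescribes.

```python
import math

def log2_fold_change(fpkm_a, fpkm_b, pseudocount=1.0):
    """Return log2((B + pseudocount) / (A + pseudocount)).

    The pseudocount is an assumed convention to keep zero FPKM values usable.
    """
    return math.log2((fpkm_b + pseudocount) / (fpkm_a + pseudocount))

# Hypothetical mean FPKM per contig: (condition A, condition B)
mean_fpkm = {
    "contig_1": (10.0, 43.0),
    "contig_2": (0.0, 3.0),
    "contig_3": (25.0, 25.0),
}

log2fc = {name: log2_fold_change(a, b) for name, (a, b) in mean_fpkm.items()}

for name, value in log2fc.items():
    print(f"{name}\t{value:.2f}")
```

Note that contig_1 and contig_2 get the same log2 fold change even though one of them is barely expressed. A fold change on its own says nothing about how reliable the difference is, which is the part a statistical test is meant to cover.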

I have used the Trinity/RSEM/edgeR pipeline. Alternative methods are welcome, but I need somewhat speedy replies.

Any help appreciated. Thanks.

modified 6.2 years ago by Charles Warden (7.8k) • written 6.2 years ago by Biogeek (400)

Are you just using edgeR for heatmaps (in which case, why bother, just use heatmap.2) or are you trying to use it for statistics too?

modified 6.2 years ago • written 6.2 years ago by Devon Ryan (96k)

I am using EdgeR for heat maps, but I also want statistics from it. I want to find the top 1000 upregulated genes between 2 conditions, then once I find these, I want to feed the sequences into Blast2GOPro via the Trinity .fasta file I already have with the sequences in it.
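The "top N upregulated, then pull the sequences out of the Trinity .fasta" step could look roughly like the sketch below. It assumes the differential expression results have already been exported as (sequence ID, log fold change, FDR) rows; that layout, the 0.05 FDR cutoff, the function names, and the example IDs are all hypothetical.

```python
def top_upregulated(rows, n, max_fdr=0.05):
    """Return IDs of the n most upregulated sequences passing the FDR cutoff.

    rows: iterable of (seq_id, log_fc, fdr) tuples (assumed layout).
    """
    hits = [r for r in rows if r[2] <= max_fdr and r[1] > 0]
    hits.sort(key=lambda r: r[1], reverse=True)
    return [r[0] for r in hits[:n]]

def read_fasta(lines):
    """Yield (seq_id, sequence) pairs from FASTA-formatted lines."""
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Keep only the first word of the header as the ID
            seq_id, chunks = line[1:].split()[0], []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)

def subset_fasta(lines, wanted_ids):
    """Return the (seq_id, sequence) records whose ID is in wanted_ids."""
    wanted = set(wanted_ids)
    return [(i, s) for i, s in read_fasta(lines) if i in wanted]

# Hypothetical results table and FASTA content
rows = [
    ("c1_g1_i1", 3.2, 0.001),
    ("c2_g1_i1", -2.5, 0.002),
    ("c3_g1_i1", 5.1, 0.2),
    ("c4_g1_i1", 1.4, 0.01),
]
fasta = [">c1_g1_i1 len=8", "ACGT", "ACGT",
         ">c2_g1_i1 len=4", "GGCC",
         ">c4_g1_i1 len=4", "TTAA"]

top = top_upregulated(rows, n=2)
selected = subset_fasta(fasta, top)
for seq_id, seq in selected:
    print(f">{seq_id}\n{seq}")
```

With a real file, `open("Trinity.fasta")` could be passed in place of the `fasta` list, since the reader only iterates over lines.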

written 6.2 years ago by Biogeek (400)

I posted a new comment while you were writing this reply, so I'll just refer to it below.

written 6.2 years ago by Devon Ryan (96k)

or would it make more sense to use the files?

written 6.2 years ago by Biogeek (400)

BTW, regarding using edgeR (or limma/voom or DESeq(2)) with FPKM/RPKM values, I'll just link to one of Gordon Smyth's many replies on the subject from the bioconductor email list.

written 6.2 years ago by Devon Ryan (96k)
6.2 years ago by
Charles Warden7.8k
Duarte, CA
Charles Warden7.8k wrote:

edgeR needs raw counts, not FPKM values. In general, working with RPKM/FPKM values is fairly standard practice: there may be factors that influence them, but I would consider that to be kind of like a batch effect. I think there will always be some confounding factors between experiments (such as different sample preparation, etc.). However, I don't think this is the most important issue in your specific context:

  1. edgeR will give funky results sometimes (which is why I don't use edgeR). If you are trying to compare the count-based edgeR analysis to an analysis based upon FPKM (or just the fold-change values calculated from FPKM), then you will probably notice differences. However, I think these are most likely the fault of the edgeR calculation rather than the FPKM calculation. For example, consider the fold-changes for this gene:

    FYI, this is referenced from this blog post:

  2. You say you are using Trinity/RSEM/edgeR followed by some sort of GO analysis. I am a bit confused because Trinity is for de novo assembly, but I would always recommend a direct alignment when possible. I would use de novo assembly when a reference genome hasn't been characterized, but I would imagine such a genome would be unlikely to have GO terms. In other words, my general answer is that GO enrichment following analysis of FPKM expression values should be OK (I do this routinely), but it sounds like your strategy may have other issues that would cause GO enrichment to be problematic.
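To make the "raw counts, not FPKM values" distinction concrete, here is a small Python sketch of the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). The numbers are invented, and the sketch uses plain transcript length; how RSEM handles effective length is not reproduced here.

```python
def fpkm(count, length_bp, total_fragments):
    """FPKM = count * 1e9 / (transcript length in bp * total mapped fragments)."""
    return count * 1e9 / (length_bp * total_fragments)

# Hypothetical transcript of 1000 bp observed in two libraries of different depth
shallow = fpkm(count=100, length_bp=1000, total_fragments=1_000_000)
deep = fpkm(count=1000, length_bp=1000, total_fragments=10_000_000)

print(shallow, deep)
```

Both libraries give the same FPKM, yet one value rests on 100 fragments and the other on 1000. A count-based method can use that difference when estimating variability; once the values are converted to FPKM, that information is no longer visible.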

modified 6 months ago by RamRS (27k) • written 6.2 years ago by Charles Warden (7.8k)