Is it possible to do DGE analysis with edgeR using log2-normalized data?
13 months ago
Adaline.D • 0

Hello everyone,

I intend to analyze differential gene expression using a GEO dataset. The values are log2-normalized signal intensities.

I know edgeR's workflow involves log normalization. However, can I skip the normalization steps and continue the rest of the analysis (estimate the BCV(s) and make pairwise comparisons)? Or do I need to analyze from scratch (raw data)?

I have read the user's guide and searched online, but I did not find an answer to my question.

Thank you very much for any help you can provide. Have a nice day!

analysis DGE • 1.0k views

What technology is the GEO dataset derived from? Signal intensity implies a fluorescence readout (i.e. a microarray). If that's the case, you can't use edgeR. If the dataset is derived from sequencing and involves read counts, edgeR might be appropriate, provided you can get read counts per feature (see LChart's answer for the details).


That explains a lot to me. Thank you, seidel! They used Affymetrix Human Gene 1.0 ST Array chips.

Hope you have a wonderful day.

13 months ago
LChart 3.9k

Short answer: use limma.

Long answer: edgeR uses a negative binomial model and expects raw counts, not normalized or log-transformed expression values. Please read section 2.8.6 of the edgeR user's guide here: https://www.bioconductor.org/packages/devel/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf
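
As a quick sanity check before reaching for a count-based model, one can test whether a matrix even looks like raw counts. The sketch below is an illustration only, not part of edgeR: the function name `looks_like_raw_counts` and the toy matrices are hypothetical, and passing the check is necessary but not sufficient for the data to be suitable counts.

```python
def looks_like_raw_counts(matrix, tol=1e-9):
    """Return True if every value is a non-negative whole number.

    A negative binomial model describes counts, so fractional values
    (for example log2 intensities) fail this check.
    """
    for row in matrix:
        for value in row:
            if value < 0 or abs(value - round(value)) > tol:
                return False
    return True


# Hypothetical toy data: genes in rows, samples in columns.
raw_counts = [[0, 12, 7], [153, 98, 201]]
log2_intensities = [[7.31, 6.98, 7.52], [10.04, 9.87, 10.41]]

print(looks_like_raw_counts(raw_counts))        # True
print(looks_like_raw_counts(log2_intensities))  # False
```

Log2-normalized microarray intensities fail this check, which is the practical reason they cannot be fed to a count model directly.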

If all of the below are true:

(1) You want to recapitulate a publication's results

(2) The publication used edgeR

(3) The GEO data for the publication is log2-normalized expression estimates

then you cannot accomplish this task without going to the raw SRA data and obtaining counts.

However, if you are comfortable with a (possibly substantial) loss of power, you can use the published log2 CPM/TPM values in conjunction with limma.
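
To make the idea of working directly on log2 values concrete, here is a minimal Python sketch of a per-gene comparison between two groups. It is not limma, which is an R package that fits linear models with empirical Bayes moderated statistics; this is a plain Welch t-statistic, used here only as a stand-in. The function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def log2_fold_change_and_t(group_a, group_b):
    """Return (log2 fold change, Welch t-statistic) for one gene.

    Inputs are log2 expression values, so the difference between
    the group means is already a log2 fold change (b relative to a).
    """
    diff = mean(group_b) - mean(group_a)
    # Standard error of the difference, allowing unequal variances.
    se = sqrt(variance(group_a) / len(group_a)
              + variance(group_b) / len(group_b))
    return diff, diff / se


# Hypothetical log2 values for one gene in three samples per group.
control = [7.0, 7.2, 6.8]
treated = [8.0, 8.2, 7.8]

lfc, t_stat = log2_fold_change_and_t(control, treated)
print(f"log2FC = {lfc:.2f}, t = {t_stat:.2f}")
```

With only a few samples per group, per-gene variance estimates like these are noisy; limma's moderated statistics borrow information across genes to stabilise them.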


I appreciate your help, LChart! I will use the raw data to avoid a substantial loss of power.

(1) The publication did a PCA and then analyzed the DEG within each group. Nevertheless, I want to see the overall DEG without grouping.

(2) They used a modified version of the sigpathway algorithm (I don't know what that is :p). edgeR is the package that I use a lot.

(3) Sorry, I don't quite understand this point, but I figure that the log2-normalized data is what they use for the PCA analysis.

Again, thank you very much, and have a nice day.
