Question: How do I use quantile normalized data to do differential expression analysis in RStudio?
8 months ago by
maricastanon0 wrote:

Good Afternoon,

I am very new at RStudio and differential expression analysis and I am a bit lost about what to do.

I received two tables of gene expression data as data frames: one for the studied cells and one for the control group. The data are not raw counts but quantile-normalized values (from RMA). I tried to create a DESeq2 object but could not, and it looks like DESeq2 only accepts raw counts. I have no access to the raw counts; those two tables are all I got.

The authors say:

The .CEL files of control group were downloaded and data were normalized using the GeneSpring GX software (version 11, Agilent Technologies) and the Robust Multi-array Average (RMA) algorithm for summarization (which uses Perfect Match (PM)-only based measures) and quantile normalization, and then exported.

The "value" output == RMA normalized intensity values (no baseline transformation). I have, however, a *.csv file.

What should I do with it first? Could you tell me which steps to follow and which package to use? I have been looking at limma, but I am having trouble working out how to create the input object for the package.

Best regards, Mari

rna-seq R gene • 346 views

If it's all normalized, I would first make sure that the two sets were normalized together (most genes should have similar values in both) and then just use lm to test each gene.
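
A minimal sketch of that suggestion in base R. The two tables are replaced by simulated values here, so the object names (`cells`, `control`), the sample counts and the spiked-in genes are all assumptions for illustration, not the poster's data. It first compares the per-sample distributions, then fits `lm` to each gene:

```r
set.seed(1)

# Simulated stand-ins for the two tables (genes in rows, samples in columns).
# With real data these would come from read.csv(); all names are hypothetical.
n_genes <- 200
genes   <- paste0("gene", seq_len(n_genes))
cells   <- matrix(rnorm(n_genes * 3, mean = 8), nrow = n_genes,
                  dimnames = list(genes, paste0("cell", 1:3)))
control <- matrix(rnorm(n_genes * 3, mean = 8), nrow = n_genes,
                  dimnames = list(genes, paste0("ctrl", 1:3)))
cells[1:10, ] <- cells[1:10, ] + 3   # 10 simulated changed genes

# Combine the tables, matching genes by row name.
common <- intersect(rownames(cells), rownames(control))
expr   <- cbind(cells[common, ], control[common, ])
group  <- factor(rep(c("cells", "control"), c(ncol(cells), ncol(control))),
                 levels = c("control", "cells"))

# Check 1: for data that were quantile normalized together, the per-sample
# quantiles should be (nearly) identical across all columns.
sample_quantiles <- apply(expr, 2, quantile, probs = c(0.25, 0.5, 0.75))
print(round(sample_quantiles, 2))

# Check 2: one linear model per gene; keep the group effect and its p-value.
fit_gene <- function(y) {
  coefs <- summary(lm(y ~ group))$coefficients
  c(logFC = coefs["groupcells", "Estimate"],
    p     = coefs["groupcells", "Pr(>|t|)"])
}
res <- as.data.frame(t(apply(expr, 1, fit_gene)))
res$padj <- p.adjust(res$p, method = "BH")   # multiple-testing correction
head(res[order(res$p), ])
```

With only a few replicates per group, the per-gene variance estimates from plain `lm` are noisy, which is the main reason limma is usually preferred.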

written 8 months ago by Asaf
8 months ago by
ATpoint44k wrote:

CEL files are the raw data that Affymetrix microarrays produce (GeneSpring GX, the software used here, is from Agilent). RMA preprocessing is standard, so this is fine. Still, all data that go into the same analysis are expected to have been processed together, because quantile normalization is not applied to each array independently; its result depends on which samples are included. See this video for details on QN.
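
To illustrate why the set of samples matters, here is a small base R sketch of quantile normalization on simulated values. The function `quantile_normalize` is a simplified, hypothetical helper written for this example (ties are handled naively), not the RMA implementation itself:

```r
# Minimal quantile normalization: every column is forced onto the same
# reference distribution, the mean of the sorted columns.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))   # reference distribution
  out    <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

set.seed(42)
arrays <- matrix(rnorm(5 * 4, mean = 8), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("array", 1:4)))
arrays[, 4] <- arrays[, 4] + 2   # one array with a shifted intensity distribution

qn_all   <- quantile_normalize(arrays)          # all four arrays together
qn_first <- quantile_normalize(arrays[, 1:3])   # only the first three

# The same array gets different normalized values depending on its companions.
cbind(together = qn_all[, "array1"], subset = qn_first[, "array1"])
```

The reference distribution is computed from all arrays passed in, so adding or dropping an array changes the values of every other array. That is why two tables normalized separately are not directly comparable.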

So what you have to do now is find out why two tables exist. Have they been processed together and then split? How many replicates are there? You need replicates for differential analysis. In the best case you have replicates and the files have been processed together, so you would be good to go with limma, the standard software for microarray differential analysis, which uses a linear model-based framework.
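
A rough sketch of what that limma workflow could look like. The data are simulated, and the file name in the comment, the object names and the three replicates per group are assumptions for illustration only. It also assumes the values are on the log2 scale, as RMA output normally is:

```r
library(limma)  # Bioconductor package: BiocManager::install("limma")

set.seed(1)
# Simulated stand-ins for the two CSV tables (log2 values, genes in rows).
# With real files one would use something like
#   cells <- as.matrix(read.csv("cells.csv", row.names = 1))  # hypothetical name
n_genes <- 500
genes   <- paste0("gene", seq_len(n_genes))
cells   <- matrix(rnorm(n_genes * 3, mean = 8, sd = 0.3), nrow = n_genes,
                  dimnames = list(genes, paste0("cell", 1:3)))
control <- matrix(rnorm(n_genes * 3, mean = 8, sd = 0.3), nrow = n_genes,
                  dimnames = list(genes, paste0("ctrl", 1:3)))
cells[1:20, ] <- cells[1:20, ] + 2   # 20 simulated up-regulated genes

# limma needs no special object: a numeric matrix of log2 expression values
# (genes x samples) plus a design matrix is enough.
common <- intersect(rownames(cells), rownames(control))
expr   <- cbind(cells[common, ], control[common, ])
group  <- factor(rep(c("cells", "control"), c(ncol(cells), ncol(control))),
                 levels = c("control", "cells"))
design <- model.matrix(~ group)      # second column: cells vs control

fit <- lmFit(expr, design)           # one linear model per gene
fit <- eBayes(fit)                   # moderated t-statistics
top <- topTable(fit, coef = "groupcells", number = Inf, adjust.method = "BH")
head(top)
```

In `top`, `logFC` is the log2 difference between cells and control and `adj.P.Val` is the Benjamini-Hochberg adjusted p-value. This only makes sense if both tables were normalized together, as discussed above.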


Just to follow ATpoint's suggestions up with a couple of hopefully helpful links about microarray data analysis with R:

In short, limma is a package that provides functions to perform slightly more sophisticated tests than simple t-tests (moderated t-statistics that borrow variance information across genes) to determine whether a given gene is differentially expressed.

written 8 months ago by Friederike