Question: Troubleshooting differential expression with limma
Asked 4.0 years ago by deconti.d (United States):

Hi.

I am having some trouble with the output of my differential gene expression analysis with limma. The problem is that every probe comes out as statistically significant, by both P-value and adjusted P-value.

I have been given an RMA-normalized data set, provided as a CSV file. An example of the layout is as follows:

            sample1group1   sample2group1   sample3group2   sample4group2
probeid_1   7.8165          6.7145          3.142           2.495
probeid_2   6.4586          4.2135          1.245           5.325
probeid_3   4.5241          4.2111          4.456           7.415
...

I chose the Bioconductor limma package to perform a differential expression analysis between the group 1 and group 2 samples. Following the manual, I set up the following R script:

library(limma)
df = read.table(above_table_filename, sep="\t", header=TRUE)
design = c(0, 0, 1, 1) # set 0 for group1 and 1 for group2
row.names(df) <- df$X # set row names
df <- df[-1] # remove the first column now that row.names has been set
df <- df[-(1:n),] # remove the control probes
num_rows <- nrow(df)
fit <- lmFit(df, design) # initialize the limma fit
fit <- eBayes(fit) # recommended Bayesian fit
options(digits=3) # set sig figs
# Provides table of gene ids sorted by P-value
topTable(fit, adjust="fdr", sort.by="P", number=num_rows)

Did I do something wrong in setting up this analysis? The manual's example design matrix uses -1 and 1, but I do not know whether that would make a big difference. Otherwise, I am not sure what I set up incorrectly with the data. I have been given a number of groupings to test, and every one of them gives the same highly significant values for all probes, so the problem is not specific to this grouping.
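
One possible explanation, offered as a sketch and not a confirmed diagnosis: a bare 0/1 vector passed as the design is treated as a one-column matrix with no intercept, so the only coefficient is the group 2 mean, and the test asks whether that mean differs from zero. For log-scale expression values that is almost always true. The base R example below uses the first row of the example table to show the difference between the two parameterizations; the variable names are illustrative only.

```r
# Values for probeid_1 from the example table (two samples per group).
y <- c(7.8165, 6.7145, 3.142, 2.495)
group <- c(0, 0, 1, 1)  # 0 = group1, 1 = group2

# A bare 0/1 vector acts as a one-column design with no intercept,
# so the single coefficient is the group2 mean (tested against zero).
fit_no_intercept <- lm(y ~ 0 + group)
coef_no_intercept <- unname(coef(fit_no_intercept)["group"])

# With an intercept column, the second coefficient is group2 minus group1.
design <- model.matrix(~ group)
fit_with_intercept <- lm.fit(design, y)
coef_difference <- unname(fit_with_intercept$coefficients["group"])

print(coef_no_intercept)  # group2 mean
print(coef_difference)    # group2 mean minus group1 mean
```

If this is the cause, passing a two-column matrix such as `design` to lmFit and selecting the second coefficient in topTable (for example with `coef=2`) should test the group difference instead of the group 2 mean.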

Edit: Just to be clear, I want to find the most highly differentially expressed genes between the two groups of samples, that is, the genes whose expression (by mean or median) is higher or lower in group 1 than in group 2.

Edit2: I should mention that the columns in my actual file are not neatly organized as above. The groups are interleaved across the columns, so with 0 = group1 and 1 = group2, the vector would look like this (as an example):

c(1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0)

Is there a good way to account for this when building the design matrix, in case I built it incorrectly?
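
Column order should not matter as long as the group labels are listed in the same order as the data columns. A minimal sketch, assuming the example vector above matches the column order: turn the labels into a factor and let model.matrix build the design. The names `membership`, `group`, `expr`, and `top` are hypothetical, and `expr` is simulated data standing in for the real expression table.

```r
# Group membership in the same order as the data columns
# (the example vector from the question; 0 = group1, 1 = group2).
membership <- c(1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0)
group <- factor(membership, levels = c(0, 1), labels = c("group1", "group2"))

# Intercept column = group1 mean; "groupgroup2" = group2 minus group1.
design <- model.matrix(~ group)

# Optional: runs only if limma is installed. `expr` is simulated, not real data.
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  expr <- matrix(rnorm(100 * length(group), mean = 7), nrow = 100,
                 dimnames = list(paste0("probeid_", 1:100), NULL))
  fit <- limma::eBayes(limma::lmFit(expr, design))
  # Select the group-difference coefficient by name.
  top <- limma::topTable(fit, coef = "groupgroup2", adjust.method = "fdr",
                         sort.by = "P", number = nrow(expr))
}
```

Naming the coefficient in topTable makes it explicit that the ranking is for the group 2 versus group 1 difference, whatever order the samples appear in.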
