Question: limma voom problem
iskysinger wrote (8 months ago):

Hi, I am trying to use the R limma package for an RNA-seq DE analysis. I created a design (or model) matrix using model.matrix, and the design matrix looks like this:

> design
   f1LPL_CD3_minus f1LPL_CD3_plus       RIN. Ribosomal.content Yield..ug. Exon.mapping.rate f3colon f3ileum
97               1              0  1.4537394       -0.25587959  0.1703515        1.52249759       1       0
98               1              0 -0.5372515        0.55320491 -0.7125984       -0.07822664       1       0
91               0              1  1.1693121       -0.16896496 -0.4363936        0.13894200       0       0
99               1              0 -1.3905334       -0.01870335 -0.5903438        0.60849125       1       0
92               0              1  0.6004576       -0.50248289  2.1535927        0.25607977       0       1
93               0              1 -0.2528242       -0.43028018 -0.6790916       -0.24759116       1       0
94               0              1 -1.1061061       -0.87528023  1.0940529        0.63935740       1       0
95               1              0  0.6004576        2.42297639 -0.2462197       -2.01941516       0       1
96               1              0 -0.5372515       -0.72459011 -0.7533499       -0.82013504       1       0

However, when I estimate the mean-variance relationship and use this to compute appropriate observational-level weights using voom, I get the following message:

Warning messages:
1: Partial NA coefficients for 14769 probe(s)
2: In voom(gExpr, design) :
  The experimental design has no replication. Setting weights to 1.

Does anyone know why the experimental design is reported as having no replication? Any help will be appreciated. Thanks a lot.

Andy

Tags: voom, rna-seq, limma, next-gen, R

Which of these columns indicate the experimental groups? You do not plan to use all these parameters for your model, right?

modified 8 months ago • written 8 months ago by ATpoint

Hi, thanks for the reply. The f1LPL_CD3_minus and f1LPL_CD3_plus columns indicate the experimental groups and are used for the comparison. The remaining parameters are used as covariates in the model.

written 8 months ago by iskysinger

added limma and voom tags

written 8 months ago by Kevin Blighe

Gordon Smyth (Australia) wrote (8 months ago):

Actually, it is not possible to get the message you show if you entered a valid count matrix to voom with the design matrix that you show. There are only two possibilities. Either

  1. Your count matrix contains NA values (which are not allowed, see the voom help page) or
  2. The design matrix entered to voom actually has lots more columns than the one shown.
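Both possibilities can be checked directly before calling voom. The following is a minimal sketch in base R; the objects `counts` and `design` are hypothetical stand-ins (simulated here, with one NA injected on purpose) for whatever is actually passed to voom.

```r
# Hypothetical stand-ins for the objects passed to voom(); replace with your own.
set.seed(1)
counts <- matrix(rpois(40, lambda = 50), nrow = 10, ncol = 4,
                 dimnames = list(paste0("gene", 1:10), paste0("s", 1:4)))
counts[3, 2] <- NA   # inject one missing value purely for illustration
design <- model.matrix(~ factor(c("a", "a", "b", "b")))

# Possibility 1: missing values in the count matrix
n_na_genes <- sum(rowSums(is.na(counts)) > 0)
cat("Genes with at least one NA:", n_na_genes, "\n")

# Possibility 2: the design has more columns than expected
cat("Design dimensions (samples x coefficients):", dim(design), "\n")

# There should be exactly one design row per sample (count matrix column)
stopifnot(nrow(design) == ncol(counts))
```

If the first check reports any genes, or the second shows more coefficients than the design printed in the question, that would point to the data-entry problem described above.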

In general, the design matrix needs to have fewer columns than rows, preferably much fewer columns than rows. The matrix should only include columns that are a meaningful part of the experimental design rather than every quality metric that you might have collected on the samples (like RIN, yield or ribosomal content). If there are no rows left over to estimate variability, then voom will set all the weights to 1 with the message that you see.
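To illustrate how many degrees of freedom the quality metrics consume, here is a rough sketch that rebuilds only the factor columns of the design shown in the question. The factor names `f1` and `f3` are inferred from the column names, and the baseline level of `f3` is not given in the thread, so the label "other" is an assumption. Whether the tissue factor belongs in the model at all is a decision for the analyst.

```r
# Group and tissue for the 9 samples, read off the design matrix in the question.
# "other" is a hypothetical label for the unnamed baseline level of f3.
f1 <- factor(c("LPL_CD3_minus", "LPL_CD3_minus", "LPL_CD3_plus",
               "LPL_CD3_minus", "LPL_CD3_plus",  "LPL_CD3_plus",
               "LPL_CD3_plus",  "LPL_CD3_minus", "LPL_CD3_minus"))
f3 <- factor(c("colon", "colon", "other", "colon", "ileum",
               "colon", "colon", "ileum", "colon"),
             levels = c("other", "colon", "ileum"))

# Design with the experimental factors only, without the four quality metrics
reduced <- model.matrix(~ 0 + f1 + f3)
colnames(reduced)

# Residual degrees of freedom = samples minus number of independent columns
resid_df <- nrow(reduced) - qr(reduced)$rank
cat("Residual df with factors only:", resid_df, "\n")
```

With 4 coefficients and 9 samples this leaves 5 residual degrees of freedom, compared with 1 for the full 8-column design.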

More precisely, it is the number of linearly independent columns that must be less than the number of rows. If extra redundant columns are included, then limma will identify and remove them automatically. This process will result in the "Partial NA coefficients" message that you see.
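The effect of a redundant column can be seen with a small base R sketch. The toy design below is hypothetical: it contains an intercept plus an indicator for each of two groups, so the two indicators sum to the intercept. Base R's `lm` is used here as a stand-in to show the same kind of NA coefficient; it is not limma's own fitting code.

```r
# Toy example: 6 samples in two groups (hypothetical data)
group <- factor(c("minus", "minus", "minus", "plus", "plus", "plus"))

# Intercept plus one indicator per group: minus + plus equals the intercept column
design <- cbind(Intercept = 1,
                minus = as.numeric(group == "minus"),
                plus  = as.numeric(group == "plus"))

# Three columns, but only two of them are linearly independent
design_rank <- qr(design)$rank
cat("Columns:", ncol(design), " Rank:", design_rank, "\n")

# Fitting a linear model leaves the redundant coefficient as NA
set.seed(1)
y <- rnorm(6)
fit <- lm(y ~ 0 + design)
print(coef(fit))
```

limma also provides `is.fullrank()` and `nonEstimable()`, which can be applied to a design matrix to flag this situation before fitting.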

There are two reasons why the design matrix you show should not yield the above messages. First, your design matrix appears to have 9 rows and 8 columns, i.e., there is 1 df left over to estimate the variance. Secondly, all the columns of your design matrix are linearly independent, so you could not get the message about NA coefficients.
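Both statements can be verified from the printed matrix. The sketch below transcribes the design shown in the question by hand (so it carries only the printed precision) and computes its rank with base R's `qr`.

```r
# Design matrix transcribed from the question (printed precision only)
design <- matrix(c(
  1, 0,  1.4537394, -0.25587959,  0.1703515,  1.52249759, 1, 0,
  1, 0, -0.5372515,  0.55320491, -0.7125984, -0.07822664, 1, 0,
  0, 1,  1.1693121, -0.16896496, -0.4363936,  0.13894200, 0, 0,
  1, 0, -1.3905334, -0.01870335, -0.5903438,  0.60849125, 1, 0,
  0, 1,  0.6004576, -0.50248289,  2.1535927,  0.25607977, 0, 1,
  0, 1, -0.2528242, -0.43028018, -0.6790916, -0.24759116, 1, 0,
  0, 1, -1.1061061, -0.87528023,  1.0940529,  0.63935740, 1, 0,
  1, 0,  0.6004576,  2.42297639, -0.2462197, -2.01941516, 0, 1,
  1, 0, -0.5372515, -0.72459011, -0.7533499, -0.82013504, 1, 0),
  nrow = 9, byrow = TRUE,
  dimnames = list(
    c("97", "98", "91", "99", "92", "93", "94", "95", "96"),
    c("f1LPL_CD3_minus", "f1LPL_CD3_plus", "RIN.", "Ribosomal.content",
      "Yield..ug.", "Exon.mapping.rate", "f3colon", "f3ileum")))

# Number of linearly independent columns
design_rank <- qr(design)$rank

# Degrees of freedom left over for estimating the variance
resid_df <- nrow(design) - design_rank

cat("Rows:", nrow(design), " Columns:", ncol(design),
    " Rank:", design_rank, " Residual df:", resid_df, "\n")
```

A rank equal to the number of columns means no coefficient is redundant, and a positive residual df means voom has something, however little, to estimate the variance from.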

On the other hand, if your count matrix was all NA, then voom would output exactly the message shown because the data would then have no rows of valid data. I suggest that you recheck the data you are entering to voom.

I also strongly suggest that you reconsider your statistical approach. Even when you get your data entry problems fixed, I don't believe it is statistically defensible to include so many incidental predictors in a linear model with only 9 samples.

modified 8 months ago • written 8 months ago by Gordon Smyth

Hi Gordon, thanks for the comments. I checked the limma manual, but didn't find why there should be far fewer columns than rows. Would you please tell me the reason behind that? The reason I used RIN, yield, ribosomal content, etc. is that I found, by using variancePartition, that those parameters contribute to the gene expression.

written 8 months ago by iskysinger

It is a universal statistical principle, not specific to limma, that you can't estimate more parameters than you have data points. If you fit a linear regression with as many coefficients as data points, then all the residuals will be zero regardless of the unknown variance of the data. How could you estimate variances from residuals that are all zero?
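A small simulation makes the point concrete. This is a minimal sketch with made-up data: 5 observations, 5 coefficients, and a response that is pure noise with a large standard deviation.

```r
set.seed(1)
n <- 5

# 5 samples and 5 coefficients: an intercept plus 4 arbitrary covariates
X <- cbind(1, matrix(rnorm(n * (n - 1)), nrow = n))

# Response is pure noise with a large true standard deviation
y <- rnorm(n, sd = 10)

fit <- lm.fit(X, y)

# The fit reproduces y exactly, so the residuals carry no information
# about the true variance
cat("Largest absolute residual:", max(abs(fit$residuals)), "\n")
cat("Residual degrees of freedom:", fit$df.residual, "\n")
```

The residuals are zero up to rounding error even though the true standard deviation is 10, and there are no residual degrees of freedom from which to estimate it.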

If you can't estimate all the predictors simultaneously in limma, then you can't do it in variancePartition either. I don't believe it is possible for you to have determined that RIN, yield and ribosomal content contribute significantly to gene expression. Either you have not adjusted for all the variables simultaneously or variancePartition is outputting something weird. variancePartition is designed for multilevel experiments, but your design does not appear to have random effects or multiple levels of variation, so it is not clear to me how a variancePartition analysis could be relevant.

modified 8 months ago • written 8 months ago by Gordon Smyth

I had a closer look at your design matrix and realized that there are other issues with your use of voom that are of even more immediate concern than the number of columns in the design matrix. I have edited my answer above to explain.

modified 8 months ago • written 8 months ago by Gordon Smyth