Limma Differential Analysis on Proteomics data
6 months ago

Hi, I have a proteomics data set and I am running a differential analysis on it with the limma package. I first removed the negative counts and then ran the analysis, but every protein comes out as upregulated and none as downregulated.

library(limma)
library(edgeR)  # provides DGEList() and calcNormFactors()

# One group per Qp_Group x Day combination
groups <- interaction(final_val_C144$Qp_Group, final_val_C144$Day)
design <- model.matrix(~ 0 + groups)
colnames(design) <- gsub("groups", "", colnames(design))

d0 <- DGEList(proteomeRaw_c144)
d0 <- calcNormFactors(d0)
y <- voom(d0, design, plot = TRUE)
fit <- lmFit(y, design)
fit1 <- eBayes(fit)

top.table <- topTable(fit1, sort.by = "F", n = Inf)


Running the code below gives the following output:

summary(decideTests(fit1))

Down         0      0      0     0      0     0      0     0
NotSig       0      0      0     0      0     0      0     0
Up        8106   8106   8106  8106   8106  8106   8106  8106


Any help would be appreciated.

Limma Differential-Expression R proteomics • 1.1k views

Cross-posted to Bioconductor https://support.bioconductor.org/p/9142067/


@OP, can you please stop crossposting to that extent? It splits information across multiple communities and doubles the effort for users. Please pick one community and then wait to see whether you get answers within a reasonable timespan (days, not a few hours).

6 months ago
Gordon Smyth ★ 4.8k

The crazy DE result is because you haven't formed any contrasts to compare any of the groups. If you form a contrast then things will be normal again.

But I wonder about the nature of the data. Proteomics data shouldn't have any negative values and how did you remove them? voom() is only for sequencing data, not for proteomics. Are you analysing spectral counts? It would be better to convert to logs and use limma-trend.
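
For readers following along, here is a minimal sketch of what a limma-trend analysis with explicit contrasts could look like. It is only an illustration under stated assumptions: the intensities are simulated, the group labels (`high.5`, `low.5`, `high.11`, `low.11`) are borrowed from the contrasts used later in this thread, and the object names (`intens`, `contr`, `fit2`) are hypothetical rather than the OP's actual data.

```r
library(limma)

set.seed(1)

# Hypothetical data: 200 proteins x 12 samples of strictly positive intensities
n_prot <- 200
groups <- factor(rep(c("high.5", "low.5", "high.11", "low.11"), each = 3))
intens <- matrix(rlnorm(n_prot * length(groups), meanlog = 10, sdlog = 0.3),
                 nrow = n_prot,
                 dimnames = list(paste0("prot", seq_len(n_prot)), NULL))

# Spike in a change for the first 10 proteins in one group (simulation only)
intens[1:10, groups == "high.11"] <- intens[1:10, groups == "high.11"] * 8

# Work on the log scale instead of calling voom()
y <- log2(intens)

# Group-means design: one coefficient per group
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

fit <- lmFit(y, design)

# Contrasts state explicitly which group is compared to which
contr <- makeContrasts(high11vslow11 = high.11 - low.11,
                       high5vslow5   = high.5 - low.5,
                       levels = design)
fit2 <- contrasts.fit(fit, contr)

# limma-trend: let the prior variance depend on average log-intensity
fit2 <- eBayes(fit2, trend = TRUE)

summary(decideTests(fit2))
```

Without the `contrasts.fit()` step, `decideTests()` would test each group mean against zero, which is why every protein shows up as "Up" in the original output.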


Gordon Smyth, thanks for the reply. I removed every row that contained at least one negative value:

proteome_raw<-proteome_raw[rowSums(proteome_raw<0)<1,]

I come from a data science background and I don't know this area. I am new to differential analysis and am just trying to apply it to proteomics data. At first I used the DEqMS package, as I had read that it gives better results on proteomics data than limma, but I ran into issues with the spectraCounteBayes function because I didn't have the PSM count data. Someone on my team told me that the PSM count method is old and shouldn't be used now. I don't have anything to compare, which is why I didn't use contrasts.

https://www.bioconductor.org/packages/release/bioc/vignettes/DEqMS/inst/doc/DEqMS-package-vignette.html#deqms-analysis

• Proteomics technologies do not give negative expression values, so it seems that some incorrect preprocessing has already been done on your data. It doesn't seem to be truly raw data.
• What do you think a differential expression analysis is? The whole purpose of such an analysis is to find proteins that change in expression between two conditions, so you always have to make a comparison between two conditions. As I pointed out in response to your previous post (Error in DEqMS proteomics analysis), you have 8 different conditions that need to be compared. You need to decide which condition you wish to compare to which. That's how science works ... you have to be clear about the hypotheses you wish to test.

Hi Gordon Smyth, as our lab has been testing out DEqMS recently I would be very interested in hearing your opinion on it if you have a minute. Can you elaborate on why you think its variance/PSM approach does not improve over standalone limma?


In this case OP doesn't have the data required for input to DEqMS. I think that allowing the variance to depend on the number of PSMs when that data is available is a good idea, but limma can already do that as part of its native code. For example you could use:

fit <- lmFit(y, design)
fit$Amean <- log(PSM)
fit <- eBayes(fit, trend=TRUE)
fit$Amean <- rowMeans(y)


where PSM is the PSM count and then you would already have the same variance/PSM approach that DEqMS proposes without the need for an extra package and an extra layer of analysis. In the devel version of limma we've made it one step easier again:

fit <- lmFit(y, design)
fit <- eBayes(fit, trend=log(PSM))


The use of native limma code further allows the PSM variance trend to be combined with robust empirical Bayes, something that the DEqMS code does not allow.
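
As a rough illustration of that last point, the sketch below combines the PSM variance trend with robust empirical Bayes by adding `robust=TRUE` to the `Amean` swap shown above. The data are simulated and the names `PSM`, `group` and `n_prot` are hypothetical; treat it as a sketch under those assumptions, not a recommended pipeline.

```r
library(limma)

set.seed(2)

# Hypothetical data: 300 proteins, 8 samples in two groups
n_prot <- 300
group <- factor(rep(c("A", "B"), each = 4))

# Hypothetical PSM count per protein
PSM <- sample(1:50, n_prot, replace = TRUE)

# Simulated log-intensities whose noise shrinks as the PSM count grows
# (the sd vector recycles down the rows, so row i uses 1 / sqrt(PSM[i]))
y <- matrix(rnorm(n_prot * length(group), mean = 20,
                  sd = rep(1 / sqrt(PSM), length(group))),
            nrow = n_prot)

design <- model.matrix(~ group)
fit <- lmFit(y, design)

# Temporarily use log(PSM) as the covariate for the variance trend
fit$Amean <- log(PSM)

# Trended prior variance plus robust estimation of the prior
fit <- eBayes(fit, trend = TRUE, robust = TRUE)

# Restore the usual average log-expression for reporting
fit$Amean <- rowMeans(y)

topTable(fit, coef = 2)
```

With `robust=TRUE`, proteins with unusually large or small variances have less influence on the estimated prior, so a few outlying proteins should not distort the moderation applied to the rest.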


I did use contrasts for the comparison, but the results still look wrong because every protein is NotSig. Do you think the issue is with the data? I will get back to the team and ask them to re-collect the data.

d0 <- DGEList(proteomeRaw_c144)
d0 <- calcNormFactors(d0)

y<-voom(d0,design,plot=T)
fit <- lmFit(y, design)
contrast =  makeContrasts(high11vslow11=high.11-low.11,
high5vslow5=high.5-low.5,
levels = colnames(coef(fit)))

fit.cont <- contrasts.fit(fit, contrast)

fit1 <- eBayes(fit.cont)

> summary(decideTests(fit1))
       high11vslow11 high5vslow5
Down               0           0
NotSig          8106        8106
Up                 0           0



voom() is only for sequencing data, not for proteomics


I came across Gordon Smyth's answer in another post:

https://support.bioconductor.org/p/64484/#64554


My answer you've linked to from 7 years ago says the same as I am telling you here: instead of using voom, it would be simpler and better to just log the counts, which you were already doing yourself a week ago (Error in DEqMS proteomics analysis). I have said the same thing consistently, for example:

We have never at any time recommended TMM normalization or voom for proteomics data. voom is specifically designed to model mean-variance trends in the presence of differing sequencing depths between samples, but the concept of sequencing depth doesn't apply to proteomics data. It is not that voom will give bad results, just that it is unnecessary.


Who says that is an incorrect result? It is perfectly possible that there are no significant differences between the groups you have compared.

As before, we have never recommended TMM normalization or voom for proteomics, although they should still give ok results on spectral counts, so I doubt that is the most pressing problem with your data or the cause of there being no significant DE.