Question: Ribosomal proteins differentially expressed?
3.9 years ago by aggregatibacter (Bonn, Germany):


I have RNA-seq data from human tissue biopsies with different diseases. My pipeline was FastQC - Trimmomatic - FastQC - STAR - featureCounts - voom/limma. I removed rRNAs during library prep, and also removed them from the Ensembl annotation GTF before the featureCounts step.

Now, I am puzzled to find quite a lot of ribosomal protein genes among the differentially expressed genes. I understand that these are not rRNAs, but I have never seen so many in one place.

Should I be worried, or could this be normal biology?


rna-seq • 2.6k views
written 3.9 years ago by aggregatibacter

Protein synthesis is often altered during stress. I think it has biological meaning and is not a technical artifact (I have seen it a lot in expression studies).

written 3.9 years ago by Benn

Thanks for your quick reply. I also found some papers that describe functions in inflammation etc. for these transcripts. They only make up roughly a third of my list, so I was wondering...

written 3.9 years ago by aggregatibacter

It might be a result of poor normalization of the counts. If it is biologically reasonable that there are different numbers of ribosomes in one condition than in the other, then the result should be valid. Otherwise, you might see an apparent change in genes that did not actually change, because other genes changed expression in the opposite direction and the normalization did not account for it. I recommend running the analysis with DESeq2 to see whether the result is reproducible with its normalization.
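To make the composition effect described above concrete, here is a minimal base-R sketch on simulated data (not the data from this thread). It assumes 1000 genes of which the first 100 are truly 8-fold up in condition B, and compares plain total-count scaling with a hand-written median-of-ratios size factor, which is the idea behind DESeq2's default size factors. All variable names are made up for the illustration.

```r
# Toy simulation: 100 of 1000 genes are truly up in B, the rest are unchanged.
set.seed(1)
n_genes <- 1000
base <- rgamma(n_genes, shape = 2, rate = 0.01) + 10  # true abundance per gene

true_a <- base
true_b <- base
true_b[1:100] <- true_b[1:100] * 8                    # genuinely up-regulated genes
unchanged <- 101:n_genes

# Both samples are sequenced to the same depth, so counts reflect each
# gene's share of the total RNA pool rather than its absolute abundance.
depth <- 1e6
counts <- cbind(A = round(depth * true_a / sum(true_a)),
                B = round(depth * true_b / sum(true_b)))

# 1) Total-count (CPM) scaling
cpm_mat <- sweep(counts, 2, colSums(counts), "/") * 1e6
lfc_total <- log2(cpm_mat[, "B"] / cpm_mat[, "A"])

# 2) Median-of-ratios size factors (hand-rolled, same idea as DESeq2's default)
log_geo_mean <- rowMeans(log(counts))
size_factors <- apply(counts, 2, function(s) exp(median(log(s) - log_geo_mean)))
norm_mat <- sweep(counts, 2, size_factors, "/")
lfc_mor <- log2(norm_mat[, "B"] / norm_mat[, "A"])

# Unchanged genes look down-regulated under total-count scaling,
# but sit close to zero after median-of-ratios normalization.
print(median(lfc_total[unchanged]))
print(median(lfc_mor[unchanged]))
```

In this toy setup the unchanged genes appear shifted downward under total-count scaling simply because the up-regulated genes take a larger share of the reads. A block of highly expressed genes, such as ribosomal protein genes, could plausibly produce or suffer from the same effect, which is why comparing normalization methods is a reasonable check.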

written 3.9 years ago by Asaf

Did you perform any size-factor normalization? The result is certainly biologically possible, but the pipeline you describe doesn't mention any step that would adjust for differences in library size (sequencing depth) between samples. Skipping that step would be an easy way for such differences to appear.
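As a rough sketch of what such a step can look like in an edgeR/limma workflow, the example below computes TMM normalization factors with `calcNormFactors()` before calling `voom()`. The count matrix, the sample names and the two-group design are simulated placeholders, not the poster's data, and TMM is only one of several reasonable choices.

```r
library(edgeR)  # attaches limma as well

set.seed(42)
# Simulated stand-in data: 2000 genes x 8 samples in two hypothetical groups
counts <- matrix(rnbinom(2000 * 8, mu = 200, size = 5), nrow = 2000,
                 dimnames = list(paste0("gene", 1:2000), paste0("s", 1:8)))
group <- factor(rep(c("ctrl", "disease"), each = 4))

dge <- DGEList(counts = counts, group = group)

# Drop lowly expressed genes and recompute library sizes afterwards
keep <- filterByExpr(dge, group = group)
dge <- dge[keep, , keep.lib.sizes = FALSE]

# TMM scaling factors; voom() then uses lib.size * norm.factors
dge <- calcNormFactors(dge, method = "TMM")
print(dge$samples$norm.factors)

design <- model.matrix(~ group)
v <- voom(dge, design, plot = FALSE)
fit <- eBayes(lmFit(v, design))
top <- topTable(fit, coef = 2, number = 10)
```

With real data, normalization factors far from 1 in one condition would hint that a composition difference between samples is being corrected.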

written 3.9 years ago by keith.hughitt

I used edgeR to account for the different library sizes (without an extra argument, so it should take the total counts). See the code below. Also, I get relatively large sample weights (judging from my experience with limma and arrays), some differing by more than 2-fold. Could this be a problem, too?

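library(edgeR)  # also loads limma, which provides voomWithQualityWeights()

# Build the DGEList from the featureCounts output; lib.size defaults to the column totals
x <- DGEList(counts = counts$counts, genes = counts$annotation)

# Keep genes with CPM > 75 in at least 25 samples
isexpr <- rowSums(cpm(x) > 75) >= 25
x <- x[isexpr, ]

design <- model.matrix(~ 1 + targets$var1 + targets$var2)

# voom with sample quality weights and quantile normalization
y <- voomWithQualityWeights(x, design, plot = FALSE, normalize.method = "quantile")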

Many thanks!

modified 3.9 years ago • written 3.9 years ago by aggregatibacter