Question: Add RPKM values together for multiple different genes?
tpham2654 wrote (6 months ago):

Is it OK to add RPKM values for multiple different genes together? The genes I want to add together all encode subunits of the same protein (NF-κB), and I have separate gene-level (NOT transcript-level) RPKM values for each of the components I am interested in (REL, RELA, RELB, NFKB1, and NFKB2).

I'm aware that you can use GSEA, but it seems kind of silly to do so for a gene set of 5 genes total.

gsea rna-seq rpkm • 300 views

You can't simply sum RPKMs. Before writing any more about that, though, please describe further what you actually want to do with such a value. Please note that RPKMs have very limited utility.

written 6 months ago by Devon Ryan
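
One reason a plain sum is hard to interpret can be shown with a little arithmetic. This is an illustration, not necessarily the point the commenter had in mind. RPKM divides the read count by the feature length in kilobases and by the number of mapped reads in millions, so adding two RPKMs weights each gene by the inverse of its length and does not give the RPKM of the two genes taken together. The read counts, gene lengths, and library size below are hypothetical numbers chosen to keep the arithmetic simple.

```python
def rpkm(reads, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return reads / ((length_bp / 1_000) * (total_mapped_reads / 1_000_000))


# Hypothetical values, for illustration only (not from the thread).
TOTAL_MAPPED = 20_000_000
gene_a = {"reads": 400, "length_bp": 1_000}
gene_b = {"reads": 400, "length_bp": 4_000}

rpkm_a = rpkm(gene_a["reads"], gene_a["length_bp"], TOTAL_MAPPED)
rpkm_b = rpkm(gene_b["reads"], gene_b["length_bp"], TOTAL_MAPPED)

# Naive sum of the two per-gene RPKMs.
summed = rpkm_a + rpkm_b

# RPKM computed as if both genes were a single feature:
# pooled reads over pooled length.
pooled = rpkm(
    gene_a["reads"] + gene_b["reads"],
    gene_a["length_bp"] + gene_b["length_bp"],
    TOTAL_MAPPED,
)

print(rpkm_a, rpkm_b, summed, pooled)
```

Both genes contribute the same number of reads here, yet the shorter gene dominates the sum, and the sum differs from the pooled value. Neither number is "the" expression of the complex.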

I am comparing two groups of mouse samples (3 per group; one group has a gene knocked out, the other does not) to see how the expression of NF-κB changes due to the knockout.

I figured I could not just add RPKMs, since I saw that said in other questions online, although there people were talking about different transcripts. So what should I do?

written 6 months ago by tpham2654

Why don't you check if the genes that encode the subunits are differentially expressed with an appropriate framework like DESeq2?

modified 6 months ago • written 6 months ago by ATpoint

I have differential expression results for all the genes in my samples. The thing is, I was asked to summarize the expression by somehow condensing all the NF-κB data into 1 bar per group on a bar graph.

written 6 months ago by tpham2654

Your job isn't to give people what they ask for, it's to give them what they actually need whether they know it or not.

written 6 months ago by Devon Ryan

To add an explanation to this: condensing the expression profiles of duplicated genes into a single value and presenting only that value is inadequate, because duplication can lead to neofunctionalization and functional diversification, and also to diversification of gene regulation (e.g. Kleinjan et al. 2008).

NF-κB is a protein complex. Expression and regulation of the components of a protein complex can vary widely, and it is questionable whether a mean or median expression over all of its genes is meaningful. I would rather look at patterns of co-expression, e.g. by correlation and subsequent network analysis.

modified 6 months ago • written 6 months ago by Michael Dondrup
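
As a rough sketch of the correlation step mentioned above, pairwise Pearson correlations between subunit genes across samples could be computed as follows. The expression values are made up for illustration (six samples, deliberately constructed so the signs are obvious), and `pearson` and `pairwise_correlations` are hypothetical helper names. With only six samples, real correlation estimates would be very noisy, so this shows the mechanics rather than a recommended analysis.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def pairwise_correlations(expr):
    """Correlation for every pair of genes, keyed by sorted gene-name pair."""
    return {
        (g1, g2): pearson(expr[g1], expr[g2])
        for g1, g2 in combinations(sorted(expr), 2)
    }


# Made-up normalized expression values: 3 control samples, then 3 knockout samples.
expr = {
    "RELA": [10.0, 12.0, 11.0, 6.0, 5.0, 7.0],
    "NFKB1": [20.0, 24.0, 22.0, 12.0, 10.0, 14.0],  # tracks RELA
    "RELB": [3.0, 1.0, 2.0, 7.0, 8.0, 6.0],  # moves opposite to RELA
}

corr = pairwise_correlations(expr)
for pair, r in corr.items():
    print(pair, round(r, 3))
```

In this toy data, two subunits move together and one moves in the opposite direction, which is exactly the kind of pattern a single mean or sum per group would hide.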

Using bar plots to summarize data is a big no-go, since they can obscure the data. Especially when you only have 3 replicates, showing the raw data is essential for transparency.

You could do a point plot instead (same layout as a bar plot, except that you have 3 points above one another instead of a bar).

written 6 months ago by kristoffer.vittingseerup
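
A minimal sketch of such a point plot, assuming matplotlib is available for the drawing step. The values, the group labels, and the helper names `point_plot_coords` and `draw` are hypothetical. The small horizontal offset between replicates is an optional addition so that identical values do not overlap; set `spread=0` to stack the points exactly above one another.

```python
def point_plot_coords(values_by_group, spread=0.08):
    """Return x positions, y values, and tick labels with one point per replicate."""
    xs, ys, labels = [], [], []
    for i, (group, values) in enumerate(values_by_group.items()):
        n = len(values)
        for j, value in enumerate(values):
            # Centre the replicates of each group around integer position i.
            xs.append(i + (j - (n - 1) / 2) * spread)
            ys.append(value)
        labels.append(group)
    return xs, ys, labels


def draw(values_by_group, gene, path):
    """Write the point plot to an image file (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    xs, ys, labels = point_plot_coords(values_by_group)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xticks(list(range(len(labels))))
    ax.set_xticklabels(labels)
    ax.set_xlim(-0.5, len(labels) - 0.5)
    ax.set_ylabel(f"{gene} expression (normalized)")
    fig.savefig(path)
    plt.close(fig)


# Made-up values for one subunit gene: 3 replicates per group.
rela = {"control": [10.0, 12.0, 11.0], "knockout": [6.0, 5.0, 7.0]}
xs, ys, labels = point_plot_coords(rela)
```

Calling `draw(rela, "RELA", "rela_points.png")` would write one panel; repeating it per subunit keeps every replicate of every gene visible.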

Then it would be better to make two boxplots or a beeswarm plot with labelled data points for the subunits.

modified 6 months ago • written 6 months ago by ATpoint

I suggest you use GSEA, even for 5 genes, since that won't mask a lack of change or a discordant change in the components.

written 6 months ago by Devon Ryan
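
For readers unfamiliar with what GSEA computes, here is a simplified sketch of a running-sum enrichment score. It is the unweighted, Kolmogorov-Smirnov-style variant: walking down a ranked gene list, the score goes up at each gene in the set and down at each gene outside it, and the result is the largest deviation from zero. It is not the weighted statistic that the GSEA software uses by default, and it omits the permutation step needed for significance. The ranked list and the `GENE_*` placeholders are hypothetical, and `enrichment_score` is an illustrative name.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (maximum deviation from zero)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and one outside the set")

    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up for set members, down for everything else.
        running += 1 / n_hit if is_hit else -1 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking, from most up-regulated to most down-regulated.
ranked = ["RELA", "GENE_A", "NFKB1", "GENE_B", "GENE_C", "GENE_D", "RELB", "GENE_E"]
nfkb_set = {"RELA", "NFKB1", "RELB"}

es = enrichment_score(ranked, nfkb_set)
print(round(es, 3))
```

A set whose members all sit at the top of the ranking scores 1, one at the bottom scores -1, and a set whose members are scattered or move in opposite directions scores closer to 0.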