How to categorize genes by their expression levels in RNA-seq?
4.2 years ago
statfa ▴ 540

Hi,

How can we categorize genes by their expression levels? Are there any criteria based on read counts?

Here is a similar question, but it does not answer mine.

I wish to categorize genes like this:

low expression

medium expression

high expression

Thanks a lot

Gene Expression RNA-seq categorize • 858 views
4.2 years ago

It depends on the reason for the categorization and on the statistic you want to use for your test. From a statistical point of view, the best and easiest case to explain is when you have control data that you can transform with basic functions to an approximately normal distribution. You then find the mean and standard deviation and decide, for example, that everything more than 2 standard deviations below the mean is low and everything more than 2 standard deviations above it is high. Using a Q-Q plot at that point to remove outliers is a way to clean the data a bit. On that plot you may see a part of the distribution with a different mean and standard deviation. This is usually due to noise, and you can remove it or correct for it.

In the case of RNA-seq you probably have some sort of FPKM or similar measure. A log transform is one thing to try; at least, this is what I try first with RNA-seq data. Sometimes this cannot be done.
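To make the mean and standard deviation idea concrete, here is a minimal Python sketch. It assumes the expression values (FPKM or similar) are roughly normal after a log2 transform, which should be checked on real data first. The function name `categorize_by_sd`, the pseudocount of 1 and the example values are made up for illustration, not taken from any package.

```python
import math
import statistics


def categorize_by_sd(expression, n_sd=2.0, pseudocount=1.0):
    """Label genes low/medium/high from log2-transformed expression.

    expression: dict mapping gene name -> non-negative value (e.g. FPKM).
    n_sd: number of standard deviations from the mean used as the cutoff.
    pseudocount: added before the log so that zeros are defined
                 (an assumption of this sketch, not a fixed rule).
    """
    logs = {g: math.log2(v + pseudocount) for g, v in expression.items()}
    mean = statistics.mean(logs.values())
    sd = statistics.stdev(logs.values())  # sample standard deviation
    lower, upper = mean - n_sd * sd, mean + n_sd * sd

    labels = {}
    for gene, x in logs.items():
        if x < lower:
            labels[gene] = "low"
        elif x > upper:
            labels[gene] = "high"
        else:
            labels[gene] = "medium"
    return labels


# Hypothetical toy values, only to show the call.
toy = {"a": 0, "b": 7, "c": 7, "d": 7, "e": 63}
toy_labels = categorize_by_sd(toy, n_sd=1.0)
```

With only a handful of genes almost nothing falls outside 2 standard deviations, so the toy call uses `n_sd=1.0`; on a full transcriptome the default of 2 is closer to what is described above.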


Thanks for the info. I thought there might be some predetermined criteria. For example, according to the edgeR manual, genes with counts below 5 or 10 are considered lowly expressed. I wanted to know the criteria for the other levels, but it seems I would have to calculate them from control data, and I don't have any.

The reason I'm looking for these levels is that I want to examine which of my two DEG detection models does better at detecting highly, medium and lowly expressed genes as DE.

Thank you


I see you do not have any controls for normalization. One way around this is to use a subset of genes that are stably expressed across your samples and conditions. Usually these are housekeeping genes. Normalize your data based on them.
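A rough sketch of what normalizing on housekeeping genes could look like in Python. It assumes you already have a list of genes you trust to be stable and that their counts are all positive. The per-sample factor here is the geometric mean of those genes, which is one possible choice rather than the standard method; the names `housekeeping_normalize`, `hk_a` and `hk_b` and the counts are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    values = list(values)
    if any(v <= 0 for v in values):
        raise ValueError("housekeeping counts must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def housekeeping_normalize(counts, housekeeping):
    """Scale each sample by a factor derived from housekeeping genes.

    counts: dict mapping sample -> {gene: count}.
    housekeeping: list of gene names assumed to be stably expressed.
    Returns (normalized counts, per-sample scaling factors).
    """
    # Per-sample reference level: geometric mean of the housekeeping genes.
    ref = {
        sample: geometric_mean(genes[h] for h in housekeeping)
        for sample, genes in counts.items()
    }
    # Centre the factors so their geometric mean across samples is 1.
    centre = geometric_mean(ref.values())
    factors = {sample: r / centre for sample, r in ref.items()}
    normalized = {
        sample: {gene: c / factors[sample] for gene, c in genes.items()}
        for sample, genes in counts.items()
    }
    return normalized, factors


# Hypothetical example: sample s2 was sequenced twice as deeply as s1.
raw = {
    "s1": {"hk_a": 100, "hk_b": 400, "gene_x": 50},
    "s2": {"hk_a": 200, "hk_b": 800, "gene_x": 100},
}
norm, factors = housekeeping_normalize(raw, ["hk_a", "hk_b"])
```

After scaling, `gene_x` has the same value in both samples, because the only difference between them was depth. The result depends entirely on the housekeeping genes really being stable, so it is worth checking that before relying on it.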


Oh, OK. Thanks a lot. I haven't done this before, but I'll try and see if I can handle it.
