Question: During GO / pathway enrichment analysis, should we exclude genes that are not expressed in either group?
Asked 23 months ago by CY (United States)

We found DE genes and then performed GO / pathway enrichment analysis (Fisher's exact test is used, right?).

What we are doing now is based on these two ratios:
1: number of genes in a specific GO term / total number of genes
2: number of DE genes in that GO term / total number of DE genes

Someone suggested that we should exclude genes that are not expressed in either group:

1: total number of genes -> total number of genes - genes that are not expressed in either group
2: total number of DE genes -> total number of DE genes - genes that are not expressed in either group

This suggestion also makes some sense, considering that genes in the DE list have to be expressed in at least one group.

Can anyone share some comments? Thanks

written 23 months ago by CY
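
To make the effect of the background choice concrete, here is a minimal sketch in plain Python. It assumes a one-sided Fisher's exact test for over-representation, which is the same as the upper tail of the hypergeometric distribution. The function name `enrichment_p_value` and the toy gene IDs are hypothetical and only serve the illustration; real tools may differ in their details.

```python
from math import comb


def enrichment_p_value(universe, de_genes, term_genes):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Genes outside `universe` are ignored, so changing the universe
    changes both the DE list and the GO term membership that are counted.
    """
    universe = set(universe)
    de = set(de_genes) & universe
    term = set(term_genes) & universe

    n_universe = len(universe)   # background size
    n_term = len(term)           # genes annotated to the term
    n_de = len(de)               # DE genes
    n_overlap = len(de & term)   # DE genes annotated to the term

    # P(overlap >= n_overlap) when n_de genes are drawn at random
    tail = sum(
        comb(n_term, i) * comb(n_universe - n_term, n_de - i)
        for i in range(n_overlap, min(n_term, n_de) + 1)
    )
    return tail / comb(n_universe, n_de)


# Toy data (hypothetical): 20 annotated genes, only the first 10 expressed
all_genes = [f"g{i}" for i in range(1, 21)]
expressed = all_genes[:10]
de_genes = ["g1", "g2", "g3", "g4"]
term_genes = ["g1", "g2", "g3", "g5", "g6"]

p_all = enrichment_p_value(all_genes, de_genes, term_genes)
p_expressed = enrichment_p_value(expressed, de_genes, term_genes)
print(f"all genes as background:       p = {p_all:.4f}")
print(f"expressed genes as background: p = {p_expressed:.4f}")
```

In this toy example the p-value is about 0.032 with all 20 genes as background and about 0.26 with only the 10 expressed genes, so the choice of background can change the conclusion. How large the shift is in practice depends on the data.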

Have a look here: http://lrpath.ncibi.org/

It might be helpful.

written 23 months ago by A

I didn't see anything answering my question in your link. Can you share some comments?

written 23 months ago by CY

I suggested this because GO and pathway analysis can be done starting from the differential expression data itself. For instance, you provide raw read counts from RNA-seq and the program gives you GO terms and pathways.

written 23 months ago by A

Yes. My question is: when working from the raw read counts, should I exclude the genes that were not expressed in either group?

written 23 months ago by CY

If you mean genes with extremely low expression or genes that are zero in all samples, I usually remove these genes beforehand. If not, sorry, I don't know.

modified 23 months ago • written 23 months ago by A

Yes, that is what I am asking. I used to not exclude genes that are not expressed at all. I guess that was wrong.

written 23 months ago by CY

Yes, just remove them, but this is done at the raw count stage, for example by removing all transcripts (genes) whose mean raw count is <10. Genes with high numbers of NAs can also be filtered out. The exact filtering applied prior to normalisation and differential expression analysis varies from study to study.

Then, by the time that you reach the gene enrichment stage, you can have high confidence that the genes that you have included are by default expressed in both groups.

written 23 months ago by Kevin Blighe
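
The mean-count filter described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper `filter_low_counts`, the gene names and the counts are hypothetical, and the threshold of 10 is the example value from the reply, not a universal rule.

```python
from statistics import mean


def filter_low_counts(counts, min_mean=10):
    """Keep genes whose mean raw count across all samples is >= min_mean."""
    return {
        gene: values
        for gene, values in counts.items()
        if mean(values) >= min_mean
    }


# Hypothetical raw counts: samples 1-2 are group 1, samples 3-4 are group 2
raw_counts = {
    "geneA": [0, 0, 0, 0],        # not expressed anywhere -> removed
    "geneB": [3, 5, 2, 4],        # very low expression    -> removed
    "geneC": [120, 98, 15, 22],   # expressed in both groups
    "geneD": [0, 0, 40, 55],      # expressed in group 2 only
}

kept = filter_low_counts(raw_counts)

# The filtered gene set is what then serves as the background (universe)
# for the enrichment test, instead of every annotated gene.
background = sorted(kept)
print(background)
```

The genes that survive this step are the ones used for differential expression, and the same set can be passed as the background for the GO / pathway test. Note that a mean-based filter still keeps a gene such as `geneD`, which is silent in one group but clearly expressed in the other.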