Question: When performing Fisher's exact test, should I consider all annotated or all detected transcripts?
4 months ago, shawn.w.foley wrote:

I've performed RNA-seq on 30 cell lines and am trying to determine whether oncogenes are enriched among genes that are highly expressed (>50 RPKM) in more than 15 cell lines. Of the ~20,000 annotated mRNAs, only ~10,000 are expressed (RPKM > 1) in at least one cell line. For my Fisher's exact test, I will build a 2x2 matrix comparing highly expressed genes, oncogenes, and all detectable genes.

My question is: should I consider only detectable genes (and detectable oncogenes) when I perform Fisher's test, or should I consider all annotated genes?

I think I should only consider genes that are detectable in one or more cell lines and subset the list of oncogenes accordingly. It seems unfair to look for an enrichment among the 20,000 annotated genes when only half of them are actually expressed. Or am I overthinking this problem?
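
To make the 2x2 matrix concrete, here is a minimal Python sketch of how the four cells could be counted once a background is chosen. The function name `contingency_table` and the toy gene IDs are hypothetical, made up for illustration; the key point is that both the highly expressed list and the oncogene list are intersected with the background before counting.

```python
def contingency_table(highly_expressed, oncogenes, background):
    """Count the 2x2 table [[a, b], [c, d]] within `background` only.

    a: highly expressed oncogenes
    b: highly expressed non-oncogenes
    c: oncogenes that are not highly expressed
    d: all remaining background genes
    """
    background = set(background)
    # Genes outside the background are dropped from both lists.
    high = set(highly_expressed) & background
    onco = set(oncogenes) & background
    a = len(high & onco)
    b = len(high - onco)
    c = len(onco - high)
    d = len(background) - a - b - c
    return [[a, b], [c, d]]


# Toy data (hypothetical): 10 detected genes, one oncogene never detected.
detected = {f"gene{i}" for i in range(10)}
oncogenes = {"gene0", "gene1", "gene2", "gene50"}  # gene50 is not detected
highly_expressed = {"gene0", "gene1", "gene5"}

table = contingency_table(highly_expressed, oncogenes, detected)
print(table)
```

With the detected genes as the background, the undetected oncogene (`gene50`) never enters the table.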

Thank you!

4 months ago, Hussain Ather (National Institutes of Health, Bethesda, MD) wrote:

You'd want to use the detectable genes. You can read more about Fisher's exact test in RNA-Seq in this paper. I think Table 1 might help.
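
As a rough illustration of why the choice of background matters, the sketch below computes a one-sided (enrichment) Fisher's exact p-value as a hypergeometric tail, using only the Python standard library. All counts are hypothetical and chosen only for illustration: 500 highly expressed genes, 40 of them oncogenes, 300 detected oncogenes among 10,000 detected genes, and 400 annotated oncogenes among 20,000 annotated genes. The function name `fisher_enrichment_p` is also made up for this example.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Returns P(X >= a) for a hypergeometric X, with the row and column
    totals of the table held fixed.
    """
    n = a + b          # highly expressed genes
    K = a + c          # oncogenes in the background
    N = a + b + c + d  # background size
    # Sum exact integer counts first, then divide once.
    numerator = sum(comb(K, k) * comb(N - K, n - k)
                    for k in range(a, min(n, K) + 1))
    return numerator / comb(N, n)


# Hypothetical counts, detected genes as background (N = 10,000).
p_detected = fisher_enrichment_p(40, 460, 260, 9240)

# Same overlap, all annotated genes as background (N = 20,000).
p_annotated = fisher_enrichment_p(40, 460, 360, 19140)

print(f"detected background:  p = {p_detected:.3g}")
print(f"annotated background: p = {p_annotated:.3g}")
```

With these made-up numbers the annotated background yields the smaller p-value, because the undetected genes, which could never be called highly expressed, lower the expected overlap. If SciPy is available, `scipy.stats.fisher_exact(table, alternative="greater")` should give the same one-sided p-value.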


This looks perfect, thank you!

(reply written 4 months ago by shawn.w.foley)