Question: When performing Fisher's exact test, should I consider all annotated or all detected transcripts?
11 months ago by
shawn.w.foley130 wrote:

I've performed RNA-seq on 30 cell lines, and I'm trying to determine whether oncogenes are enriched among genes that are highly expressed (>50 RPKM) in more than 15 cell lines. Of the ~20,000 annotated mRNAs, only ~10,000 are expressed in at least one cell line (RPKM > 1). For Fisher's test, I will build a 2x2 matrix that cross-tabulates highly expressed genes against oncogenes, with all detectable genes as the background.

My question is: should I consider only detectable genes (and detectable oncogenes) when I perform Fisher's test, or should I consider all annotated genes?

I think I should consider only genes that are detectable in one or more cell lines, and subset the list of oncogenes accordingly. It seems unfair to test for enrichment against all 20,000 annotated genes when only half of them are actually expressed. Or am I overthinking this problem?

Thank you!

11 months ago by
Hussain Ather880
National Institutes of Health, Bethesda, MD
Hussain Ather880 wrote:

You'd want to use the detectable genes. You can read more about Fisher's exact test in RNA-Seq in this paper. I think Table 1 might help.
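To make the choice of background concrete, here is a minimal sketch of a one-sided Fisher's exact test written with only the Python standard library. The gene IDs, the toy counts, and the function name `enrichment_p_value` are made up for illustration and are not from the original post. The sketch restricts both gene lists to the chosen background before building the 2x2 table, then sums the hypergeometric upper tail.

```python
from math import comb


def enrichment_p_value(background, oncogenes, highly_expressed):
    """One-sided Fisher's exact test for enrichment (hypothetical helper).

    Both gene lists are first restricted to the background, so genes that
    are outside the background never enter the 2x2 table.
    """
    background = set(background)
    onco = set(oncogenes) & background
    high = set(highly_expressed) & background

    N = len(background)   # all genes in the background
    K = len(onco)         # oncogenes in the background
    n = len(high)         # highly expressed genes in the background
    k = len(onco & high)  # highly expressed oncogenes

    # Rows: highly expressed / not; columns: oncogene / not.
    table = [[k, n - k], [K - k, N - n - K + k]]

    # P(X >= k) for X ~ Hypergeometric(N, K, n).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return table, tail / comb(N, n)


# Toy data (assumed): 20 detected genes plus 20 annotated-but-undetected genes.
detected = [f"g{i}" for i in range(20)]
annotated = detected + [f"u{i}" for i in range(20)]
oncogenes = ["g0", "g1", "g2", "g3", "g4", "u0"]  # u0 is never detected
highly_expressed = ["g0", "g1", "g2", "g3", "g10", "g11"]

table_detected, p_detected = enrichment_p_value(detected, oncogenes, highly_expressed)
table_annotated, p_annotated = enrichment_p_value(annotated, oncogenes, highly_expressed)

print("detected background: ", table_detected, p_detected)
print("annotated background:", table_annotated, p_annotated)
```

In this toy example, padding the background with undetected genes makes the p-value smaller, even though none of those genes could have been called highly expressed. If SciPy is available, passing the same table to `scipy.stats.fisher_exact` with `alternative="greater"` should give the same one-sided p-value.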


This looks perfect, thank you!

written 11 months ago by shawn.w.foley130