Question: When performing Fisher's exact test, should I consider all annotated or all detected transcripts?
14 months ago, shawn.w.foley wrote:

I've performed RNA-seq on 30 cell lines, and I am trying to determine whether oncogenes are enriched among genes that are highly expressed (>50 RPKM) in more than 15 cell lines. Of the ~20,000 annotated mRNAs, only ~10,000 are expressed in at least one cell line (RPKM > 1). For Fisher's exact test, I will build a 2x2 matrix that cross-classifies the genes in the background set as highly expressed or not, and as oncogene or not.

My question is: should I consider only detectable genes (and detectable oncogenes) as the background when I perform Fisher's test, or should I consider all annotated genes?

I think I should only consider genes that are detectable in one or more cell lines, and subset the list of oncogenes accordingly. It seems unfair to test for enrichment against all 20,000 annotated genes when only half of them are actually expressed. Or am I overthinking this problem?

Thank you!

14 months ago, Hussain Ather (National Institutes of Health, Bethesda, MD) wrote:

You'd want to use the detectable genes. You can read more about Fisher's exact test in RNA-Seq in this paper. I think Table 1 might help.
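To make the effect of the background choice concrete, here is a minimal sketch in plain Python. It computes the one-sided (enrichment) Fisher's exact p-value as the upper tail of the hypergeometric distribution. The function name `fisher_enrichment_p` and every count below are hypothetical and chosen only for illustration; they are not taken from the question or from the linked paper.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for enrichment.

    N: number of genes in the background set
    K: number of oncogenes in the background set
    n: number of highly expressed genes (drawn from the background)
    k: number of highly expressed genes that are oncogenes

    Returns P(X >= k) where X is hypergeometric(N, K, n).
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts (assumptions, for illustration only).
highly_expressed = 500          # genes >50 RPKM in >15 cell lines
highly_expressed_oncogenes = 30

# Background 1: only genes detectable in at least one cell line.
p_detectable = fisher_enrichment_p(
    k=highly_expressed_oncogenes, n=highly_expressed, K=300, N=10_000
)

# Background 2: all annotated genes, including never-detected oncogenes.
p_annotated = fisher_enrichment_p(
    k=highly_expressed_oncogenes, n=highly_expressed, K=400, N=20_000
)

print(f"detectable background: p = {p_detectable:.3g}")
print(f"annotated background:  p = {p_annotated:.3g}")
```

With these made-up numbers, the annotated background gives the smaller p-value, because undetected genes can never land in the highly expressed set and so lower the expected overlap. If SciPy is available, `scipy.stats.fisher_exact` with `alternative='greater'` on the corresponding 2x2 table should give the same one-sided result.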


This looks perfect, thank you!

Reply written 14 months ago by shawn.w.foley