Background gene list for functional analysis
3.4 years ago

I am analyzing RNA-seq data from cell lines that have been modified to overexpress or underexpress gene 'A'. I performed a likelihood ratio test with DESeq2, followed by clustering with degPatterns(), and identified two small groups of genes (36 and 44 genes) whose expression is directly linked to that of 'A'.

I would like to perform functional analysis with these two groups, but am unsure what to use as a background gene list for programs such as GOrilla. I am also unsure if the small size of these gene groups will pose an issue.

R RNA-Seq • 744 views
3.4 years ago
ATpoint 82k

For the kind of task you describe, I personally use all genes that actually went into the final analysis, which usually means genes with sufficient counts to be considered in the differential analysis. Non-expressed and very lowly expressed genes make little sense to include because they add no information to the analysis.
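To illustrate why the choice of background matters, here is a minimal sketch of a one-sided hypergeometric enrichment test, written in plain Python rather than R so it runs without extra packages. All numbers are made up for illustration: a 44-gene list with 5 genes in a term annotated to 150 genes, tested against a background of 12000 tested genes versus 20000 annotated genes. For simplicity the term size is held fixed across both backgrounds, which would not be exactly true in practice.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which carry the annotation of interest."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical numbers, not taken from the question.
list_size = 44   # genes in the cluster of interest
hits = 5         # cluster genes annotated to the term
term_size = 150  # background genes annotated to the term (assumed fixed)

p_tested = hypergeom_sf(hits, 12000, term_size, list_size)  # background: tested genes
p_genome = hypergeom_sf(hits, 20000, term_size, list_size)  # background: all genes

print(f"background of tested genes: p = {p_tested:.3g}")
print(f"background of all genes:    p = {p_genome:.3g}")
```

With the term size held fixed, the larger background lowers the expected overlap and therefore gives a smaller p-value for the same gene list, which is how an overly broad background can inflate apparent enrichment.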

In DESeq2 that could be the genes without NA in padj (i.e. those surviving the independent filtering), and in edgeR it could be all genes that pass the filterByExpr() filter. In single-cell RNA-seq I filter even more strictly: when comparing clusters of cells, I only consider genes that are expressed in at least 10% of cells in at least one of the clusters being compared. It all comes down to focusing on the genes that meaningfully contribute to the analysis; those are the ones that should be considered as background.
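The filtering logic described above can be sketched in a language-agnostic way. The snippet below is plain Python, not DESeq2 or edgeR code: the results table, the gene names, and the helper functions `background_genes` and `expressed_in_any_cluster` are all hypothetical, with `padj=None` standing in for an NA adjusted p-value.

```python
# Hypothetical differential-expression results; padj=None stands in for NA.
results = [
    {"gene": "geneA", "padj": 0.0004},
    {"gene": "geneB", "padj": 0.21},
    {"gene": "geneC", "padj": None},  # filtered out, so never tested
    {"gene": "geneD", "padj": 0.03},
    {"gene": "geneE", "padj": None},
]


def background_genes(rows):
    """Return the genes that received an adjusted p-value, i.e. were actually tested."""
    return [row["gene"] for row in rows if row["padj"] is not None]


def expressed_in_any_cluster(fraction_by_cluster, min_fraction=0.10):
    """Single-cell variant: keep a gene if it is detected in at least
    min_fraction of the cells in at least one of the compared clusters."""
    return any(f >= min_fraction for f in fraction_by_cluster.values())


background = background_genes(results)
print(background)
```

Whatever tool is used for the enrichment itself, the genes of interest should be a subset of the background built this way.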
