How does reference genome/transcriptome affect gene expression significance?
11 months ago
sbitterw ▴ 10

I am examining the impact of stressors on the gene expression of corals using TagSeq. This method relies on a reference transcriptome/genome to identify genes from transcriptome data. Fortunately, there are quite a few transcriptomes/genomes for my organism. However, the number of genes recorded in those reference databases can vary: one genome had ~30,000 genes while another had ~60,000 genes.

The choice of reference greatly affected my DESeq2 results: the reference with fewer genes yielded more differentially expressed genes, while with the larger genome no genes were differentially expressed.

Am I right in thinking that the number of genes tested affects the FDR-adjusted p-values? What should I do to address this disparity between genomes? Should I simply rely on a reference transcriptome instead?
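
To make the FDR question concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not DESeq2's own code, and it ignores the independent filtering that DESeq2 applies by default. The p-values are made up for illustration: the same five small raw p-values are adjusted alongside either 95 or 995 hypothetical null genes.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data: five "signal" genes with identical raw p-values in both runs.
signal = [0.0001, 0.0002, 0.0003, 0.0004, 0.0005]

# Null genes with evenly spread p-values (purely illustrative).
small_reference = signal + [(k + 1) / 96 for k in range(95)]    # 100 tests
large_reference = signal + [(k + 1) / 996 for k in range(995)]  # 1000 tests

padj_small = bh_adjust(small_reference)
padj_large = bh_adjust(large_reference)

# Largest adjusted p-value among the five signal genes in each run.
print(max(padj_small[:5]), max(padj_large[:5]))
```

In this toy setup the five signal genes get adjusted p-values of roughly 0.01 with 100 tests and roughly 0.1 with 1000 tests, so they would pass a 0.05 cutoff only in the smaller set, even though their raw p-values are identical.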

gene expression transcriptomics deseq2 • 163 views
