Identifying differentially expressed loci for gene duplicates using DESeq2
3 months ago
pl23 • 0


I am studying the gene expression of a species that has undergone a duplication event. I have a synteny table of gene duplicates for multiple tissue types, which was derived using the genome of a related ancestral species (that existed prior to the duplication event).

I want to identify loci where the two duplicates differ significantly in expression, and I was wondering whether I could use DESeq2 to do this. In particular, I was going to set up a count table whose samples are all tissue x duplicate combinations, laid out as follows:

Locus_id | t1_d1_r1 | t1_d1_r2 | t1_d1_r3 | t1_d2_r1 | t1_d2_r2 | t1_d2_r3 | t2_d1_r1 | t2_d1_r2 | t2_d1_r3 | ...

Here t denotes the tissue type, d denotes the duplicate (corresponding to subgenomes 1 and 2), and r indicates one of three replicates. I was then considering constructing a design matrix that can identify differentially expressed loci, for example by conducting a likelihood ratio test to see whether the duplicate factor is significant in the design.
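To make the intended layout concrete, here is a minimal sketch of how the column names above could be split into the tissue, duplicate, and replicate factors that a design matrix would be built from. It is plain Python used only for illustration (DESeq2 itself is an R package); the helper names and the two formula strings are my own assumptions, not something I have run through DESeq2.

```python
import re

# Column names follow the t<tissue>_d<duplicate>_r<replicate> pattern from the table above.
COLUMN_PATTERN = re.compile(r"^t(\d+)_d(\d+)_r(\d+)$")


def parse_sample(column):
    """Split a column name such as 't1_d2_r3' into its design factors."""
    match = COLUMN_PATTERN.match(column)
    if match is None:
        raise ValueError(f"unexpected column name: {column!r}")
    tissue, duplicate, replicate = match.groups()
    return {
        "sample": column,
        "tissue": "t" + tissue,
        "duplicate": "d" + duplicate,
        "replicate": "r" + replicate,
    }


def build_sample_table(columns):
    """Return one metadata row per count-table column, in column order."""
    return [parse_sample(column) for column in columns]


# Example with two tissues, two duplicates, and three replicates each.
columns = [f"t{t}_d{d}_r{r}" for t in (1, 2) for d in (1, 2) for r in (1, 2, 3)]
sample_table = build_sample_table(columns)

# Hypothetical formulas for the likelihood ratio test: the full model includes
# the duplicate factor and the reduced model drops it.
FULL_DESIGN = "~ tissue + duplicate"
REDUCED_DESIGN = "~ tissue"

for row in sample_table[:4]:
    print(row)
```

The resulting rows would play the role of the sample metadata table, with one row per column of the count matrix. Whether an additive model or one with a tissue:duplicate interaction is more appropriate is part of what I am unsure about.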

My question is whether this violates the assumptions of the DESeq2 framework. I assumed that, because the gene pairs are duplicates, it is acceptable to estimate the mean and dispersion for each gene pair.

Any feedback on this is much appreciated.


rna-seq deseq2 • 232 views
