It sounds quite trivial, but somehow I couldn't find a way to do it. I have an RNA-seq experiment with 3 treatment replicates and 9 controls. Because the raw counts of the controls looked strange by eye, I normalised and clustered them, and they fell into 3 different groups. Although this already tells me the controls differ from one another, I wanted to see whether their DESeq2 outputs would look similar. I picked 2 of the 3 clusters and ran DESeq2 on each against the same treatment replicates. So the structure is
Comparison 1: TreatmentRep1 TreatmentRep2 TreatmentRep3 Control1 Control2 Control3
and
Comparison 2: TreatmentRep1 TreatmentRep2 TreatmentRep3 Control4 Control5 Control6
Now what I want to see is the correlation between the two DESeq2 outputs, using geneNames as the key that links them. Probably the best way to compare them is on their log2 fold change (log2FC) values. Accordingly, what I have is a data frame like this:
geneNames  L2FC_Comp1  L2FC_Comp2
GeneA      x1          y1
GeneB      x2          y2
GeneC      x3          y3
GeneD      x4          y4
...        ...         ...
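To make the layout concrete, here is a minimal sketch of how such a table could be assembled by matching the two result sets on gene name. Everything in it is an assumption for illustration: the dictionaries `comp1` and `comp2`, the helper `pair_by_gene`, and the numbers are made-up placeholders, not real DESeq2 output.

```python
# Hypothetical log2 fold changes per comparison, keyed by gene name.
# The values are placeholders, not real DESeq2 results.
comp1 = {"GeneA": 1.2, "GeneB": -0.5, "GeneC": 2.0, "GeneD": 0.1}
comp2 = {"GeneA": 0.9, "GeneC": 1.7, "GeneD": -0.2, "GeneE": 0.4}


def pair_by_gene(a, b):
    """Return rows (gene, l2fc_a, l2fc_b) for genes present in both tables."""
    shared = sorted(set(a) & set(b))  # keep only genes reported in both
    return [(gene, a[gene], b[gene]) for gene in shared]


table = pair_by_gene(comp1, comp2)

print("geneNames\tL2FC_Comp1\tL2FC_Comp2")
for gene, x, y in table:
    print(f"{gene}\t{x}\t{y}")
```

Genes that appear in only one of the two outputs are dropped here, so every remaining row holds one value from each comparison for the same gene.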
Many statistical comparison methods summarise each column as a whole population, through its mean or another parameter. In this case I also care about the geneNames: the comparison should match the two values for each gene rather than treat the two columns as independent sets of numbers.
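A minimal sketch of what a per-gene comparison could look like, assuming the table has already been matched on gene name. Pearson and Spearman coefficients are both computed on (x, y) pairs, so the row order, and therefore the gene identity, does enter the result. The `rows` values and the helper names `pearson`, `ranks` and `spearman` are hypothetical and written in plain Python only for illustration.

```python
from math import sqrt

# Hypothetical paired table: (gene, L2FC_Comp1, L2FC_Comp2). Placeholder values.
rows = [
    ("GeneA", 1.0, 2.0),
    ("GeneB", 2.0, 4.0),
    ("GeneC", 3.0, 6.0),
    ("GeneD", 4.0, 7.0),
]
xs = [x for _, x, _ in rows]
ys = [y for _, _, y in rows]


def pearson(a, b):
    """Pearson correlation of two equally long lists, paired by position."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    sd_a = sqrt(sum((p - mean_a) ** 2 for p in a))
    sd_b = sqrt(sum((q - mean_b) ** 2 for q in b))
    return cov / (sd_a * sd_b)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


r_paired = pearson(xs, ys)
rho_paired = spearman(xs, ys)

# Breaking the gene-to-gene pairing (here by reversing one column) changes
# the coefficient, which shows that the pairing is part of the calculation.
r_broken = pearson(xs, list(reversed(ys)))

print(r_paired, rho_paired, r_broken)
```

Spearman may be the more cautious choice if a few genes have extreme log2FC values, since it only uses the ranks.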