Hello,
I need help with a question that may be easy, but I am new to this. I would appreciate any ideas.
I have 8 samples (count files from HTSeq-count for RNA-seq data, in this order: D1_1, D1_2, D3_1, D3_2, R1_1, R1_2, R3_1, R3_2), arranged into 4 groups of 2 samples each. I want to calculate the proportion of gene expression for each group. I have already log-scaled and normalized the counts, but for the next analysis I want to merge the gene expression values of the 2 samples in each group (for example D1_1 and D1_2). Could you give me any advice on how to do that?
My dataset looks like this:
D1_1 D1_2 D3_1 D3_2
ENSMUSG00000000001.4 1.39484378430357 2.46452589579488 2.15638312017686 1.39484378430357
ENSMUSG00000000003.15 0.211843606133756 0.360646924628625 0.307483505135223 0.211843606133756
However, I want to compare D1 (D1_1 and D1_2) against D3 (D3_1 and D3_2). Should I merge the two samples, take their average, or do something else?
Thanks in advance.
Good description of the data and problem. Please post some example data and expected output, if possible
Is there a reason you are not using an established package like DESeq2 to do this analysis?
No, I did the differential gene expression analysis with DESeq2, but now I just want to compare the gene types in each group based on gene expression values, not run a DGE analysis. Do you mean I can do this with DESeq2? If so, how?
What do you mean by "compare gene type"? Is there any reason you can't use the DESeq2 log2FoldChange values to compare between the groups?
Because I just want to calculate the proportion of each gene type (protein coding, miRNA, miscRNA, etc.) in each group, not DGE!
Still not sure I understand completely. Can you maybe post an example of the output you want/expect?
Yes, I expect to have a result like this:
So you want to merge (e.g. D1_1 and D1_2 to get just D1; in the example you provide it does not seem to be addition/average) and then annotate each gene row?
No, I just want to know whether I can average them so that they can be compared to the others as one sample (D1).
Comparing groups of samples is really best done with something like boxplots, as they give a better indication of the variation within each group. However, with only two samples in each group, boxplots aren't an option, so yes, I suppose taking the average is as good an option as any. I would take the result with a grain of salt though, and show the actual per-sample counts/fold changes in any figures you plan to make.
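For what it's worth, here is a minimal sketch of the averaging step in plain Python, using the two example rows from the question. It assumes that sample names follow the pattern `<group>_<replicate>` (as in D1_1, D1_2), and the function name `average_replicates` is just an illustrative choice, not something from DESeq2 or HTSeq. Keep in mind that averaging log-scaled values corresponds to a geometric mean on the original scale.

```python
from collections import defaultdict
from statistics import mean

# Sample order and values copied from the example rows in the question
# (already log-scaled and normalized).
samples = ["D1_1", "D1_2", "D3_1", "D3_2"]
expr = {
    "ENSMUSG00000000001.4": [1.39484378430357, 2.46452589579488,
                             2.15638312017686, 1.39484378430357],
    "ENSMUSG00000000003.15": [0.211843606133756, 0.360646924628625,
                              0.307483505135223, 0.211843606133756],
}


def average_replicates(samples, expr):
    """Return {gene: {group: mean of that group's replicates}}.

    Assumption: each sample name is "<group>_<replicate>", so the group
    label is everything before the last underscore.
    """
    # Map each group label to the column positions of its replicates.
    columns = defaultdict(list)
    for position, name in enumerate(samples):
        group = name.rsplit("_", 1)[0]
        columns[group].append(position)

    # Average the replicate values gene by gene.
    return {
        gene: {
            group: mean(values[i] for i in positions)
            for group, positions in columns.items()
        }
        for gene, values in expr.items()
    }


group_means = average_replicates(samples, expr)
for gene, means in group_means.items():
    print(gene, means)
```

The result has one value per gene for D1 and one for D3, which can then be compared as if each group were a single sample. Since this hides the spread between replicates, it is worth keeping the per-sample values around for any figures.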