Hi guys, I have two matrices with expression values for nearly 19,000 transcripts, one matrix per group: group1 (n=30) and group2 (n=60). So for each transcript, I have around 90 different values.
Data format for Group 1 (Group 2 has the same format):
GENE                Group1           Group1          Group1         Group1
ENSG00000183154.1   0.925443037     3.0369279927    2.7557872516    2.2384197806
ENSG00000227210.1   2.0079999555    3.2941987268    1.7373249132    1.8805827534
ENSG00000240591.1   1.833712588     2.8881138203    2.8422437594    1.654280957
ENSG00000279342.1   2.1612707148    2.3112540357    3.4992176678    2.3862284068
ENSG00000248383.4   2.2085886874    2.7214426016    0                0
ENSG00000253837.1   1.384282608     3.4071437949    1.7373249132    1.3734792256
ENSG00000107165.11  1.6305563       1.2443063796    2.3422637913    3.8709988618
So for each gene I want to know whether there are differences between the groups, and also whether the average expression differs. I'm planning to do a chi-squared test in R, but I'm not sure which method would be best. Any suggestions?
Thanks!
Do you have access to the raw read counts? If so, please read https://f1000research.com/articles/4-1070/v2
Yes, this is exactly what I did. In more detail, the matrix is derived from varianceStabilizingTransformation, but I can't find a method to get the differences for each transcript between groups, or to tell whether the average expression differs. I have the PCA, the heatmap... but I need those values too. Any ideas?
Okay, I'm slightly confused. It sounds like you want to do a differential expression analysis (an example of which you can find in the workflow I linked). Is that correct?
Yes it is! But I couldn't find a way to get the differential expression for each gene (i.e., to say whether ENSG00000183154.1 is more highly expressed in group1 or group2). So is there a function to do that?
If you look at the "Building the results table" section of that workflow, you'll see that the "results" function will give you a table of genes which are either up- or down-regulated in either group.
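For anyone following along, here is a minimal sketch of that step. It assumes DESeq2 is installed and uses simulated counts from makeExampleDESeqDataSet in place of the real raw count matrix; the column name `condition` and the levels A and B come from that simulated data set, not from the poster's data. Note that DESeq expects raw counts, not the varianceStabilizingTransformation output.

```r
library(DESeq2)

# Simulated counts stand in for the real raw count matrix (assumption):
# 1000 genes, 12 samples, and a two-level factor `condition` (levels A and B).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)

# Fit the model on raw counts (not on variance-stabilized values).
dds <- DESeq(dds)

# Compare B against A: a positive log2FoldChange means higher expression in B,
# a negative one means higher expression in A.
res <- results(dds, contrast = c("condition", "B", "A"))

# One row per gene, with the fold change and the raw and adjusted p-values.
head(res[, c("baseMean", "log2FoldChange", "pvalue", "padj")])
```

The sign of log2FoldChange is what answers "in which group is this gene more expressed", so it helps to state the contrast explicitly rather than rely on the default factor level order.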
So the function res <- results(dds), which reports a p-value for each gene, is giving me that information? Or should I instead use sum(res$pvalue < 0.05, na.rm=TRUE) for statistically significant differences?
sum(res$pvalue < 0.05, na.rm=TRUE) will give you the number of DEGs. results(dds) will give you the list. You might also find this useful: DESeq2 proper design setting
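A small sketch of the difference between counting and listing, again on simulated data (an assumption, as are the variable names n_raw, n_adj and sig). The results table also carries a padj column with p-values adjusted for multiple testing, so counting on padj is usually the safer choice when roughly 19,000 genes are tested at once.

```r
library(DESeq2)

# Simulated data in place of the real count matrix (assumption).
set.seed(1)
dds <- DESeq(makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1))
res <- results(dds)

# Number of genes below 0.05 on the raw p-value (no multiple-testing correction).
n_raw <- sum(res$pvalue < 0.05, na.rm = TRUE)

# Number of genes below 0.05 on the adjusted p-value.
n_adj <- sum(res$padj < 0.05, na.rm = TRUE)

# The list itself: keep the significant rows and sort by adjusted p-value.
# which() drops the NA entries that a plain logical index would keep.
sig <- res[which(res$padj < 0.05), ]
sig <- sig[order(sig$padj), ]

n_raw
n_adj
head(sig)
```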
And how can I export the IDs of the transcripts that are differentially expressed between the two groups? Is the output a data frame?
I don't remember what class the output is, but typing class(results(dds)) will tell you. Anyway, you just need write.table(results(dds), file="yourfilenamechoice.table") to export them into a file.
Many thanks for the information!
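For later readers, a minimal export sketch on simulated data (an assumption). The file names and the variables res_df, out_file, ids_file and sig_ids are illustrative only. The results object is a DESeqResults, which extends DataFrame rather than being a base data.frame, so converting it explicitly with as.data.frame before writing avoids surprises.

```r
library(DESeq2)

# Simulated data in place of the real count matrix (assumption).
set.seed(1)
dds <- DESeq(makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1))
res <- results(dds)

# Check what kind of object results() returned.
class(res)

# Convert to a plain data.frame; gene IDs are kept as row names.
res_df <- as.data.frame(res)

# Write the full table (hypothetical file name, written to a temp directory here).
out_file <- file.path(tempdir(), "deseq2_results.csv")
write.csv(res_df, file = out_file)

# Export only the IDs of genes passing an adjusted p-value cutoff of 0.05.
sig_ids <- rownames(res_df)[which(res_df$padj < 0.05)]
ids_file <- file.path(tempdir(), "significant_gene_ids.txt")
writeLines(sig_ids, ids_file)
```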