How to sum the gene-wise counts in edgeR?
3.7 years ago
sunnykevin97 ▴ 980

Hi,

How do I sum the gene-wise counts in edgeR to find differentially expressed genes between 2 groups? I'm looking for a two-group comparison with n=2 in each group. I'm new to R and literally stuck. Suggestions, please.

The tabulated matrix looks something like this:
temp  Test  Sample  GeneI  Gene2  Gene3  ...  GeneN
t80 T1  1   55
t80 T1  2   89
t80 T1  3   54
t80 T1  4   453
t80 T1  5   50
t80 T1  6   32
t80 T2  7   45
t80 T2  8   45
t80 T2  9   50
t80 T2  10  54
t80 T2  11  45
t80 T2  12  15
t8  T3  13  43
t8  T3  14  25
t8  T3  15  404
t8  T3  16  385
t8  T3  17  51
t8  T3  18  32
t8  T4  19  454
t8  T4  20  395
t8  T4  21  53
t8  T4  22  35
t8  T4  23  96
t8  T4  24  87

Code
library(edgeR)
matrix.dge <- DGEList(matrix)

sumgenes <- sumTechReps(matrix.dge,ID=colnames(matrix.dge))
sumcounts <- sumgenes$counts
nrow(sumcounts)
head(sumcounts)
tail(sumcounts)

The result seems to be the same after applying the sumTechReps function.

3.7 years ago

For RNA-seq you generally want biological replicates as opposed to technical replicates. That being said, there are a few things that you need to change.

Example data.

df <- structure(list(temp = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("t8", "t80"), class = "factor"), Test = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("T1", "T2", "T3", "T4"
), class = "factor"), Sample = 1:24, GeneI = c(55L, 89L, 54L,
453L, 50L, 32L, 45L, 45L, 50L, 54L, 45L, 15L, 43L, 25L, 404L,
385L, 51L, 32L, 454L, 395L, 53L, 35L, 96L, 87L), GeneII = c(204,
200, 44, 212, 299, 405, 79, 291, 303, 259, 207, 255, 99, 226,
206, 316, 80, 126, 131, 186, 478, 451, 323, 185)), row.names = c(NA,
-24L), class = "data.frame")

First, the matrix should have genes as rows and samples as columns.

library("tidyverse")

mat <- df %>%
  select(-temp, -Test) %>%
  mutate(Sample = str_c("sample_", Sample)) %>%
  column_to_rownames("Sample") %>%
  t %>%
  as.matrix

> mat[, 1:5]
       sample_1 sample_2 sample_3 sample_4 sample_5
GeneI        55       89       54      453       50
GeneII      204      200       44      212      299

You then want a group data.frame.

groups <- df %>%
  select(temp, Test, Sample) %>%
  mutate(Sample = str_c("sample_", Sample)) %>%
  column_to_rownames("Sample")

> head(groups, 5)
         temp Test
sample_1  t80   T1
sample_2  t80   T1
sample_3  t80   T1
sample_4  t80   T1
sample_5  t80   T1

If you want to sum the technical replicates, you can then do so after creating the DGEList object using the above matrix and data.frame as inputs. I'm assuming Test denotes the technical replicates?

dge <- DGEList(mat, samples=groups)
dge <- sumTechReps(dge, dge$samples$Test)

> dge
An object of class "DGEList"
$counts
         T1   T2   T3   T4
GeneI   733  254  940 1120
GeneII 1364 1394 1053 1754

$samples
   group lib.size norm.factors temp Test
T1     1     2097            1  t80   T1
T2     1     1648            1  t80   T2
T3     1     1993            1   t8   T3
T4     1     2874            1   t8   T4
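
As a sanity check on the summed counts, here is a minimal base R sketch that reproduces the GeneI row without edgeR. It uses `rowsum()` as a stand-in for the grouping step (an assumption for illustration, not edgeR's internals), and the names `gene_i`, `counts`, `test`, `summed` and `unchanged` are made up for this example. It also illustrates why passing the column names as the ID leaves the counts unchanged: every sample then forms its own group, so there is nothing to collapse.

```r
# GeneI counts for the 24 samples, copied from the example data
gene_i <- c(55, 89, 54, 453, 50, 32, 45, 45, 50, 54, 45, 15,
            43, 25, 404, 385, 51, 32, 454, 395, 53, 35, 96, 87)

# One gene (row) by 24 samples (columns), as edgeR expects
counts <- matrix(gene_i, nrow = 1,
                 dimnames = list("GeneI", paste0("sample_", 1:24)))

# Replicate label for each column: six samples per Test level
test <- rep(c("T1", "T2", "T3", "T4"), each = 6)

# rowsum() sums rows by group, so transpose to put samples in rows,
# sum within each Test level, then transpose back to genes x groups
summed <- t(rowsum(t(counts), group = test))
summed

# Grouping by the (unique) column names collapses nothing:
# the result still has 24 columns with the original values
unchanged <- t(rowsum(t(counts), group = colnames(counts), reorder = FALSE))
ncol(unchanged)
```

The values in `summed` should match the GeneI row of the `$counts` table (733, 254, 940, 1120).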


Alternatively, instead of summing the technical replicates, you can add the replicate factor (Test) as a term in the regression model when building the design matrix, so that the model accounts for it.

design <- model.matrix(~ temp + Test, groups)
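
One thing worth checking before fitting is whether the design matrix has full column rank. The sketch below rebuilds the `groups` table in base R (assuming the same layout as the example data, with T1 and T2 at t80 and T3 and T4 at t8) and compares the number of columns with the rank. Because each Test level occurs at only one temperature here, the temp column is a linear combination of the Test columns, so this particular design comes out rank-deficient. The `design_test` alternative is only a hypothetical illustration of a full-rank design on the same factor, not a recommendation for the actual experiment.

```r
# Sample annotation rebuilt by hand; assumes the layout of the example data
groups <- data.frame(
  temp = factor(rep(c("t80", "t8"), each = 12)),
  Test = factor(rep(c("T1", "T2", "T3", "T4"), each = 6)),
  row.names = paste0("sample_", 1:24)
)

# Design with both factors, as in the post
design <- model.matrix(~ temp + Test, groups)
colnames(design)

# Compare the number of coefficients with the rank of the matrix;
# a smaller rank means some coefficients cannot be estimated
ncol(design)
qr(design)$rank

# Hypothetical alternative: one coefficient per Test level, no intercept.
# The temperature effect could then be expressed as a contrast of levels.
design_test <- model.matrix(~ 0 + Test, groups)
qr(design_test)$rank == ncol(design_test)
```

If the rank is lower than the number of columns, dropping or recoding one of the nested terms is the usual way out.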


Rpolicastro - I love seeing your tidyverse-esque responses on here. I always end up learning something new from ya. Quick question though. Whenever you create a sample data.frame, you always use this framework:

structure(list(...... row.names = c(NA,
-24L), class = "data.frame")


Is there any reason why you use this rather than the tibble() function or even data.frame()?


Thanks, glad to hear my code snippets are helping people!

The format you see is the result of running the dput function on a data.frame. It's a way to share a longer data.frame (or most other types of R objects) in a slightly more condensed format. I usually wouldn't use dput in my actual code, just as a way to share example data on a site like Biostars or Stack Overflow.
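
For anyone who wants to try this, here is a small sketch of the round trip. The data frame `df_small` and the other variable names are made up for illustration. `dput()` prints R code that rebuilds the object, and `deparse()` returns the same kind of text as a character vector, which is used here so the result can be parsed back without copying and pasting.

```r
# A tiny data frame standing in for the real example data
df_small <- data.frame(Sample = 1:3, GeneI = c(55L, 89L, 54L))

# Print the structure(list(...)) text that can be pasted into a post
dput(df_small)

# Same idea programmatically: capture the text, then evaluate it
txt <- paste(deparse(df_small), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# Check that the rebuilt object has the same columns and values
all.equal(df_small, rebuilt)
```

A reader can paste the printed text after `df <- ` to get the same data frame without needing any input files.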


Thanks for the informative code.
