Question: edgeR library size
schelarina (European Union) wrote, 5.6 years ago:

Hello everyone, one quick question about edgeR. I have a count matrix file, a file with the samples data.frame, and an annotation file.

The manual says: "The data.frame samples contains a column lib.size for the library size or sequencing depth for each sample. If not specified by the user, the library sizes will be computed from the column sums of the counts. For classic edgeR the data.frame samples must also contain a column group, identifying the group membership of each sample."

I tried adding a column for the library size, like this:

             group        lib.size    
sample X     1            8094363
sample X     1            5005492
sample Y     2            7094693
sample Y     2            6094693


Then I run:

x <- read.delim("counts.txt", stringsAsFactors=FALSE)
group <- (c(1,1,2,2))
genes <- read.delim("genes.txt")
y <- DGEList(counts=x, group=group, genes=genes)
y <- calcNormFactors(y)

But then edgeR recalculates the library size, putting in a different number, and adds the normalization factor.

How and where do I specify this library size correctly, so that it is not replaced?

Thanks for your help.

rna-seq R • 7.0k views
Irsan wrote, 5.6 years ago:
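If you use the lib.size argument of the DGEList() function, the library sizes will not be recalculated from the counts. So call DGEList(counts=x, group=group, genes=genes, lib.size=c(1,2,3,4)) instead, replacing 1, 2, 3, 4 with your real library sizes, one per sample and in the same order as the columns of the count matrix.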
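To make the difference concrete, here is a minimal sketch that builds one DGEList with the default library sizes and one with user-supplied sizes. It assumes edgeR is installed. The count matrix is simulated purely for illustration (it is not the asker's counts.txt), and the names counts, lib_sizes, y_default and y_custom are made up for this example. The library sizes are the four values from the samples table in the question.

```r
library(edgeR)

# Hypothetical toy data: 200 genes x 4 samples of simulated counts.
# These stand in for the real count matrix read from counts.txt.
set.seed(1)
counts <- matrix(
  rpois(200 * 4, lambda = 50),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), c("X1", "X2", "Y1", "Y2"))
)
group <- factor(c(1, 1, 2, 2))

# Library sizes taken from the samples table in the question,
# in the same order as the columns of the count matrix.
lib_sizes <- c(8094363, 5005492, 7094693, 6094693)

# Default behaviour: lib.size is computed as the column sums of the counts.
y_default <- DGEList(counts = counts, group = group)

# Explicit behaviour: the supplied lib.size is stored instead.
y_custom <- DGEList(counts = counts, group = group, lib.size = lib_sizes)

# calcNormFactors() fills in the norm.factors column of y$samples;
# the lib.size column is left as supplied.
y_custom <- calcNormFactors(y_custom)

# Compare the two samples tables.
print(y_default$samples)
print(y_custom$samples)
```

In the second table the lib.size column should hold the supplied values, with norm.factors stored alongside it in a separate column. The two are kept separate, so the normalization factors appearing after calcNormFactors() does not by itself mean the library sizes were overwritten.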

schelarina replied, 5.5 years ago:

Thanks! It works perfectly now.