Question

edgeR RPKM Values

1

6.1 years ago

gtasource ▴ 60

Hello,

I am trying to use edgeR to calculate RPKM values. I have searched for help on this topic, but I still have not been able to get the existing R code examples to work for me. Here's my current code:

data_raw <- read.table("counts.txt", header = TRUE)
group <- c(rep("Sample1",2),rep("Sample1",2))
d <- DGEList(counts = data_raw, group=group)
gene.data <- read.table("gene_lengths.txt", header=FALSE)
m <- match(rownames(d), gene.data$Transcript)
gene.lengths <- gene.data$TranscriptLength[m]
rpkm<-rpkm(d,gene.data)

My gene lengths file looks like this:

Gene1   35029
Gene2   72475
Gene3   48792
Gene4   46840

I understand that having only two replicates is a limitation. But if anybody could help with this code, or point me to a tutorial, that would be great.

bioconductor R edgeR • 5.2k views

ADD COMMENT • link updated 5.9 years ago by Biostar 20 • written 6.1 years ago by gtasource ▴ 60

0

What is the problem? Do you get an error message? What message, at which step?

ADD REPLY • link 6.1 years ago by h.mon 35k

0

After the RPKM step, I receive this error:

Warning message:
In Ops.factor(left, right) : ‘/’ not meaningful for factors

ADD REPLY • link 6.1 years ago by gtasource ▴ 60

0

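Yes, because the rpkm() call in your code is wrong.

The problem is this line:

rpkm<-rpkm(d,gene.data)

'gene.data' is the whole table, with the gene names in its first column. rpkm() only needs the gene length column. Try this in your code above; it might work, provided the rows of 'gene.data' are in the same order as the rows of your count table: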

rpkm<-rpkm(d$counts,gene.data[,2])
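To see where the warning comes from, here is a minimal sketch in base R (no edgeR needed). It assumes the gene names were read in as a factor, which was the read.table() default before R 4.0; the column names V1 and V2 are what read.table() assigns when header=FALSE. The lengths are the four from the question.

```r
# Small gene-length table with the names stored as a factor
# (assumption: this mirrors what read.table() produced before R 4.0).
gene.data <- data.frame(V1 = factor(c("Gene1", "Gene2", "Gene3", "Gene4")),
                        V2 = c(35029, 72475, 48792, 46840))

# Dividing the whole data frame also tries to divide the factor column,
# which raises the warning. Capture its message instead of printing it.
warning.text <- tryCatch({
  gene.data / 1000
  NA_character_
}, warning = function(w) conditionMessage(w))
print(warning.text)

# Dividing only the numeric length column works as intended.
length.kb <- gene.data[, 2] / 1000
print(length.kb)
```

Converting lengths to kilobases is one step of an RPKM calculation, so handing over the full table rather than the length column is a plausible source of the message in the question.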

Otherwise, the complete sample code below should work.

ADD REPLY • link 6.0 years ago by EagleEye 7.5k

0

group <- c(rep("Sample1",2),rep("Sample1",2))

Why do you have the same sample name ('Sample1') for both groups? I assume it should be:

group <- c(rep("Sample1",2),rep("Sample2",2))
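A quick base R check shows the difference between the two vectors. This is only a sketch, and the names group.wrong and group.fixed are made up for illustration. The grouping does not enter the RPKM calculation itself, but it matters for any group comparison you run afterwards.

```r
# Original labels: every sample ends up in the same group.
group.wrong <- c(rep("Sample1", 2), rep("Sample1", 2))

# Corrected labels: two groups with two replicates each.
group.fixed <- c(rep("Sample1", 2), rep("Sample2", 2))

# Count the distinct groups in each vector.
n.wrong <- nlevels(factor(group.wrong))
n.fixed <- nlevels(factor(group.fixed))
print(c(wrong = n.wrong, fixed = n.fixed))

# Samples per group for the corrected labels.
print(table(group.fixed))
```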

Sample Code:

    library(edgeR)

    # counts.txt is assumed to have columns: 1-transcript ids, 2-Sample1.1, 3-Sample1.2, 4-Sample2.1, 5-Sample2.2
    data_raw <- read.table("counts.txt", header = TRUE, sep = '\t', row.names = 1)

    data_raw[,"Transcript"] <- rownames(data_raw) # Add a "Transcript" id column to merge on

    # gene_lengths.txt is assumed to have a header row and two columns, 'Transcript' and 'TranscriptLength'
    gene.data <- read.table("gene_lengths.txt", header = TRUE, sep = '\t')

    m <- merge(data_raw, gene.data, by = "Transcript") # Columns of m: 1-Transcript, 2-Sample1.1, 3-Sample1.2, 4-Sample2.1, 5-Sample2.2, 6-TranscriptLength
    rownames(m) <- m$Transcript # Keep the transcript ids as row names of the counts

    group <- c(rep("Sample1",2),rep("Sample2",2))

    d <- DGEList(counts = m[,c(2:5)], group = group)

    rpkm_values <- rpkm(d$counts, m$TranscriptLength) # Lengths are row-aligned with the counts after the merge
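If you want to sanity-check the numbers, the same quantity can be computed by hand in base R. This is a sketch under a few assumptions: the counts are made up for illustration, the gene lengths are the four from the question (deliberately listed in a different order), and the library size is taken as the column sum with no normalization factors applied. Under those assumptions it should agree with the non-log output of rpkm() on a plain count matrix.

```r
# Hypothetical counts for four genes in two samples (made up for illustration).
counts <- matrix(c(100, 200, 300, 400,
                   150, 250, 350, 250),
                 nrow = 4,
                 dimnames = list(c("Gene1", "Gene2", "Gene3", "Gene4"),
                                 c("Sample1.1", "Sample1.2")))

# Gene lengths from the question, shuffled to show why matching by name matters.
gene.data <- data.frame(Transcript = c("Gene3", "Gene1", "Gene4", "Gene2"),
                        TranscriptLength = c(48792, 35029, 46840, 72475))

# Align the lengths to the row order of the count matrix.
m <- match(rownames(counts), gene.data$Transcript)
stopifnot(!anyNA(m))  # every gene in the counts must have a length
gene.lengths <- gene.data$TranscriptLength[m]

# Counts per million: divide each column by its library size, scale by 1e6.
lib.size <- colSums(counts)
cpm.manual <- t(t(counts) / lib.size) * 1e6

# RPKM: divide each gene's CPM by its length in kilobases.
rpkm.manual <- cpm.manual / (gene.lengths / 1000)
print(rpkm.manual)
```

Matching on the gene names, rather than relying on both files being in the same order, is what keeps each length attached to the right row.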

ADD REPLY • link 6.1 years ago by EagleEye 7.5k