RNA-seq with edgeR
5.8 years ago
31jsheetal • 0

We are running an edgeR script in classic mode, i.e. without a design matrix. We get the edgeR results, but we cannot tell which gene each logFC, FDR, etc. belongs to. The output looks like what I have attached below. Please explain how we can include the Geneid column in the output as well. Attached is my CSV output file. I'm not sure what the first column is, but I'm assuming it is the row number. Thank you in advance.

Comparison of groups:  CPeM-CPeC 
            logFC        logCPM       PValue          FDR
48178   9.2968063  8.7531497859 1.339698e-88 7.015328e-84
50934  -8.8743164  9.2802946181 4.600196e-84 1.204446e-79
45623  -9.0342436  7.9331539929 6.593547e-80 1.150904e-75
49291   8.5775628  7.0893005292 8.493058e-78 1.111848e-73
37120  -7.9915437  6.7177772577 8.642303e-74 9.051084e-70
45475  -7.7424580  7.6449239664 5.610419e-73 4.896493e-69
32316   7.0196112  7.8150257401 2.529690e-64 1.892389e-60
32293   7.0582453  6.3341181139 1.568164e-62 1.026462e-58
40924  -7.3083772  7.9254370926 7.675463e-60 4.465840e
rna-seq • 1.1k views

Please add the code you used to generate this result.


You'll need to annotate your genes first. Make the row names of your count matrix your unique gene IDs before the edgeR fit. What does your count matrix look like? Please post more code and/or examples of what your data looks like.
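As a minimal sketch of that step, assuming the counts come from a table whose first column is named Geneid (as in the question), the IDs can be moved into the row names before building the DGEList. The toy values and sample names below are made up for illustration:

```r
# Toy count table; the Geneid values and sample names are hypothetical.
raw <- data.frame(
  Geneid = c("geneA", "geneB", "geneC"),
  CPeC_1 = c(10L, 0L, 250L),
  CPeC_2 = c(12L, 1L, 240L),
  CPeM_1 = c(95L, 0L, 260L),
  CPeM_2 = c(88L, 2L, 255L),
  stringsAsFactors = FALSE
)

# Row names must be unique, so check before assigning them.
stopifnot(!anyDuplicated(raw$Geneid))

# Keep only the sample columns and use Geneid as the row names.
counts <- as.matrix(raw[, -1])
rownames(counts) <- raw$Geneid

counts
```

A matrix built this way carries the gene IDs through to the row names of the edgeR results table.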

5.8 years ago
h.mon 35k

edgeR uses the row names of the counts slot to identify the genes. Mock code:

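library( edgeR )
# counts: matrix of raw counts with gene IDs as row names
# treatment, design and contrast are placeholders for your own objects
y <- DGEList( counts = counts, group = treatment )
y <- calcNormFactors( y )
y <- estimateDisp( y, design )
fit <- glmFit( y, design )
lrt <- glmLRT( fit, contrast = contrast )
# n = Inf returns every gene; sort.by = "none" keeps the input order
tt <- topTags( lrt, sort.by = "none", n = Inf )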

Inspect the counts slot of your DGEList object:

head( y$counts, n = 2 )

Output:

         A1   A2   A3   B1   B2   B3
130541  437  416  455  433  380  412
128741 5290 6167 4543 6453 6016 7418

Inspect the table slot of your TopTags object:

head( tt$table )

Output:

             logFC   logCPM         LR     PValue       FDR
130541  0.08218954 4.967360 0.09360040 0.75964894 0.9786278
128741 -0.09887031 8.734469 0.35966293 0.54869349 0.9316098

If you add a genes slot (a data frame with one row per gene, in the same order as the rows of counts), its columns will be added to the output of topTags:

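y <- DGEList( counts = counts, group = treatment )
# genes: annotation with one row per gene, in the same order as counts
y$genes <- genes
y <- calcNormFactors( y )
y <- estimateDisp( y, design )
fit <- glmFit( y, design )
lrt <- glmLRT( fit, contrast = contrast )
# n = Inf returns every gene; sort.by = "none" keeps the input order
tt <- topTags( lrt, sort.by = "none", n = Inf )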

Inspect the genes data frame:

head( genes, n = 2 )

Output:

  Accession Uniprot Gene.Names
1    130541  A0AVT1       UBA6
2    128741  A0JNA3     IMPDH1

The counts slot of the DGEList is the same:

head( y$counts, n = 2 )

Output:

         A1   A2   A3   B1   B2   B3
130541  437  416  455  433  380  412
128741 5290 6167 4543 6453 6016 7418

But now inspect the topTags object:

head( tt$table, n = 2 )

Output:

       Accession Uniprot Gene.Names       logFC   logCPM        LR    PValue       FDR
130541    130541  A0AVT1       UBA6  0.08218954 4.967360 0.0936004 0.7596489 0.9786278
128741    128741  A0JNA3     IMPDH1 -0.09887031 8.734469 0.3596629 0.5486935 0.9316098
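
Because the question uses classic mode, here is a minimal sketch of the same idea with exactTest() instead of the GLM functions. The simulated counts, gene IDs, and the Symbol column are invented for illustration, and the group labels are only guessed from the "CPeM-CPeC" comparison in the question. The edgeR functions used (DGEList, calcNormFactors, estimateDisp, exactTest, topTags) are the package's own.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 200 genes x 6 samples, purely illustrative.
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 c("C1", "C2", "C3", "M1", "M2", "M3")))
group <- factor(rep(c("CPeC", "CPeM"), each = 3))

# Hypothetical annotation, one row per gene in the same order as counts.
genes <- data.frame(Geneid = rownames(counts),
                    Symbol = paste0("SYM", 1:200),
                    stringsAsFactors = FALSE)

y <- DGEList(counts = counts, group = group, genes = genes)
y <- calcNormFactors(y)
y <- estimateDisp(y)              # no design matrix: classic mode
et <- exactTest(y, pair = c("CPeC", "CPeM"))
tt <- topTags(et, n = Inf)        # n = Inf keeps every gene

head(tt$table, n = 3)
```

The annotation columns appear to the left of logFC, logCPM, PValue and FDR, so writing tt$table to CSV keeps the gene identifiers next to the statistics.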

To attach the counts to the results table, merge the two by row name:

result <- merge(tt$table, y$counts, by = "row.names")

Just in case, check dim(tt$table) and dim(y$counts) first to confirm that both have the same number of rows.
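
A small base-R sketch of that merge, using made-up stand-ins for tt$table and y$counts. With by = "row.names", merge() puts the shared row names into a new first column called Row.names, which is renamed here (Geneid is just an illustrative choice of name):

```r
# Hypothetical stand-ins for tt$table and y$counts.
tab <- data.frame(logFC = c(0.08, -0.10), FDR = c(0.98, 0.93),
                  row.names = c("130541", "128741"))
cnt <- matrix(c(437, 5290, 416, 6167), nrow = 2,
              dimnames = list(c("130541", "128741"), c("A1", "A2")))

# Both objects should describe the same genes before merging.
stopifnot(setequal(rownames(tab), rownames(cnt)))

result <- merge(tab, cnt, by = "row.names")
colnames(result)[1] <- "Geneid"   # rename the Row.names column
result
```

By default merge() sorts the rows by the key, so the order of the result may differ from the order of tt$table.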
