Help with rownames in DESeq2
2.9 years ago

Hello,

I would like to identify DEGs using TPM values. I tried to use DESeq2, but I have the following problems:

  1. I am unable to add the row names to dds (i.e., GeneName column)
  2. When I run the following command, I get NI vs INF, but I want INF vs NI.

     head(results(dds))
     log2 fold change (MLE): condition NI vs INF 
     Wald test p-value: condition NI vs INF 
    
  3. When I run the following command, it seems that none of the genes are differentially expressed?

     summary(res)
     out of 13537 with nonzero total read count
     adjusted p-value < 0.1
     LFC > 0 (up)       : 0, 0%
     LFC < 0 (down)     : 0, 0%
     outliers [1]       : 0, 0%
     low counts [2]     : 0, 0%
    

Please find below my RStudio code. Any help would be greatly appreciated!

library(readxl)   # read_xlsx()
library(DESeq2)   # DESeqDataSetFromMatrix(), DESeq(), results()

# Columns: GeneName and Ensembl (text), then Entrez and 18 sample columns (numeric)
data <- read_xlsx('consolidated_E6_uniqueEntrez.xlsx',
                  col_types = c("text", "text", rep("numeric", 19)))
countData = subset(data, select = -c(2:3,10:21))
countData[2:7] = round(countData[2:7])
head(countData)
condition <- factor(c("NI","NI","NI","INF", "INF", "INF"))
dds <- DESeqDataSetFromMatrix(countData[2:7], DataFrame(condition), ~ condition)
dds <- DESeq(dds)
res <- results(dds)
head(results(dds))
summary(res)
res <- res[order(res$padj),]
head(res)

Here is some sample data to be copy-pasted into Excel:

GeneName    Ensembl Entrez  Rep1_NI Rep2_NI Rep3_NI Rep1_2H Rep2_2H Rep3_2H Rep1_4H Rep2_4H Rep3_4H Rep1_24H    Rep2_24H    Rep3_24H    Rep1_48H    Rep2_48H    Rep3_48H    Rep1_72H    Rep2_72H    Rep3_72H
ND1 ENSCSAG00000000006  4097488 4110.730463 4131.993785 4087.582005 4024.139901 3851.016173 3757.790828 3998.049618 3912.999404 4044.403123 4526.664036 4633.834177 5047.25019  6134.65549  5925.493844 6407.851586 3567.479753 3915.825956 3493.691943
        4097489 7202.293784 7505.274219 7416.532184 7261.130425 6914.710434 6902.883947 7142.045748 6900.201632 7338.265332 8034.619537 7921.774297 8323.872366 10253.41941 10260.34556 10739.36826 6348.270665 7162.65797  6755.730024
COX1    ENSCSAG00000000016  4097490 21384.45223 21987.93671 21825.5789  21617.45141 20763.02822 20915.73658 22630.16873 21491.28902 22610.08263 24445.68713 26301.14111 27767.59481 37523.96086 34038.05812 37527.05638 19097.41744 20777.49926 18706.34217
COX2    ENSCSAG00000000019  4097491 23095.60045 23542.92338 23418.81883 22767.2568  21760.67324 22149.33884 24579.50753 23361.15654 24872.55587 25057.28837 27954.21184 29220.84293 43074.0174  38003.93848 43583.93291 20638.99672 22222.78778 20745.37562
        4097492 5089.669191 5363.956133 5042.436534 4889.031287 4756.58504  4702.559977 5432.475975 4357.91711  6333.815406 6000.407015 6427.774149 7794.755878 10305.94286 6906.059378 10511.36126 3642.25344  4610.064222 4170.488097
ATP6    ENSCSAG00000000022  4097493 9296.656979 9507.431938 9541.225027 9369.987036 9119.406172 9130.525392 9031.922461 8768.969377 9854.904055 10355.86243 10238.51812 10933.51711 15294.24182 15197.36213 15475.66016 7369.062066 8541.825339 7717.407252
COX3    ENSCSAG00000000023  4097494 18633.28087 19031.5007  18800.65399 18328.28186 17420.79453 17676.2335  19476.15652 18495.23372 17481.76947 18501.52107 21415.3908  22493.82674 29348.94285 26333.54602 29711.65716 15166.94307 15309.9478  13980.56502
ND3 ENSCSAG00000000025  4097495 6053.840221 6496.294811 6306.063948 6159.738491 5699.921284 5845.43029  6784.832132 6135.961202 6430.580503 6124.878577 7122.378002 7450.594629 9532.935006 8217.661519 9867.530632 4807.277917 5132.830172 4566.287442
ND4L    ENSCSAG00000000027  4097496 3311.669848 3214.888318 3162.089408 3247.723677 3338.095974 3313.913594 3195.697588 3177.157622 3507.153985 3865.459905 3489.855648 3853.403791 4746.8711   4819.779155 5598.992146 3293.592594 3562.085461 3202.293475
        4097497 3531.56501  3558.490838 3532.2502   3469.721939 3412.83731  3386.551035 3581.593146 3387.027005 3559.146834 3884.66661  4032.860009 4354.580206 5481.890311 5185.26059  6063.371589 3179.03591  3703.553617 3512.869858
ND5 ENSCSAG00000000032  4097498 3185.053547 3396.433059 3309.02917  3179.120118 3112.448217 3134.942829 3183.121599 3094.888274 3437.671457 3881.341634 3806.613015 4202.261961 4984.598775 4749.806669 5030.47665  2825.005634 3308.830413 3006.98428
ND6 ENSCSAG00000000033  4097499 2179.610931 2524.153988 2390.96798  2310.382064 2231.591578 2169.338037 2107.076037 2053.47766  2545.422104 3000.333837 2721.252442 2900.246292 3270.253013 3361.821955 3581.443632 2109.956144 2332.450964 2341.165057
CYTB    ENSCSAG00000000035  4097500 6504.187555 6681.987408 6468.61815  6223.624596 6089.743156 6091.83994  6279.194574 5953.573005 6052.442338 6503.095228 6827.626412 7381.096164 9809.020695 9260.460982 10029.526   5083.416387 5866.879716 5378.585669
        103214198   0   0   0   0   0   0.406842071 0   0   0   0   0   0   0   0   3.857021143 0   0   0
        103214199   0.091602853 0   0.263525123 0   0   0.16810691  0   0   0   0.290650873 0.730208225 0   0   0   0   0   0   0
POMP    ENSCSAG00000017967  103214204   250.8924746 235.8245571 246.6273473 233.3534873 232.5269866 266.9342397 259.9697477 253.8927513 340.9242996 261.0761231 225.8040377 224.7217782 172.1777868 223.572983  195.5672692 244.7094931 274.2675829 231.8107293
    ENSCSAG00000017964  103214206   1.646489112 1.680412241 1.403453757 1.648370105 1.054760923 2.853723178 1.855713275 1.117336931 0.897245698 1.741406805 2.187486243 2.425307054 0.186223214 1.357711311 0.401598731 1.414298484 0.324851815 0.776316738
        103214207   0   0   0.470966693 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
        103214209   0   0   0   0.263890443 0   0.537479433 0   0.923908838 0.861849767 0   0   0   0   1.086791852 0   0.905670038 0   0
SLC46A3 ENSCSAG00000017966  103214210   1.107977247 1.653802677 2.257779593 1.871847614 2.088532074 1.906244843 1.826327794 2.330147525 1.494371469 1.97749775  2.760060372 1.836078442 3.907859803 3.083565603 3.01199048  2.569666833 3.65194041  4.852681273
SLC7A1  ENSCSAG00000017961  103214212   70.23190777 71.30655971 73.95672218 68.39453869 69.04207317 65.28071626 62.70838081 60.53020528 61.23602196 60.29275546 60.45525252 63.28088067 41.66788746 44.3756089  41.43542714 49.20232179 43.29114503 51.5730348
        103214213   0   0   0   0   0   0   0   0   0   0   0   0   0.525761964 0   0   0   0   0
UBL3    ENSCSAG00000017959  103214215   22.18419275 26.83817721 19.19787962 27.7890055  21.11853668 27.80317337 31.83334586 23.32739321 26.00643845 33.47780795 20.12816908 45.19086965 39.25837851 48.18724552 47.06873259 27.60758582 27.58349793 33.40281859
KATNAL1 ENSCSAG00000017958  103214218   3.380910206 3.10551134  3.491484244 3.046295383 2.883710573 3.818183832 4.352800364 3.828607976 3.443888478 4.332237093 3.455228327 4.654514943 6.289869198 4.342740559 4.524733561 2.814770832 5.943266497 3.471377856
        103214221   0   0.514441993 0   0   0   0   0   0.41188574  0   0.621422265 0.520404021 0   0   0   0   0   0   0
        103214228   0.498806163 0.9163498   2.869953288 4.044945702 2.654816046 2.288486648 1.011943645 1.049021493 3.424955063 1.582684831 2.650807982 3.306375632 4.021257521 5.552827121 4.339148786 6.169877135 0.876846109 8.322492808
        103214229   11.08863427 9.381277664 11.69667738 9.83341665  10.14686547 8.996639282 11.48472788 9.757625884 11.50656097 6.481195651 9.071869364 12.37926379 12.46810376 14.50976398 10.15376681 9.113822167 9.335926394 9.153220609
ALOX5AP ENSCSAG00000017950  103214231   5.123539849 4.344176832 3.779362354 3.551118312 1.748027026 3.978011358 2.798461439 2.486569466 4.252501525 1.875774614 1.047232783 0   0   0   0   0.609370581 0   0
        103214236   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
        103214237   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
HSPH1   ENSCSAG00000017943  103214241   218.0021013 216.6297866 216.7488509 221.8750297 204.3250273 209.5838165 331.512267  316.7727395 323.2000169 246.9209433 257.3413029 279.2536375 328.3668074 303.1149774 353.6785115 302.1300487 324.8189134 318.7156268
DESeq2 DEG Rstudio R • 3.7k views

TPM is not an acceptable input for DESeq2, which expects raw, un-normalized counts. I think you can get limma to work on normalized data.
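
As a quick sanity check before building the dataset, one could test whether a matrix even looks like raw counts. This is only a sketch: `looks_like_counts` is a hypothetical helper, not part of DESeq2, and the toy values are made up. Note that rounding TPM values, as in the question, would make them pass this check without making them valid input.

```r
# Hypothetical helper: TRUE when every value is a finite, non-negative whole number.
looks_like_counts <- function(m) {
  m <- as.matrix(m)
  all(is.finite(m)) && all(m >= 0) && all(m == round(m))
}

# Toy examples (made-up values in the style of the thread's data).
tpm_like   <- matrix(c(4110.73, 0.09, 250.89, 70.23), nrow = 2)
count_like <- matrix(c(67L, 84L, 0L, 1L), nrow = 2)

looks_like_counts(tpm_like)    # FALSE: fractional values
looks_like_counts(count_like)  # TRUE: whole numbers only
```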


Please use the "code formatting" button (101010) so that your posts are more readable in the future. I did it for you this one time.

2.9 years ago

1. I am unable to add the row names to dds

Add these lines somewhere before calling the subset function:

data=as.data.frame(data)
row.names(data)=data$GeneName
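
One caveat, sketched below with made-up data: `row.names<-` on a data.frame stops with an error if the names contain missing values or duplicates, and the sample sheet in the question has rows with an empty GeneName. Dropping such rows is just one possible choice and is an assumption here; depending on the analysis, summing duplicated genes or falling back to another identifier may be more appropriate.

```r
# Toy stand-in for the imported sheet; the values are made up.
data <- data.frame(
  GeneName = c("ND1", NA, "COX1", "COX1"),
  Rep1_NI  = c(10, 5, 20, 7),
  Rep1_2H  = c(12, 6, 25, 9),
  stringsAsFactors = FALSE
)

# Keep only rows with a usable gene name, and only the first row per name.
keep <- !is.na(data$GeneName) & data$GeneName != "" & !duplicated(data$GeneName)
data <- data[keep, ]

row.names(data) <- data$GeneName
count_mat <- as.matrix(data[, -1])  # drop the GeneName column; row names are kept
rownames(count_mat)
```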

2. I get NI vs INF, but I want INF vs NI.

You need to relevel your factor condition, so that NI becomes the base level. Add this line somewhere before calling the DESeqDataSetFromMatrix function:

condition=relevel(condition, "NI")
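
To see why this works: R orders factor levels alphabetically by default, so INF comes before NI and becomes the reference level. A minimal base R sketch of the before and after:

```r
condition <- factor(c("NI", "NI", "NI", "INF", "INF", "INF"))
levels(condition)   # "INF" "NI": INF is the reference, hence "NI vs INF"

condition <- relevel(condition, ref = "NI")
levels(condition)   # "NI" "INF": NI is the reference, hence "INF vs NI"
```

Alternatively, the comparison can be requested explicitly when extracting results, for example `results(dds, contrast = c("condition", "INF", "NI"))`, which leaves the factor untouched.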

3. none of the genes are differentially expressed?

There could be multiple reasons for this. Have you performed a PCA on the data to see whether the variability is driven by the condition effect? EDIT: also, as swbarnes2 wrote, it is wrong to use normalized counts as input for DESeq2. Try to use raw counts instead.
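
A rough sketch of such a PCA in base R is shown below. The count matrix is simulated (so no separation by condition should be expected here), and the `log2(x + 1)` transform is an assumption chosen for simplicity. With DESeq2 loaded, `plotPCA(vst(dds), intgroup = "condition")` is the more usual route.

```r
set.seed(1)

# Simulated raw counts: 200 genes x 6 samples (made-up data, not the poster's).
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 c("Rep1_NI", "Rep2_NI", "Rep3_NI",
                                   "Rep1_2H", "Rep2_2H", "Rep3_2H")))
condition <- factor(c("NI", "NI", "NI", "INF", "INF", "INF"))

# prcomp() expects samples as rows, so transpose the log-transformed matrix.
pca <- prcomp(t(log2(counts + 1)))
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)  # % variance per component

# Colour points by condition; samples from one condition should cluster
# together if the condition drives most of the variability.
plot(pca$x[, 1], pca$x[, 2], col = as.integer(condition), pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"))
legend("topright", legend = levels(condition),
       col = seq_along(levels(condition)), pch = 19)
```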


Thanks a lot for this useful information! As suggested, I am now using unique gene reads instead of TPM. Now, I obtain DEGs! Also the PCA shows variability based on condition.

Concerning the solution to:

1. I am unable to add the row names to dds

I get the following warning:

row.names(countData)=data$GeneName
Warning message:
Setting row names on a tibble is deprecated. 

Oh that makes sense, my bad. I should have written this:

row.names(data)=data$GeneName

You should use this before subsetting your dataset (subset function). I edited my answer accordingly.


Hello,

Thank you for this! Unfortunately, I still cannot get the row names into dds; it shows rownames: NULL.

> dds
class: DESeqDataSet 
dim: 11078 6 
metadata(1): version
assays(4): counts mu H cooks
rownames: NULL
rowData names(22): baseMean baseVar ... deviance maxCooks
colnames(6): Rep1_NI Rep2_NI ... Rep2_2H Rep3_2H
colData names(2): condition sizeFactor

This should work because DESeq2 transfers the row names of countData to the DESeqDataSet object. Are you sure that you have run every line of code from the start? Does countData have row names?

head(countData)
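
A minimal way to check this end to end is sketched below. The matrix reuses three rows from the `head(countData)` output posted in this thread, the sample-to-condition mapping follows the question, and the DESeq2 step is guarded so the snippet still runs where the package is not installed.

```r
# Three genes x six samples, with gene names as row names.
count_mat <- matrix(c(0L, 0L, 67L,  0L, 0L, 84L,  0L, 1L, 64L,
                      0L, 0L, 80L,  0L, 1L, 64L,  0L, 0L, 75L),
                    nrow = 3,
                    dimnames = list(c("A1CF", "A2M", "AACS"),
                                    c("Rep1_NI", "Rep2_NI", "Rep3_NI",
                                      "Rep1_2H", "Rep2_2H", "Rep3_2H")))

# Sample table; NI is listed first so that it is the reference level.
col_data <- data.frame(
  condition = factor(c("NI", "NI", "NI", "INF", "INF", "INF"),
                     levels = c("NI", "INF")),
  row.names = colnames(count_mat)
)

# If this fails, the row names were lost before DESeq2 ever saw the data.
stopifnot(!is.null(rownames(count_mat)))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = count_mat,
                                        colData   = col_data,
                                        design    = ~ condition)
  print(rownames(dds))  # should list the gene names rather than NULL
}
```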

Hello,

Now I get this warning...

   > row.names(data)=data$GeneName
   Warning message:
    Setting row names on a tibble is deprecated. 


> head(countData)
# A tibble: 6 x 7
  GeneName Rep1_NI Rep2_NI Rep3_NI Rep1_2H Rep2_2H Rep3_2H
  <chr>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 A1CF           0       0       0       0       0       0
2 A2M            0       0       1       0       1       0
3 A2ML1          0       1       0       0       0       0
4 A4GALT         0       0       0       0       0       0
5 A4GNT          0       0       0       0       0       0
6 AACS          67      84      64      80      64      75

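Try this. read_xlsx returns a tibble, and tibbles do not support row names, so convert it to a plain data.frame first: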

data=as.data.frame(data)
row.names(data)=data$GeneName

Wonderful!! Thank you so much! It works now! :)

> dds
class: DESeqDataSet 
dim: 11078 6 
metadata(1): version
assays(4): counts mu H cooks
rownames(11078): A1CF A2M ... ZYG11B ZZZ3
rowData names(22): baseMean baseVar ... deviance maxCooks
colnames(6): Rep1_NI Rep2_NI ... Rep2_2H Rep3_2H
colData names(2): condition sizeFactor

I struggled with this last week too. Always make sure the table is a data.frame or a matrix (or coerce it to one) before running your code.
