TCGA dataset normalization
2.6 years ago

Hi. I am new to machine learning. I want to normalize data that I downloaded from the UCSC Xena browser for pancreatic cancer (its ID is TCGA PAAD). When I run this code, it gives the error shown below it.

library( "DESeq2" )
library(ggplot2)
countData <- read.csv('PAAD.csv', header = TRUE, sep = ",")
head(countData)
options(max.print = 100000)
metaData <- read.csv('PAADPheno.csv', header = TRUE, sep = ",")
metaData
dds <- DESeqDataSetFromMatrix(countData=countData, 
                              colData=colData, 
                              design=~adenocarcinoma_invasion)

> dds <- DESeqDataSetFromMatrix(countData=countData, 
+                               colData=colData, 
+                               design=~adenocarcinoma_invasion)
Error in `rownames<-`(`*tmp*`, value = colnames(countData)) : 
  attempt to set 'rownames' on an object with no dimensions

Please help me resolve this issue, or share code that works.


You haven't defined colData, so R resolves the name to the generic function colData() (from SummarizedExperiment, which DESeq2 loads). You probably meant to use metaData, like this:

dds <- DESeqDataSetFromMatrix(countData=countData, 
                              colData=metaData, 
                              design=~adenocarcinoma_invasion)

Hello. Please show the output of:

str(countData)
str(colData)

Thank you and kind regards


Thanks Kevin, I got the following output:

> str(countData)
'data.frame':   60488 obs. of  183 variables:
 $ Ensembl_ID      : chr  "ENSG00000000003.13" "ENSG00000000005.5" "ENSG00000000419.11" "ENSG00000000457.12" ...
 $ TCGA.3A.A9IO.01A: num  8.64 3.91 9.86 9.69 6.85 ...
 $ TCGA.US.A774.01A: num  11.04 2 10.51 10.13 7.86 ...
 $ TCGA.HZ.A49H.01A: num  10.84 3.46 10.49 9.43 7.27 ...
 $ TCGA.FB.A4P5.01A: num  9.85 3.58 9.93 8.78 7.17 ...
 $ TCGA.FB.AAPS.01A: num  10.21 5.78 9.78 8.92 7.01 ...
 $ TCGA.IB.AAUQ.01A: num  10.1 2.32 9.94 9.02 7.67 ...
 $ TCGA.HV.A5A5.01A: num  10.89 0 10.08 10.07 7.55 ...
 $ TCGA.H6.A45N.11A: num  8.54 2.58 10 9.67 7.95 ...
 $ TCGA.H6.8124.01A: num  12.17 1 11.18 10.4 8.46 ...
 $ TCGA.IB.7654.01A: num  10.75 5.09 10.73 10.06 7.8 ...
 $ TCGA.3A.A9IJ.01A: num  8.63 0 10.02 9.27 6.19 ...
 $ TCGA.HZ.A8P0.01A: num  10.13 0 8.44 9.69 7.31 ...
 $ TCGA.FB.AAQ6.01A: num  9.74 0 8.78 9.44 7.22 ...
 $ TCGA.IB.A7LX.01A: num  11.69 0 10.75 10.41 9.04 ...
 $ TCGA.2J.AABK.01A: num  11.14 1.58 10.41 10.48 8.52 ...
 $ TCGA.HZ.A9TJ.06A: num  10.69 0 10.82 10.38 8.65 ...
 $ TCGA.S4.A8RP.01A: num  10.8 1 10.06 10.28 7.92 ...
 $ TCGA.2L.AAQM.01A: num  8.23 1.58 9.98 8.48 6.52 ...
 $ TCGA.HZ.A77O.01A: num  10.63 1 10.48 9.32 8.06 ...
 $ TCGA.XD.AAUG.01A: num  10.14 5.09 9.71 8.85 6.74 ...
 $ TCGA.M8.A5N4.01A: num  10.34 1 10.14 9.32 7.68 ...
 $ TCGA.2L.AAQI.01A: num  11.22 0 10.42 9.58 8.34 ...
 $ TCGA.HZ.7919.01A: num  10.8 0 10.91 10.19 8.47 ...
 $ TCGA.HZ.8005.01A: num  10.6 0 11.13 8.84 7.94 ...
 $ TCGA.IB.AAUM.01A: num  10.57 2.81 9.78 9.65 7.34 ...
 $ TCGA.IB.7888.01A: num  10.08 0 10.13 9.72 7.37 ...
 $ TCGA.2J.AAB9.01A: num  9.98 0 9.43 8.45 6.58 ...
 $ TCGA.2L.AAQJ.01A: num  11.11 1 10.41 9.63 7.83 ...
 $ TCGA.IB.AAUU.01A: num  10.37 0 10.58 9.74 8.06 ...
 $ TCGA.Q3.A5QY.01A: num  9.83 7.7 10.24 9.32 7.76 ...
 $ TCGA.FB.AAQ1.01A: num  10.94 0 10.4 9.69 8.21 ...
 $ TCGA.HV.A5A3.01A: num  10.42 0 10.17 9.81 7.95 ...
 $ TCGA.IB.AAUO.01A: num  10.15 0 10.55 9.43 8.19 ...
 $ TCGA.IB.8127.01A: num  12.3 1.58 11.7 10.84 9 ...
 $ TCGA.FB.A545.01A: num  9.79 0 9.89 8.84 7.77 ...
 $ TCGA.YH.A8SY.01A: num  10.32 0 10.31 8.85 7.83 ...
 $ TCGA.IB.A5SQ.01A: num  11.14 3.17 10.1 9.13 8.29 ...
 $ TCGA.HV.A7OP.01A: num  9.87 0 8.65 9.02 6.44 ...
 $ TCGA.IB.7644.01A: num  11 2 10.82 10.7 8.41 ...
 $ TCGA.FB.AAQ2.01A: num  11.2 2.32 10.9 9.17 8.82 ...
 $ TCGA.US.A77J.01A: num  10.04 1 10.02 9.22 7.46 ...
 $ TCGA.F2.7273.01A: num  11.6 2 10.9 10.5 8.2 ...
 $ TCGA.3A.A9I9.01A: num  10.75 3.58 10.42 9.19 7.45 ...
 $ TCGA.US.A77G.01A: num  10.23 1.58 10.35 10.08 8.09 ...
 $ TCGA.IB.7893.01A: num  11.29 0 10.87 9.85 9.14 ...
 $ TCGA.S4.A8RM.01A: num  10.57 2 10.59 11.05 8.05 ...
 $ TCGA.IB.AAUW.01A: num  11.27 3.81 10.64 10.39 7.55 ...
 $ TCGA.FB.AAPZ.01A: num  10.49 5.73 10.24 8.91 7 ...
 $ TCGA.IB.AAUS.01A: num  10.19 1.58 10.3 9.14 7.54 ...
 $ TCGA.IB.AAUT.01A: num  10.78 1 9.8 9.34 7.29 ...
 $ TCGA.3A.A9IS.01A: num  5.32 2.32 10.43 9.81 6.88 ...
 $ TCGA.L1.A7W4.01A: num  12.35 1 11.61 9.53 9.46 ...
 $ TCGA.FB.A7DR.01A: num  10.29 1.58 10.57 9.49 7.86 ...
 $ TCGA.2J.AABO.01A: num  9.69 0 9.7 8.84 6.79 ...
 $ TCGA.3A.A9IH.01A: num  11.17 1.58 10.89 9.62 8.19 ...
 $ TCGA.PZ.A5RE.01A: num  10.4 2 10.7 9.3 7.4 ...
 $ TCGA.HZ.8003.01A: num  10.57 2 10.4 9.69 8.08 ...
 $ TCGA.3A.A9IB.01A: num  11.48 1.58 10.56 9.3 8.17 ...
 $ TCGA.H6.A45N.01A: num  10.34 2 9.5 9.19 7.43 ...
 $ TCGA.HZ.8001.01A: num  11.33 4.95 10.94 9.58 8.06 ...
 $ TCGA.IB.A7M4.01A: num  10.98 0 11.22 9.16 7.87 ...
 $ TCGA.2J.AAB8.01A: num  9.42 2.58 9.53 9.4 7.67 ...
 $ TCGA.2J.AABP.01A: num  10.86 0 11.36 8.51 8.72 ...
 $ TCGA.HZ.A77Q.01A: num  10.3 6.97 10.35 8.77 7.53 ...
 $ TCGA.HZ.7926.01A: num  11.97 2.81 10.94 11.02 8.58 ...
 $ TCGA.XN.A8T5.01A: num  10.3 1.58 10.23 9 7.16 ...
 $ TCGA.IB.A5ST.01A: num  10.77 6.67 10.56 9.89 8.42 ...
 $ TCGA.Z5.AAPL.01A: num  9.39 6.02 10.6 9.58 8.13 ...
 $ TCGA.IB.7652.01A: num  11.28 1.58 11.22 10.33 8.58 ...
 $ TCGA.IB.7886.01A: num  11.94 1 11.31 10.41 8.48 ...
 $ TCGA.FB.AAQ0.01A: num  10.04 0 10.41 9.94 7.73 ...
 $ TCGA.2L.AAQE.01A: num  10.97 1.58 10.63 9.44 8.05 ...
 $ TCGA.F2.6879.01A: num  10.8 1.58 10.92 10.43 8.5 ...
 $ TCGA.HV.A5A4.01A: num  10.43 2.32 9.98 10.32 8.01 ...
 $ TCGA.US.A779.01A: num  10.66 0 10.2 9.46 7.55 ...
 $ TCGA.FB.AAPP.01A: num  10.22 1.58 10.46 9.52 7.51 ...
 $ TCGA.HV.A5A6.01A: num  11.6 2.32 10.58 9.51 7.79 ...
 $ TCGA.HZ.8636.01A: num  11.61 3.17 10.99 11.1 8.94 ...
 $ TCGA.FB.A78T.01A: num  10.6 1 10.4 10.3 8.1 ...
 $ TCGA.IB.7646.01A: num  11.78 0 11.27 9.32 8.33 ...
 $ TCGA.RL.AAAS.01A: num  11.29 2.58 10.21 9.68 7.33 ...
 $ TCGA.US.A77E.01A: num  11.17 0 11.05 9.74 7.45 ...
 $ TCGA.IB.7887.01A: num  11.31 0 11.12 10.64 8.41 ...
 $ TCGA.IB.AAUP.01A: num  10.37 2.32 10.44 9.75 8.48 ...
 $ TCGA.IB.7885.01A: num  11.19 4.46 10.82 9.78 8.6 ...
 $ TCGA.Q3.AA2A.01A: num  10.51 4.09 9.73 9.34 7.17 ...
 $ TCGA.IB.7889.01A: num  11.54 2 10.87 10.14 8.15 ...
 $ TCGA.F2.A8YN.01A: num  11.49 0 10.33 9.3 7.48 ...
 $ TCGA.H6.8124.11A: num  11.87 2.32 10.58 10.19 8.07 ...
 $ TCGA.3E.AAAZ.01A: num  10.25 2.32 10.91 10.28 8.17 ...
 $ TCGA.HZ.8637.01A: num  12.43 5.64 11.63 11.16 9.72 ...
 $ TCGA.HZ.A4BH.01A: num  11.11 5.32 10.62 10.03 8.79 ...
 $ TCGA.IB.7890.01A: num  10.4 0 10.69 9.27 7.77 ...
 $ TCGA.XD.AAUL.01A: num  11.26 1 10.48 9.46 7.9 ...
 $ TCGA.HZ.7923.01A: num  11.12 7.83 10.59 10.58 8.38 ...
 $ TCGA.2J.AABE.01A: num  10.58 1 9.68 10.43 7.25 ...
 $ TCGA.FB.AAPU.01A: num  11.11 1.58 10.75 10.69 9.7 ...
 $ TCGA.3A.A9IC.01A: num  10.55 0 10.11 9.19 8.09 ...
  [list output truncated]
> str(colData)
Formal class 'standardGeneric' [package "methods"] with 8 slots
  ..@ .Data     :function (x, ...)  
  ..@ generic   : chr "colData"
  .. ..- attr(*, "package")= chr "SummarizedExperiment"
  ..@ package   : chr "SummarizedExperiment"
  ..@ group     : list()
  ..@ valueClass: chr(0) 
  ..@ signature : chr "x"
  ..@ default   : NULL
  ..@ skeleton  : language (function (x, ...)  stop("invalid call in method dispatch to 'colData' (no default method)", domain = NA))(x, ...)
> 

Why don't you simply download the FPKMs from the TCGA website?


Because FPKMs are a poor choice for between-sample normalization; furthermore, the GDC website's pipeline is a bit dated (Xena uses kallisto and STAR+RSEM, which are better options).

Xena also provides normalized counts for direct download: https://xenabrowser.net/datapages/?dataset=TCGA-GTEx-TARGET-gene-exp-counts.deseq2-normalized.log2&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443
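
To illustrate what between-sample normalization means here: DESeq2 estimates one size factor per sample with the median-of-ratios method, whereas FPKM rescales each sample by its own read total and by gene length. Below is a rough base-R sketch of the median-of-ratios idea on made-up counts. The function name size_factors is hypothetical, and for real data DESeq2's own estimateSizeFactors() is the thing to use.

```r
# Hypothetical helper: median-of-ratios size factors for a genes x samples count matrix.
size_factors <- function(counts) {
  # Per-gene log geometric mean across samples (the pseudo-reference sample).
  log_geo_means <- rowMeans(log(counts))
  # Genes with a zero count in any sample give -Inf and are left out.
  usable <- is.finite(log_geo_means)
  apply(counts, 2, function(col) {
    exp(median(log(col[usable]) - log_geo_means[usable]))
  })
}

# Made-up counts: sample s2 was sequenced twice as deeply as s1.
counts <- cbind(s1 = c(10, 20, 30, 0),
                s2 = c(20, 40, 60, 5))

sf <- size_factors(counts)
normalized <- sweep(counts, 2, sf, "/")   # divide each column by its size factor
```

After dividing by the size factors, the first three genes have the same value in both samples, which is what you want when the only difference between them is sequencing depth.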


Thank you for your valuable suggestions :)
