3.1 years ago
ruby4bioinfo
Hi. I am new to machine learning. I want to normalize data that I downloaded from the UCSC Xena browser for pancreatic cancer (dataset ID: TCGA PAAD). When I try to run this code, I get the error shown below:
library( "DESeq2" )
library(ggplot2)
countData <- read.csv('PAAD.csv', header = TRUE, sep = ",")
head(countData)
options(max.print = 100000)
metaData <- read.csv('PAADPheno.csv', header = TRUE, sep = ",")
metaData
dds <- DESeqDataSetFromMatrix(countData=countData,
colData=colData,
design=~adenocarcinoma_invasion)
> dds <- DESeqDataSetFromMatrix(countData=countData,
+ colData=colData,
+ design=~adenocarcinoma_invasion)
Error in `rownames<-`(`*tmp*`, value = colnames(countData)) :
attempt to set 'rownames' on an object with no dimensions
Please help me resolve this issue, or give me the code for it.
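You haven't defined a variable called colData, so R resolves the name to the function colData() (defined in SummarizedExperiment, which DESeq2 loads). That function has no dimensions, hence the error. You probably meant to pass metaData as the colData argument instead.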
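A minimal sketch of that fix follows. The toy tables stand in for PAAD.csv and PAADPheno.csv, and the column names `gene` and `sample` are assumptions; only `adenocarcinoma_invasion` comes from your post. It also assumes the first column of the count file holds gene IDs and that the metadata rows need to be put in the same order as the count columns, which DESeq2 expects.

```r
# Toy stand-ins for PAAD.csv and PAADPheno.csv (hypothetical values and column names).
countData <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  s1 = c(10L, 0L, 25L),
  s2 = c(12L, 3L, 30L),
  s3 = c(8L, 1L, 40L),
  s4 = c(15L, 2L, 22L)
)
metaData <- data.frame(
  sample = c("s3", "s1", "s4", "s2"),
  adenocarcinoma_invasion = factor(c("YES", "NO", "YES", "NO"))
)

# Move gene IDs from the first column into the row names, leaving a numeric matrix.
countMatrix <- as.matrix(countData[, -1])
rownames(countMatrix) <- countData$gene

# Index the metadata by sample ID and reorder it to match the count columns.
rownames(metaData) <- metaData$sample
metaData <- metaData[colnames(countMatrix), , drop = FALSE]
stopifnot(identical(rownames(metaData), colnames(countMatrix)))

# Pass metaData (not the undefined colData) as the colData argument.
# The call is skipped if DESeq2 is not installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = countMatrix,
                                        colData = metaData,
                                        design = ~ adenocarcinoma_invasion)
  print(dds)
}
```

With your real files, note that read.csv() rewrites hyphens in column names to dots unless you pass check.names = FALSE, so TCGA barcodes in the count header may not match the sample IDs in the phenotype table until you account for that.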
Hello. Please show the output of:
Thank you and kind regards
Thanks Kevin, I got the following output:
Why don't you simply download the FPKMs from the TCGA website?
Because FPKMs are terrible for between-sample normalization; furthermore, the GDC website's pipeline is a bit dated (Xena uses kallisto and STAR+RSEM, which are better options).
For direct download of normalized counts, Xena provides that as well: https://xenabrowser.net/datapages/?dataset=TCGA-GTEx-TARGET-gene-exp-counts.deseq2-normalized.log2&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443