2.3 years ago
Rob
Hi friends, I am using the following code to download data from TCGA. I want to keep only one aliquot per patient, so that every patient ID is unique. Is there a line of code I should add to do this? Is there also a way to keep or omit specific samples?
# install the development version first (only needed once), then load the packages
BiocManager::install("BioinformaticsFMRP/TCGAbiolinks")
library(TCGAbiolinks)
library(SummarizedExperiment)
CancerProject <- "TCGA-LGG"
query <- GDCquery(project = CancerProject,
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  sample.type = c("Primary Tumor"),
                  workflow.type = "HTSeq - Counts")
# download raw counts for DESeq2
GDCdownload(query)
data <- GDCprepare(query, save = TRUE, save.filename = "expression.rda")
# expression matrix: genes in rows, sample barcodes in columns
rna <- as.data.frame(SummarizedExperiment::assay(data))
write.csv(rna, "htseqRNA.csv")
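One possible way to get unique patient IDs, sketched here on a toy matrix rather than the real download: the first 12 characters of a TCGA barcode identify the patient, so the columns of rna can be de-duplicated on that prefix. The barcodes, counts, and the exclude list below are made up for illustration, and keeping the alphabetically first aliquot per patient is an arbitrary choice, not an official TCGA recommendation.

```r
# Toy stand-in for `rna`; the barcodes and counts are invented for illustration.
rna <- data.frame(
  "TCGA-AA-0001-01A-11R-A000-07" = c(10, 0, 5),
  "TCGA-AA-0001-01B-11R-A000-07" = c(12, 1, 4),
  "TCGA-BB-0002-01A-11R-A000-07" = c(3, 7, 9),
  row.names = c("geneA", "geneB", "geneC"),
  check.names = FALSE  # keep the dashes in the barcodes
)

# Sort the columns by barcode so the choice of aliquot is reproducible.
rna <- rna[, order(colnames(rna)), drop = FALSE]

# Characters 1-12 of a TCGA barcode are the patient ID (e.g. "TCGA-AA-0001").
patient_id <- substr(colnames(rna), 1, 12)

# Keep only the first column seen for each patient.
keep <- !duplicated(patient_id)
rna_unique <- rna[, keep, drop = FALSE]
colnames(rna_unique) <- patient_id[keep]

# Omit specific samples by patient ID (hypothetical exclusion list).
exclude <- c("TCGA-BB-0002")
rna_filtered <- rna_unique[, !(colnames(rna_unique) %in% exclude), drop = FALSE]

print(colnames(rna_unique))
print(colnames(rna_filtered))
```

With the real data the same lines would run on the rna object right before write.csv. To restrict the download itself to chosen samples, GDCquery also accepts a barcode argument, which may be simpler than filtering afterwards.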