Issues in Gene Set Enrichment Analysis using TCGAbiolinks
4.3 years ago
ammarsabir15 ▴ 70

I want to perform gene set enrichment analysis on the Glioblastoma Multiforme dataset in TCGA using GO or KEGG pathways. For this purpose I downloaded the data from TCGA using this code:

query <- GDCquery(project = "TCGA-GBM",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification")


This downloaded the data according to the given parameters, but when I tried to prepare the query using the command below:

 data <- GDCprepare(query)

I got the following error: "Unable to prepare query there are duplicates in the data". I tried to remove the duplicates using fdupes, but it found no duplicate files in the datasets.

So I have the following questions:

  • How can this error be resolved?

  • For enrichment analysis, do I need the datasets from all workflows, i.e. HTseq_counts, HTseq_FPKM and HTseq_FPKM_UQ, or will any one or two of these suffice?

  • Once I have the data, what are the next steps to perform the enrichment analysis using GO or KEGG pathways?
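
On the first question, one possible explanation (an assumption, not something confirmed in this thread): the query sets no `workflow.type`, so it may match one file per workflow for every sample, and the "duplicates" would then be repeated samples rather than identical files, which fdupes would not detect. The sketch below illustrates this with a toy table whose column names are made up for illustration. The `prepare_counts` helper is hypothetical, and the `"HTSeq - Counts"` value is assumed from the workflow names listed above; it may differ in current GDC releases.

```r
# Toy stand-in for a query result: each sample appears once per workflow.
# Column names are illustrative, not the real TCGAbiolinks schema.
files <- data.frame(
  sample   = rep(c("sample_A", "sample_B"), each = 3),
  workflow = rep(c("HTSeq - Counts", "HTSeq - FPKM", "HTSeq - FPKM-UQ"),
                 times = 2),
  stringsAsFactors = FALSE
)

# Without a workflow filter, every sample is listed three times.
n_dup_all <- sum(duplicated(files$sample))

# Restricting to a single workflow leaves one file per sample.
counts_only  <- files[files$workflow == "HTSeq - Counts", ]
n_dup_counts <- sum(duplicated(counts_only$sample))

# The same restriction applied to a real query. This needs TCGAbiolinks and
# network access, so it is wrapped in a (hypothetical) helper and not run here.
prepare_counts <- function(workflow = "HTSeq - Counts") {
  query <- TCGAbiolinks::GDCquery(
    project       = "TCGA-GBM",
    data.category = "Transcriptome Profiling",
    data.type     = "Gene Expression Quantification",
    workflow.type = workflow  # assumed value; check what the GDC currently offers
  )
  TCGAbiolinks::GDCdownload(query)
  TCGAbiolinks::GDCprepare(query)
}
```

If this is indeed the cause, a single workflow should be enough for `GDCprepare()` to succeed. Which workflow suits the enrichment step depends on the downstream method.
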
bioconductor TCGAbiolinks R • 1.4k views
