Hi all,
I want to analyze HTSeq counts data to find differentially expressed genes.
I have installed the TCGAbiolinks package and created a query, but its result does not match what the TCGA data portal shows.
Here are my R code and the result. I searched for kidney cancer and normal tissue.
As you can see, my R code returns only 128 files, but the TCGA data portal shows 215 files.
Why are the results different?
library(TCGAbiolinks)

# Query HTSeq count files for normal tissue across the kidney-related projects
query <- GDCquery(
  project = c("TCGA-KIRC", "TCGA-KIRP", "TARGET-WT", "CPTAC-3", "TCGA-KICH"),
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  experimental.strategy = "RNA-Seq",
  workflow.type = "HTSeq - Counts",
  sample.type = "Solid Tissue Normal",
  legacy = FALSE
)
Here is my TCGA data portal query:
cases.case_id in ["set_id:AW3tFMDMgWoF7ReWISKV"] and cases.samples.sample_type in ["Solid Tissue Normal"] and files.analysis.workflow_type in ["HTSeq - Counts"] and files.data_category in ["Transcriptome Profiling"] and files.data_type in ["Gene Expression Quantification"] and files.experimental_strategy in ["RNA-Seq"]
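
To narrow down where the gap comes from, the files returned by the query could be counted per project and compared with the per-project counts on the portal. Below is a minimal sketch in base R. The `results` data frame is a made-up stand-in for the table that `getResults(query)` would return, and the `project` and `file_id` column names are assumptions about that table, so they may need adjusting.

```r
# Hypothetical stand-in for getResults(query); rows and IDs are made up,
# and the column names (project, file_id) are assumed.
results <- data.frame(
  project = c("TCGA-KIRC", "TCGA-KIRC", "TCGA-KIRP", "TCGA-KICH"),
  file_id = c("f1", "f2", "f3", "f4"),
  stringsAsFactors = FALSE
)

# Count files per project so each number can be checked against the portal.
files_per_project <- table(results$project)
print(files_per_project)

# Projects that were requested but returned no files at all.
requested <- c("TCGA-KIRC", "TCGA-KIRP", "TARGET-WT", "CPTAC-3", "TCGA-KICH")
missing_projects <- setdiff(requested, names(files_per_project))
print(missing_projects)
```

If one project accounts for most of the difference, that would suggest the filters behave differently for that project rather than for the query as a whole.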