I have a MAF file containing the mutation data available for the TCGA Pan-Cancer dataset.
The file has two sets of barcodes: Tumor_Sample_Barcode and Matched_Norm_Sample_Barcode. I have a set of genes for which I want to do a downstream analysis of gene expression differences between tumor and normal samples. I came across the TCGAbiolinks package. However, when I give it my barcodes, it returns the following error:
'none of the barcodes were matched. Available barcodes are above'
The code I use is as follows:
library(TCGAbiolinks)

listSamples <- c("TCGA-FW-A3R5-10A-01D-A23B-08")
# Query the Illumina HiSeq platform with a list of barcodes
query <- GDCquery(project = "TCGA-SKCM",
                  data.category = "Gene expression",
                  data.type = "Gene expression quantification",
                  experimental.strategy = "RNA-Seq",
                  platform = "Illumina HiSeq",
                  file.type = "results",
                  barcode = listSamples,
                  legacy = TRUE)
I can see that the barcode I searched for is not in the list of available barcodes, but why is that? I am using the same reference genome (hg19), and since both datasets come from TCGA, shouldn't the barcodes match? How can I fix this issue? Also, is there another way to study differential gene expression for my set of selected genes?
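For reference, here is a minimal sketch of how the barcode above breaks down into its fields, assuming the standard TCGA aliquot barcode layout (project, tissue source site, participant, sample type and vial, portion and analyte, plate, center). The helper name parse_tcga_barcode is hypothetical and is not part of TCGAbiolinks; it uses base R only. Under that layout, sample type codes 01 to 09 denote tumor and 10 to 19 denote normal, and the analyte letter D denotes DNA while R denotes RNA, which may be relevant to why a barcode taken from a MAF file does not show up in an RNA-Seq query.

```r
# Hypothetical helper (not a TCGAbiolinks function): split a TCGA aliquot
# barcode into its fields, assuming the standard seven-part layout.
parse_tcga_barcode <- function(barcode) {
  parts <- strsplit(barcode, "-", fixed = TRUE)[[1]]
  list(
    participant = paste(parts[1:3], collapse = "-"),  # patient-level ID (first 12 characters)
    sample_type = substr(parts[4], 1, 2),             # 01-09 tumor, 10-19 normal
    vial        = substr(parts[4], 3, 3),
    portion     = substr(parts[5], 1, 2),
    analyte     = substr(parts[5], 3, 3),             # D = DNA, R = RNA
    plate       = parts[6],
    center      = parts[7]
  )
}

info <- parse_tcga_barcode("TCGA-FW-A3R5-10A-01D-A23B-08")
str(info)
```

For this barcode the sample type field is 10 (blood derived normal) and the analyte field is D (DNA), so it identifies a DNA aliquot rather than an RNA one. The participant field is the part shared across all aliquots taken from the same patient.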