Problems specifying data.type and sample.type in TCGAbiolinks GDCquery
21 months ago
loughrae ▴ 90

Hey all,

TCGAbiolinks isn't letting me specify sample type (primary tumour).

Here's my code:

x = GDCquery(project="TCGA-LUAD", data.category = "Copy Number Variation", sample.type = "01", data.type = "Copy Number Segment")

The same error occurs whether I use sample.type = "Primary Tumor", "TP", "Primary Solid Tumor", or "Primary solid Tumor".

I get the error:

Error in checkBarcodeDefinition(sample.type) : 
01 was not found. Please select a difinition from the table above

But the table includes:

|tissue.code |shortLetterCode |tissue.definition                                  |
|01          |TP              |Primary Tumor                                     |

Also, I can't specify "Gene Level Copy Number" as the data.type. I know an option for allele-specific copy number was only added recently, so I'm guessing this one hasn't been added to the package yet.

I can work around it by filtering the query results after the GDCquery step and before GDCdownload, keeping only the data_type and sample_type I want (by breaking down the barcode/ID) and removing duplicates. But is there something going on with the TCGAbiolinks library?
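For anyone wanting to try the same workaround, here is a minimal sketch of the filtering step in base R. It runs on a toy data frame with made-up barcodes, so it needs neither the package nor a network connection. The column names cases and data_type are assumptions about what the query results table looks like; check the actual table (for example the one returned by getResults(query)) before relying on them.

```r
# Toy stand-in for a query results table.
# ASSUMPTION: the real table has a barcode column called "cases" and a
# "data_type" column; the barcodes below are invented for illustration.
results <- data.frame(
  cases = c("TCGA-05-4244-01A-01D-1877-01",
            "TCGA-05-4244-10A-01D-1877-01",
            "TCGA-05-4249-01A-01D-1877-01",
            "TCGA-05-4249-01A-01D-1877-01",
            "TCGA-05-4250-11A-01D-1877-01",
            "TCGA-05-4250-01A-01D-1877-01"),
  data_type = c("Copy Number Segment",
                "Copy Number Segment",
                "Copy Number Segment",
                "Copy Number Segment",
                "Copy Number Segment",
                "Masked Copy Number Segment"),
  stringsAsFactors = FALSE
)

# Characters 14-15 of a TCGA barcode hold the two-digit sample type code
# ("01" = primary tumour, "10"/"11" = normals).
sample_code <- substr(results$cases, 14, 15)

# Keep primary tumour rows of the wanted data type only.
keep <- sample_code == "01" & results$data_type == "Copy Number Segment"
filtered <- results[keep, , drop = FALSE]

# Drop repeated barcodes, keeping the first occurrence of each.
filtered <- filtered[!duplicated(filtered$cases), , drop = FALSE]

print(filtered)
```

In a real session the filtered table would have to be written back into the query object before calling GDCdownload; how exactly to do that depends on the structure of the object returned by GDCquery in your package version.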

tcga tcgabiolinks r
21 months ago
EagleEye 7.4k


sample.type is usually specified as:

sample.type = c("Primary solid Tumor","Solid Tissue Normal")

I am not sure whether sample.type has to be specified for data.category = "Copy Number Variation". Otherwise, try using typesample = c("NT") or typesample = c("TP").

That doesn't work; I get a similar error message:

Error in checkBarcodeDefinition(sample.type) : 
Primary solid Tumor was not found. Please select a difinition from the table above

When I query for Solid Tissue Normal on its own, though, the query works.


