TCGAbiolinks: peripheral blood healthy control
Hey community, I hope this question isn't off topic here. I have downloaded the Acute Myeloid Leukemia RNA-seq raw counts data from TCGA with the TCGAbiolinks package using the following code:

library(TCGAbiolinks)
query <- GDCquery(project = "TCGA-LAML",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  workflow.type = "HTSeq - Counts",
                  sample.type = "Primary Blood Derived Cancer - Peripheral Blood")


This is followed by the download and prepare functions. However, I don't understand how to retrieve the same type of data from healthy blood to use as a control for DEG analysis. If I run the same script with sample.type = "Blood Derived Normal", I get an error message stating that there is no result matching my query. Can anyone help me out?

11 months ago
bruce.moran ▴ 870

Have a look at TCGAquery_SampleTypes. It has the arguments barcode and typesample (the latter corresponds to what you enter in the sample.type parameter above).

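# Query all LAML samples, without restricting by sample type
query <- GDCquery(project = "TCGA-LAML",
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  workflow.type = "HTSeq - Counts")
# The files must be downloaded before they can be prepared
GDCdownload(query)
laml <- GDCprepare(query)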


Then this will return the barcodes of all samples in one of the normal categories:

TCGAquery_SampleTypes(barcode = laml$barcode, typesample = c("NBM", "NEBV", "NBC", "NB"))

NB: there are no normals available for LAML using legacy = FALSE; I cannot test with legacy = TRUE at the moment.

Thank you for your help, but I'm not able to solve the problem. The message returned is always that there is no matching result, even when I run the query for the TARGET-AML project. Moreover, the second piece of code seems to expect something else and does not return the sample barcodes.

OK, you asked the question about TCGA-LAML, but now you want info on TARGET-AML. Those are different datasets, so methods to interrogate them don't necessarily transfer between them. What version of TCGAbiolinks are you using? I had issues recently and upgraded to TCGAbiolinks_2.17.1 using BiocManager::install("BioinformaticsFMRP/TCGAbiolinks"). Post your error message if you're getting one. FWIW, I used the same code to download TARGET-AML, and table(laml@colData$sample_type) does not show any 'normal' samples. Do you expect there should be?

Primary Blood Derived Cancer - Bone Marrow          119
Primary Blood Derived Cancer - Peripheral Blood      26
Recurrent Blood Derived Cancer - Bone Marrow         40
Recurrent Blood Derived Cancer - Peripheral Blood     2
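
As a cross-check that does not depend on any package function, the sample type can also be read from the barcode itself: the first two characters of the fourth dash-separated field are the sample-type code (10, 12, 13 and 14 correspond to the NB, NBC, NEBV and NBM short codes used above). The following is a minimal base-R sketch, not part of TCGAbiolinks. The helper name barcode_sample_type and the example barcodes are made up for illustration, and the lookup table covers only the codes relevant to this thread.

```r
# Subset of the GDC sample-type codes relevant to blood-derived samples.
sample_type_codes <- c(
  "03" = "Primary Blood Derived Cancer - Peripheral Blood",
  "09" = "Primary Blood Derived Cancer - Bone Marrow",
  "10" = "Blood Derived Normal",     # NB
  "12" = "Buccal Cell Normal",       # NBC
  "13" = "EBV Immortalized Normal",  # NEBV
  "14" = "Bone Marrow Normal"        # NBM
)

# Hypothetical helper: map each barcode to a sample-type label.
# Codes missing from the lookup table come back as NA.
barcode_sample_type <- function(barcodes) {
  fields <- strsplit(barcodes, "-", fixed = TRUE)
  # The sample-type code is the first two characters of the fourth field.
  codes <- vapply(fields, function(f) substr(f[4], 1, 2), character(1))
  unname(sample_type_codes[codes])
}

# Made-up barcodes, for illustration only (not real TCGA samples).
example_barcodes <- c(
  "TCGA-XX-0001-03A-01R-0000-13",
  "TCGA-XX-0002-10A-01R-0000-13",
  "TCGA-XX-0003-14A-01R-0000-13"
)

types <- barcode_sample_type(example_barcodes)
is_normal <- grepl("Normal", types)

# Barcodes whose sample-type code marks them as a normal sample.
normal_barcodes <- example_barcodes[is_normal]
print(normal_barcodes)
```

Applied to a real object, the same helper could be called on laml$barcode; if it finds no "Normal" labels, that is consistent with the table above showing only tumour sample types.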

You're right, I mentioned both the TCGA and TARGET projects, but that's because for what I need to do they are more or less equivalent. Anyway, from what I can see on the GDC website, samples of normal peripheral blood should be available, as well as many normal bone marrow ones. I checked my version of TCGAbiolinks and it's updated to the latest release. By the way, I also found a bug report in the GitHub issues section of the package, but I am not completely sure it actually is a bug. For now my problem is not solved, but I'll update the post when/if I find a solution. Many thanks!

