I am trying to find the corresponding TCGA sample IDs for a set of CNV data files from DLBC tumour samples (downloaded in May 2018) using their file UUIDs. I am using the code provided here. The problem is that the function described in that link returns an empty table, i.e. the UUIDs of my files are not found in GDC.
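To make the lookup concrete, here is a minimal sketch of this kind of UUID-to-sample-ID query. It is not the code from the link: it assumes the public GDC API `files` endpoint (https://api.gdc.cancer.gov/files), filters on `files.file_id`, and requests the `cases.samples.submitter_id` field. The function names `build_query`, `extract_sample_ids` and `lookup_sample_ids` are hypothetical and used only for illustration.

```python
import json
import urllib.request

# Assumption: the public GDC API "files" endpoint.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_query(file_uuids):
    """Build the JSON body asking for the files whose UUID is in file_uuids."""
    uuids = list(file_uuids)
    return {
        "filters": {
            "op": "in",
            "content": {"field": "files.file_id", "value": uuids},
        },
        # Return the file UUID, file name and the sample barcodes.
        "fields": "file_id,file_name,cases.samples.submitter_id",
        "format": "JSON",
        "size": len(uuids),
    }


def extract_sample_ids(hits):
    """Map each returned file UUID to the list of its sample submitter IDs."""
    mapping = {}
    for hit in hits:
        sample_ids = [
            sample["submitter_id"]
            for case in hit.get("cases", [])
            for sample in case.get("samples", [])
        ]
        mapping[hit["file_id"]] = sample_ids
    return mapping


def lookup_sample_ids(file_uuids):
    """POST the query to GDC and return {file UUID: [sample IDs]}."""
    body = json.dumps(build_query(file_uuids)).encode("utf-8")
    request = urllib.request.Request(
        GDC_FILES_ENDPOINT,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return extract_sample_ids(payload["data"]["hits"])
```

With a query of this shape, a UUID that is not in the current GDC index simply produces no hit, so `lookup_sample_ids` would return a dictionary without that UUID. That would be consistent with the empty table I am getting.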
I manually searched for the UUID of one of my files (da4b04a1-700b-4022-a56c-11329b8106cc) in the GDC repository (https://portal.gdc.cancer.gov/repository) and could not find it. The file appears to have been deleted, but I am not sure about that. I then filtered the repository to keep only TCGA-DLBC Masked CNV files and looked for my file by its name (XYLEM_p_TCGASNP_207_212_N_GenomeWideSNP_6_A01_1051280.nocnv_grch38.seg.txt). I found a file with exactly the same name, except that it ends with .seg.v2.txt (v2 added between seg and txt). However, that file has a completely different UUID (
Could this mean that the v2 file is a newer version of the one I am looking for? If so, how can I find the samples I have by their file UUIDs?
Thank you very much for your help!