Edit: the original function was written by Bioinfo (via Sean Davis' blog) to translate file UUIDs into TCGA barcodes (C: Sample names for TCGA data from GDC-legacy archive). The version below translates file names into TCGA barcodes instead.
A manual lookup of 507 samples is not that bad, if the desire is really there to get the work done. I have done manual lookups of >1000 TCGA samples back when there were no automated services.
The one solution that I thought would work was this function:
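library(GenomicDataCommons)
library(magrittr)
TCGAtranslateID = function(file_names, legacy = TRUE) {
  info = files(legacy = legacy) %>%
    filter( ~ file_name %in% file_names) %>%
    select('cases.samples.submitter_id') %>%
    results_all()
  # one list element per file, holding the TCGA barcodes attached to it
  id_list = lapply(info$cases, function(a) { a[[1]][[1]][[1]] })
  barcodes_per_file = sapply(id_list, length)
  return(data.frame(file_id=rep(ids(info),barcodes_per_file), submitter_id=unlist(id_list), row.names=file_names))
}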
A row of the output looks like this (row name, file_id, submitter_id):
AMAZE_p_TCGASNP_b86_87...seg.txt 6352ceaf-99f4-4b74-94a2-dc5e405543f0 TCGA-BJ-A0Z9-01A
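The last two lines of the function are the least obvious part, so here is a minimal sketch of that expansion step on its own, in base R with no GDC query. The file IDs and barcodes are made-up placeholders standing in for what `ids(info)` and `id_list` would hold; only the `rep()` / `unlist()` pattern is taken from the function above.

```r
# Hypothetical stand-ins for ids(info) and id_list (not real GDC values).
file_ids <- c("uuid-1", "uuid-2")
id_list <- list(
  c("TCGA-AA-0001-01A"),                     # file 1 carries one barcode
  c("TCGA-BB-0002-01A", "TCGA-BB-0002-10A")  # file 2 carries two barcodes
)

# Count how many barcodes belong to each file.
barcodes_per_file <- sapply(id_list, length)

# Repeat each file ID once per barcode so the two columns have equal length.
mapping <- data.frame(
  file_id = rep(file_ids, barcodes_per_file),
  submitter_id = unlist(id_list),
  stringsAsFactors = FALSE
)
print(mapping)
```

Note that a file with more than one barcode produces more rows than there are files. In that case the `row.names=file_names` argument in the function above would presumably fail, because the row names would no longer match the number of rows, so it seems to rely on one barcode per file and on the results coming back in the same order as `file_names`.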