Using GenomeDataCommons and associating UUIDs with RNA-seq files
23 months ago
KRR • 0

I'm using the GenomicDataCommons package to obtain gene expression data. Right now, I'm trying to associate individual TCGA RNA-seq files with their corresponding case UUID or patient barcode. In other words, I want to know which case/patient each sequencing file came from. I've searched with available_fields(files()) and grep_fields('files', 'xxxx') many times, but I can't find a field that links a file to its case. Any suggestions?

# This gives me the file_ids that I use to transfer the RNA-seq data for the 587 sequenced samples (from 560 cases/patients)
library(GenomicDataCommons)
library(magrittr)  # for %>%

file_ids = files() %>%
    filter(~ cases.project.project_id == 'TCGA-UCEC' &
             data_type == 'Gene Expression Quantification' &
             analysis.workflow_type == 'HTSeq - FPKM') %>%
    GenomicDataCommons::select('file_id') %>%
    response_all() %>%
    ids()  # pull the file UUIDs out of the response
fnames = transfer(file_ids)

# This gives me the case IDs and patient barcodes for the 560 individual cases/patients
resp = cases() %>%
    filter(~ project.project_id == 'TCGA-UCEC') %>%
    GenomicDataCommons::select("submitter_id") %>%
    results_all()

Any help is much appreciated!

GDC R RNA-Seq GenomicDataCommons

