Using GenomicDataCommons and associating UUIDs with RNA-seq files
4.4 years ago
KRR • 0

I'm using the GenomicDataCommons package to obtain gene expression data. I'm trying to associate each TCGA RNA-seq file with its corresponding case UUID or patient barcode; in other words, I want to know which case/patient each sequencing file came from. I've tried available_fields(files()) and grep_fields('files', 'xxxx') many times without luck. Any suggestions?

# This gives me the file_ids that I use to transfer the RNA-seq data for the 587 sequenced samples (from 560 cases/patients)
library(GenomicDataCommons)
q = files()  # assumed: q was a query on the files endpoint
file_ids = q %>%
    filter(~ cases.project.project_id == 'TCGA-UCEC' &
             data_type == 'Gene Expression Quantification' &
             analysis.workflow_type == 'HTSeq - FPKM') %>%
    GenomicDataCommons::select('file_id') %>%
    response_all() %>%
    ids()
fnames = transfer(file_ids)

# This gives me the case ids and patient barcodes for the 560 individual cases/patients
resp = cases() %>%
    filter(~ project.project_id == 'TCGA-UCEC') %>%
    GenomicDataCommons::select("submitter_id") %>%
    results_all()

Any help is much appreciated!
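One possible direction, sketched here under assumptions rather than as a confirmed answer: the files endpoint exposes nested case fields such as cases.case_id and cases.submitter_id, so the same files() query could select those fields and the nested result could then be flattened into one row per file/case pair. The helper name flatten_file_cases, the mock IDs, and the assumed shape of the response (a list named by file UUID, each element holding case_id and submitter_id) are illustrative and may differ between package versions. The live query is wrapped in if (FALSE) because it needs network access and has not been run here.

```r
# Flatten a per-file list of case records into one row per (file, case) pair.
# ASSUMPTION: `cases_by_file` is a list named by file UUID, where each element
# is a data.frame (or list) with `case_id` and `submitter_id` entries.
flatten_file_cases <- function(cases_by_file) {
  rows <- lapply(names(cases_by_file), function(fid) {
    cs <- as.data.frame(cases_by_file[[fid]], stringsAsFactors = FALSE)
    data.frame(file_id = fid,
               case_id = cs$case_id,
               submitter_id = cs$submitter_id,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Hypothetical mock of the assumed response shape (all IDs are made up).
mock_cases <- list(
  "file-uuid-1" = data.frame(case_id = "case-uuid-A",
                             submitter_id = "TCGA-XX-0001",
                             stringsAsFactors = FALSE),
  "file-uuid-2" = data.frame(case_id = "case-uuid-A",
                             submitter_id = "TCGA-XX-0001",
                             stringsAsFactors = FALSE),
  "file-uuid-3" = data.frame(case_id = "case-uuid-B",
                             submitter_id = "TCGA-XX-0002",
                             stringsAsFactors = FALSE)
)
file_case_map <- flatten_file_cases(mock_cases)

# Live query sketch (not run): same filter as above, but selecting case fields.
if (FALSE) {
  library(GenomicDataCommons)
  library(magrittr)
  res <- files() %>%
    filter(~ cases.project.project_id == 'TCGA-UCEC' &
             data_type == 'Gene Expression Quantification' &
             analysis.workflow_type == 'HTSeq - FPKM') %>%
    GenomicDataCommons::select(c('cases.case_id', 'cases.submitter_id')) %>%
    results_all()
  file_case_map <- flatten_file_cases(res$cases)
}
```

Because several files can map to the same case (587 files versus 560 cases here), the resulting table is keyed by file_id rather than by case, and duplicated case_id values are expected.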

GDC R RNA-Seq GenomicDataCommons • 803 views