get case submitter ID for GDC miRNA quantification files
7.7 years ago
mt1022 ▴ 310

I downloaded hundreds of miRNA quantification files from GDC and want to analyze their association with clinical information. The problem is how to get the corresponding submitter ID (formerly the TCGA barcode) for each file.

This is an example: https://gdc-portal.nci.nih.gov/files/b2804bb2-70f4-471a-b6db-70c0ef457df3

I can view the case UUID (79e469c5-c18c-4c20-aaa2-8866623229d9) and, by clicking the link, the submitter ID (TCGA-BP-4343). How can I download the submitter IDs for hundreds of files?

Thanks for any hint.

TCGA GDC UUID miRNA • 4.0k views
7.7 years ago
mt1022 ▴ 310

Following the instructions in the GDC API Users Guide (https://gdc-docs.nci.nih.gov/API/Users_Guide/Search_and_Retrieval), I figured out how to download the UUID and barcode associated with each miRNA quantification file.

First, determine how many files there are:

curl 'https://gdc-api.nci.nih.gov/files/ids?query=mirnas.quantification.txt&pretty=true'

The "total" field under "pagination" in the response shows that there are 11488 files:

{
  "data": {
    "pagination": {
      "count": 5, 
      "sort": "", 
      "from": 1, 
      "page": 1, 
      "total": 11488, 
      "pages": 2298, 
      "size": 5
    },
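
If you want to read the total programmatically instead of by eye, something like the following should work. It is only a sketch: it assumes the response keeps the data.pagination.total layout shown in the excerpt above, and the function name total_from_response is just an illustrative helper, not part of any GDC client.

```python
import json


def total_from_response(text: str) -> int:
    """Return data.pagination.total from a JSON response body.

    Assumes the layout shown in the excerpt above (hypothetical helper).
    """
    payload = json.loads(text)
    return int(payload["data"]["pagination"]["total"])


# Sample body trimmed to the pagination block quoted in the post;
# a real response would carry more keys alongside "pagination".
sample = (
    '{"data": {"pagination": {"count": 5, "sort": "", "from": 1, '
    '"page": 1, "total": 11488, "pages": 2298, "size": 5}}}'
)

print(total_from_response(sample))  # prints 11488
```

The returned number is what goes into the size parameter of the follow-up request.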

Then retrieve all the records in a single request by setting size to that total and asking for TSV output:

curl 'https://gdc-api.nci.nih.gov/files/ids?query=mirnas.quantification.txt&pretty=true&size=11488&format=TSV' >submitter_ids.tsv

Additional columns can be selected by adding a fields parameter to the request.
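
For anyone who prefers to script this, here is a rough Python sketch of the same two steps: building the request URL with a fields list, and turning the downloaded TSV into a file name to submitter ID lookup. Several things here are assumptions rather than anything confirmed above: the particular field names (file_id, file_name, cases.submitter_id), whether this endpoint honours them, and the exact column headers in the TSV. Because I am not sure how nested fields are named in the header, the parser just looks for a column whose name contains "submitter_id". The helper names and the sample row layout are made up for illustration.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint and query string taken from the curl commands in this post.
API_BASE = "https://gdc-api.nci.nih.gov/files/ids"


def build_url(total, fields=("file_id", "file_name", "cases.submitter_id")):
    """Build the request URL; the field names are assumed, adjust as needed."""
    params = {
        "query": "mirnas.quantification.txt",
        "size": total,
        "format": "TSV",
        "fields": ",".join(fields),
    }
    return API_BASE + "?" + urlencode(params)


def map_files_to_submitters(tsv_text):
    """Map file name -> submitter ID from a TSV body.

    Columns are located by name fragments because the exact header
    spelling for nested fields is an assumption.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    name_col = next(c for c in header if c.endswith("file_name"))
    sub_col = next(c for c in header if "submitter_id" in c)
    return {row[name_col]: row[sub_col] for row in reader}


# Hypothetical TSV: the header spelling and file name are invented;
# the file UUID and submitter ID are the example from the question.
sample_tsv = (
    "file_name\tcases_0_submitter_id\tfile_id\n"
    "example.mirnas.quantification.txt\tTCGA-BP-4343\t"
    "b2804bb2-70f4-471a-b6db-70c0ef457df3\n"
)

url = build_url(11488)
lookup = map_files_to_submitters(sample_tsv)
```

In practice you would fetch url (with curl as above, or urllib.request) and pass the response text to map_files_to_submitters; the resulting dictionary can then be joined against the clinical table by submitter ID.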
