Question: CNV from TCGA data
8 months ago
emiliakozlowskaa10


I need to analyze copy number variation (CNV) data from TCGA. I have downloaded all the data, but several things about it are not clear to me.

  1. Why are there 4 files per patient? Which file should I look at?
  2. I have, for example, the sample id SOURS_p_TCGAb22_SNP_N_GenomeWideSNP_6_C01_529802. How do I link a sample id to a TCGA barcode? I would like to link the CNV data to the clinical data, so a conversion from sample id to barcode is necessary. How can I do it?

Thank you in advance for your help.

I don't blame you for coming here to ask about this. Getting familiar with the TCGA data takes some time.

Where exactly did you download the data from? When you downloaded it, there may also have been a file 'manifest' available for download, which would most likely contain a UUID and/or TCGA barcode connected to each filename. This manifest may have been in JSON format, which can be read in R.
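
As a rough illustration of that lookup, here is a minimal sketch in Python (the same idea carries over to R). It assumes the JSON is a list of records with a `file_name` field and an `associated_entities` list whose entries carry an `entity_submitter_id` holding the barcode, so check the field names in your own file first. The record below, including the file extension and the barcode, is made up for the example. The first three dash-separated fields of a TCGA barcode identify the patient, which is what you would match against the clinical data.

```python
import json

# Made-up record for illustration only: the file extension and the
# barcode are hypothetical, and the field names are an assumption about
# how your downloaded JSON is laid out.
EXAMPLE_JSON = """
[
  {
    "file_name": "SOURS_p_TCGAb22_SNP_N_GenomeWideSNP_6_C01_529802.seg.txt",
    "associated_entities": [
      {"entity_submitter_id": "TCGA-XX-0000-01A-01D-0000-01"}
    ]
  }
]
"""


def map_files_to_barcodes(records):
    """Return a dict mapping each file name to its list of barcodes."""
    mapping = {}
    for record in records:
        entities = record.get("associated_entities", [])
        mapping[record["file_name"]] = [
            e["entity_submitter_id"] for e in entities
        ]
    return mapping


def patient_id(barcode):
    """Keep the first three dash-separated fields (TCGA-XX-XXXX)."""
    return "-".join(barcode.split("-")[:3])


if __name__ == "__main__":
    # For a real file, replace this with: json.load(open(path_to_json))
    records = json.loads(EXAMPLE_JSON)
    for file_name, barcodes in map_files_to_barcodes(records).items():
        for barcode in barcodes:
            print(file_name, barcode, patient_id(barcode), sep="\t")
```

The resulting patient ids can then be joined against the patient barcode column of the clinical table.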

To save yourself a lot of hassle, and depending on the cancer in which you're interested, you could just download CNV data from the Broad Firehose:


