This is where the TCGA annotation really falls short. The file names do not match the TCGA IDs, so you have to map each array "barcode" back to its TCGA ID and then determine whether the sample is tumor, normal, etc. I think that mapping comes with the download from the TCGA portal mentioned above. If not, I have a master "chemistry file" with the mappings for all arrays run by TCGA, as updated a few months ago.
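To make the mapping step concrete, here is a minimal Python sketch. It assumes the mapping is a tab-delimited table with two columns that I have named `file_name` and `tcga_barcode`; those column names and the example file names are made up for illustration, so adjust them to whatever your download actually contains. The tumor/normal call relies on the sample-type code in the fourth field of the TCGA ID (01-09 tumor, 10-19 normal, 20-29 control).

```python
import csv
import io


def sample_class(tcga_barcode):
    """Classify a TCGA ID as tumor/normal/control from its sample-type code."""
    parts = tcga_barcode.split("-")
    if len(parts) < 4 or parts[0] != "TCGA":
        raise ValueError(f"not a sample-level TCGA ID: {tcga_barcode!r}")
    code = int(parts[3][:2])  # e.g. "01C" -> 1
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    return "unknown"


def load_mapping(handle):
    """Read the mapping table; the column names are hypothetical."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["file_name"]: row["tcga_barcode"] for row in reader}


# Stand-in for the real mapping file (made-up file names).
example = io.StringIO(
    "file_name\ttcga_barcode\n"
    "ARRAY_A01.CEL\tTCGA-02-0001-01C-01D-0182-01\n"
    "ARRAY_A02.CEL\tTCGA-02-0001-10A-01D-0182-01\n"
)

mapping = load_mapping(example)
for file_name, barcode in mapping.items():
    print(file_name, barcode, sample_class(barcode))
```

With a real file, replace the `io.StringIO` stand-in with `open(path, newline="")`.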
It's also important to realize that the pipeline TCGA uses is best suited to identifying large deletion/duplication events in the tumor or relative to other samples. For common CNVs, you'll need to run something like Birdsuite on the Level 1 data from each array, since TCGA's GISTIC algorithm does not look for common CNVs/CNPs.
modified 5.0 years ago
5.0 years ago by
Ryan D • 3.3k