You are referring to a post that I made. Where did you obtain the original data: from Broad Firebrowse (GISTIC 2.0-identified somatic copy number alterations), or did you just download the original files from GDC?
If you followed the data processing exactly as follows:
Then, the statistically significant recurrent somatic copy number alterations (sCNA) are held in the *.igv.gistic files. You can extract the statistically significant regions from this file and then, using GenomicRanges, pull out the original copy number over these regions on a per-sample basis. The copy number that you take is indeed the Segment Mean from the original copy number identified by GISTIC 2.0 (or some other tool that outputs a segment mean).
If you do that, then you can build a matrix of:
- statistically significant recurrent sCNAs in a group of patients as rows
- patients as columns
- Segment Mean over each region as the values
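As a rough illustration of the matrix described above, here is a minimal sketch in plain Python. It is only a stand-in for the GenomicRanges overlap step, not the code used for the original post. The region names, coordinates, sample IDs and values are all made up, and taking a length-weighted average when several segments overlap one region is an assumption of this sketch, since other summaries are possible.

```python
# Hypothetical significant regions: (name, chromosome, start, end).
# In practice these would be parsed from the GISTIC output.
regions = [
    ("amp_8q24", "8", 127_000_000, 129_000_000),
    ("del_9p21", "9", 21_000_000, 23_000_000),
]

# Hypothetical per-sample segments: (sample, chromosome, start, end, segment_mean).
segments = [
    ("P1", "8", 126_000_000, 128_000_000, 0.8),
    ("P1", "8", 128_000_001, 130_000_000, 0.4),
    ("P1", "9", 20_000_000, 24_000_000, -1.0),
    ("P2", "8", 100_000_000, 140_000_000, 0.1),
    ("P2", "9", 21_500_000, 22_500_000, -0.6),
]


def overlap(a_start, a_end, b_start, b_end):
    """Number of bases shared by two closed intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def region_by_patient_matrix(regions, segments):
    """Return (samples, matrix) where matrix[region][sample] is the
    length-weighted Segment Mean over that region (None if uncovered)."""
    samples = sorted({seg[0] for seg in segments})

    # Index segments by (sample, chromosome) for quick lookup.
    by_key = {}
    for sample, chrom, start, end, mean in segments:
        by_key.setdefault((sample, chrom), []).append((start, end, mean))

    matrix = {}
    for name, chrom, r_start, r_end in regions:
        row = {}
        for sample in samples:
            weighted, covered = 0.0, 0
            for s_start, s_end, mean in by_key.get((sample, chrom), []):
                n = overlap(r_start, r_end, s_start, s_end)
                weighted += n * mean
                covered += n
            row[sample] = weighted / covered if covered else None
        matrix[name] = row
    return samples, matrix


samples, matrix = region_by_patient_matrix(regions, segments)

# Print regions as rows and patients as columns.
print("region", *samples, sep="\t")
for name, row in matrix.items():
    cells = ["NA" if row[s] is None else f"{row[s]:.3f}" for s in samples]
    print(name, *cells, sep="\t")
```

A region that no segment covers in a given sample is left as `None` here, so missing data is not silently treated as a copy-neutral value of 0.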
With that, I generated this and identified clusters of patients based on recurrent sCNA via Partitioning Around Medoids (PAM):
Of course, you don't have to use that data, exactly, but you really have to know to what your data relates.
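To make the PAM step mentioned above a bit more concrete, here is a small, unoptimized sketch of the algorithm (greedy BUILD followed by SWAP) in plain Python. It is not the code behind the original analysis. The patient vectors are invented, Euclidean distance and k = 2 are assumptions chosen only for the example, and a real analysis would normally use an existing implementation such as `pam()` from the R `cluster` package.

```python
import math

# Hypothetical data: one vector of Segment Means per patient
# (one value per significant region). All values are made up.
patients = {
    "P1": [0.8, -1.0, 0.1],
    "P2": [0.7, -0.9, 0.0],
    "P3": [0.9, -1.1, 0.2],
    "P4": [-0.1, 0.0, 0.9],
    "P5": [0.0, 0.1, 1.0],
    "P6": [0.1, -0.1, 0.8],
}


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(points, k):
    """Return (medoid indices, cluster label per point)."""
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]

    # BUILD: greedily add the point that lowers the total cost the most.
    medoids = []
    for _ in range(k):
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)

    # SWAP: apply the best medoid/non-medoid exchange until none improves the cost.
    cost = total_cost(dist, medoids)
    while True:
        best_cost, best_swap = cost, None
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(dist, trial)
                if trial_cost < best_cost - 1e-12:
                    best_cost, best_swap = trial_cost, trial
        if best_swap is None:
            break
        medoids, cost = best_swap, best_cost

    # Assign every point to its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels


names = list(patients)
medoids, labels = pam([patients[name] for name in names], k=2)

for name, label in zip(names, labels):
    print(name, "-> cluster", label, "(medoid:", names[medoids[label]] + ")")
```

Each patient is treated as one point, which matches the patients-as-columns layout: the clustering runs on the columns of the region-by-patient matrix.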