Hi Biostars!

I am looking to calculate aneuploidy across a number of tumor samples downloaded from TCGA and ICGC, and I have a couple of questions about the provided copy number data, specifically the segment mean column (which I think is the most relevant data for my purposes).

From what I understand from the documentation and previous questions on Biostars, the values are log2 ratios. Given that, can I read the following into the data?

  • A value of 0 should represent CN2
  • If so, CN2 represents diploidy and variations from it represent aberrations
  • These changes in copy number are a result of the gain or loss of chromosome segments
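
To make the first point concrete, here is a minimal sketch of the conversion, assuming the segment mean is log2(copy number / 2), a pure tumor sample, and a diploid baseline. The function name and the `baseline` parameter are illustrative, not part of any TCGA or ICGC tool.

```python
def segment_mean_to_copy_number(segment_mean: float, baseline: float = 2.0) -> float:
    """Convert a log2 ratio to an estimated absolute copy number.

    Assumes segment_mean = log2(copy_number / baseline) and that a value
    of 0 corresponds to the baseline (2 copies). Both are assumptions.
    """
    return baseline * 2.0 ** segment_mean


# A log2 ratio of 0 maps to 2 copies, +1 to 4 copies, -1 to 1 copy.
for value in (0.0, 1.0, -1.0):
    print(value, segment_mean_to_copy_number(value))
```

Real segment means rarely land on these exact values, because normal-cell contamination and noise pull them toward 0.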

I am rather new to CNV work, so please let me know if I am completely missing something or barking up the wrong data column.

Thanks for your time, have a great weekend!




You have it about right. Variations represent aberrations, yes. A value of 0 is probably close to the CN2 state, but without some detail on the methods, it is hard to be sure. Some tumors have a total copy number significantly below or above 2, in which case 0 may not represent the CN2 state.
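
As a rough illustration of why 0 may not mean CN2, here is a sketch of a commonly used simplified model in which the log2 ratio is measured relative to the sample's average copy number, with normal cells contributing 2 copies. The function name and parameters are hypothetical, and whether this model matches the pipeline behind a given dataset is an assumption.

```python
import math


def expected_log2_ratio(copy_number: float, purity: float, ploidy: float) -> float:
    """Expected log2 ratio of a segment under a simplified mixture model.

    copy_number: tumor copy number of the segment
    purity:      fraction of tumor cells in the sample, in (0, 1]
    ploidy:      average tumor copy number across the genome
    Normal cells are assumed to carry 2 copies everywhere.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    segment_signal = purity * copy_number + (1.0 - purity) * 2.0
    sample_average = purity * ploidy + (1.0 - purity) * 2.0
    if segment_signal == 0.0:
        return float("-inf")  # homozygous deletion in a pure sample
    return math.log2(segment_signal / sample_average)


# In a pure tetraploid tumor, 0 corresponds to 4 copies and CN2 appears as a loss.
print(expected_log2_ratio(4, purity=1.0, ploidy=4.0))
print(expected_log2_ratio(2, purity=1.0, ploidy=4.0))
```

Under this model, a segment with two copies in a tetraploid tumor would show a negative segment mean even though it is not a loss relative to a normal genome.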

Hi Sean, That's very helpful, thank you!

I ended up assuming that changes in segment mean LRR across longer segments are more likely to signal changes in ploidy, whilst changes in shorter segments are more likely to be a result of duplication... For my purposes that assumption seems to be working OK, though I am keen to experiment to see how accurate it is.
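
For anyone wanting to try the same heuristic, a minimal sketch might look like the following. The 10 Mb length cutoff, the 0.2 segment mean cutoff, and the example segments are all made-up placeholders for illustration, not values from TCGA, ICGC, or this thread.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    segment_mean: float


# Hypothetical cutoffs; tune them against your own data.
BROAD_MIN_LENGTH = 10_000_000   # segments at least this long count as "broad"
ALTERED_MIN_ABS_MEAN = 0.2      # |segment mean| below this counts as "neutral"


def classify_segment(seg: Segment) -> str:
    """Label a segment as neutral, broad (long), or focal (short)."""
    if abs(seg.segment_mean) < ALTERED_MIN_ABS_MEAN:
        return "neutral"
    length = seg.end - seg.start + 1
    return "broad" if length >= BROAD_MIN_LENGTH else "focal"


# Illustrative segments only, not real data.
segments = [
    Segment("1", 1, 120_000_000, 0.45),
    Segment("8", 127_000_000, 128_500_000, 1.2),
    Segment("3", 1, 90_000_000, 0.02),
]
labels = [classify_segment(s) for s in segments]
print(labels)
```

Comparing the labels across different cutoffs is one simple way to see how sensitive the long-versus-short assumption is.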

Thanks again,


