ICGC exp_array data: duplicate measurements for the same gene?
6.7 years ago
pulyakhina ▴ 20

Hi everyone,

I'm working with publicly available ICGC expression data, namely the file "exp_array.ALL-US.tsv".

I noticed that a few genes seem to have duplicate measurements: for the same gene (e.g., NM_015092), I see two lines that are identical in every field (same donor ID, same sample ID, same analysis ID, etc.) except for the normalized expression value:

DO2 ALL-US  SP2 SA4 ... ... RefSeq  NM_015092   2275.018 ...
DO2 ALL-US  SP2 SA4 ... ... RefSeq  NM_015092   1587.806 ...

When I save each of these two lines to a separate file, remove the normalized expression value, and run diff on the two files, I see no difference, so the rest of each line really is identical.
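The same check can be run over a whole file rather than one pair of lines. Below is a minimal Python sketch, assuming the file is tab-separated with a header row. The function name `find_duplicate_measurements`, the column names in the in-memory example, and the placeholder gene `GENE_X` are all made up for illustration; for the real file, pass whatever the header calls the normalized expression column.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_measurements(handle, value_column):
    """Group rows by every column except `value_column` and return
    only the groups that contain more than one expression value."""
    reader = csv.DictReader(handle, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        # The key is the full row minus the expression value, which mirrors
        # the "remove the value, then diff" check described above.
        key = tuple((name, val) for name, val in row.items() if name != value_column)
        groups[key].append(float(row[value_column]))
    return {key: values for key, values in groups.items() if len(values) > 1}


# Tiny in-memory example. The header names are hypothetical, not the real
# ICGC header, and GENE_X is a made-up placeholder row.
example = (
    "donor\tproject\tspecimen\tsample\tgene_model\tgene_id\tvalue\n"
    "DO2\tALL-US\tSP2\tSA4\tRefSeq\tNM_015092\t2275.018\n"
    "DO2\tALL-US\tSP2\tSA4\tRefSeq\tNM_015092\t1587.806\n"
    "DO2\tALL-US\tSP2\tSA4\tRefSeq\tGENE_X\t10.5\n"
)

duplicates = find_duplicate_measurements(io.StringIO(example), "value")
for key, values in duplicates.items():
    print(dict(key)["gene_id"], values)
```

To run it on the actual file, open "exp_array.ALL-US.tsv" with `open(path, newline="")` and pass the handle in place of the `io.StringIO` object. This only lists the duplicated rows; it does not say which value is the right one to use.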

Does anyone know why this happened, and if I wish to use the data, which value (or which combination of values) should I use? Thank you in advance!

Kind regards,

Irina

ICGC expression duplicates • 1.3k views