Question: TCGA and ICGC raw count expression data do not match
Asked 4.8 years ago by zamalloa (United States):


I'm trying to obtain raw counts for RNA-seq expression data for breast cancer. I extracted the data from the TCGA portal using RNAseq V1 for breast cancer rather than V2 because, as pointed out elsewhere, V2 does not contain "true" raw counts (its values are non-integers).

I was also pointed to the ICGC data portal in the hope of obtaining an already parsed table, and I downloaded its RNA-seq raw counts as well (exp_seq.BRCA-US.tsv). However, when I checked whether the two sites (TCGA/ICGC) agree on the raw counts for the same individuals, I found that they do not. For example, in TCGA I find:

ACAP3 4832 2580
ACAT1 8202 1916

while in ICGC, the raw count values for the same samples were:

ACAP3 0 1148
ACAT1 0 896

Both sites (TCGA/ICGC) state that they provide raw counts for RNA-seq expression data. Am I misinterpreting something here, or is there an extra normalization step that is not documented?
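
For reference, here is a minimal sketch of the comparison, using only the four values quoted above. The dictionary layout and the `sample_1`/`sample_2` labels are placeholders for illustration, not real TCGA barcodes or the actual column names of either file.

```python
# Counts quoted in this post, keyed by gene, for the same two samples
# in each portal. The sample order/labels are placeholders (assumption).
tcga = {"ACAP3": (4832, 2580), "ACAT1": (8202, 1916)}
icgc = {"ACAP3": (0, 1148), "ACAT1": (0, 896)}


def compare_counts(a, b):
    """Return (gene, sample_index, a_count, b_count, ratio) rows.

    ratio is b_count / a_count, or None when a_count is 0.
    """
    rows = []
    for gene in sorted(a.keys() & b.keys()):
        for i, (x, y) in enumerate(zip(a[gene], b[gene])):
            ratio = y / x if x else None
            rows.append((gene, i, x, y, ratio))
    return rows


rows = compare_counts(tcga, icgc)
for gene, i, x, y, ratio in rows:
    shown = "n/a" if ratio is None else round(ratio, 3)
    print(gene, f"sample_{i + 1}", x, y, shown)
```

If the ICGC values were just a per-sample rescaling of the TCGA values, I would expect the ICGC/TCGA ratio to be the same for both genes within a sample, so this is a quick way to check that idea.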

I would appreciate any help, thanks!

rna-seq gene expression icgc tcga • 2.1k views

