Question: Are TCGA data from UCSC cancer browser and TCGAbiolinks different?
22 months ago by
wenbinm20 wrote:

Hi there,

I downloaded TCGA BRCA RNA-seq data in two ways: from the UCSC Cancer Browser, and with TCGAbiolinks:

library(TCGAbiolinks)
query <- GDCquery(project = 'TCGA-BRCA', data.category = 'Transcriptome Profiling', data.type = 'Gene Expression Quantification', workflow.type = 'HTSeq - Counts')
GDCdownload(query)  # fetch the files so that GDCprepare() can read them
brca.seq <- GDCprepare(query)

And checked the expression of SOX10:

r = rowData(brca.seq)
as.numeric(assay(brca.seq[which(r$external_gene_name == 'SOX10'),]))

It turns out that its expression is zero in all patients, but in the data from the UCSC Cancer Browser (HiSeqV2) the average SOX10 expression is 6. The data from UCSC can be found here:

Another question: TCGAbiolinks is more up to date than the UCSC Cancer Browser, since it downloads data directly from TCGA, right?

Thank you!

rna-seq tcga • 825 views
22 months ago by
mary20 wrote:


Can you please tell me where you are seeing an expression of 6 in UCSC Xena? I see 0 for all samples in the GDC TCGA BRCA cohort: (sorry about the red color; Xena is not sure how to color the samples when they all have the same value)

While technically the data from TCGAbiolinks will be more up-to-date than UCSC Xena, for this particular data there is unlikely to be a lag since it has been out for a long time.

Best, Mary


Thank you for your quick response! I downloaded the UCSC Xena data from here, unzipped it, and opened the file in Excel:

Then I took a look at the SOX10 expression data, and the first 5 values are 6.5221, 0, 8.308, 6.3628, 0.5819. Maybe I made a mistake here.
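For anyone who would rather check this in R than in Excel, here is a minimal sketch. It assumes the unzipped file is a tab-separated matrix with gene symbols in the first column and one column per sample; I have not verified that layout here, so treat it as an assumption. The helper name gene_values and the sample IDs S1..S5 are made up for illustration, and the toy table just reuses the five values quoted above plus one hypothetical second gene.

```r
# Assumption: tab-separated matrix, first column = gene symbol,
# remaining columns = one column per sample.
# gene_values() is a hypothetical helper, not part of any package.
gene_values <- function(expr, gene) {
  # Keep the matching row and drop the gene-symbol column.
  hit <- expr[expr[[1]] == gene, -1, drop = FALSE]
  if (nrow(hit) != 1) {
    stop("expected exactly one row for gene ", gene)
  }
  # Flatten the one-row data frame into a named numeric vector.
  unlist(hit)
}

# Toy stand-in for the real file: the five SOX10 values quoted above,
# with made-up sample IDs and a made-up second gene.
toy <- paste(
  "sample\tS1\tS2\tS3\tS4\tS5",
  "SOX10\t6.5221\t0\t8.308\t6.3628\t0.5819",
  "GENE_B\t1\t2\t3\t4\t5",
  sep = "\n"
)
expr <- read.delim(text = toy, check.names = FALSE, stringsAsFactors = FALSE)

sox10 <- gene_values(expr, "SOX10")
print(sox10)            # per-sample values
print(mean(sox10))      # average across samples
print(sum(sox10 == 0))  # number of samples with a value of exactly 0
```

To run it on the real download, replace the read.delim(text = toy, ...) call with read.delim("path/to/the/unzipped/file", check.names = FALSE), where the path is whatever the file is called on your machine.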

written 22 months ago by wenbinm20

Ah, that is the legacy TCGA data, not the TCGA data from the GDC. TCGAbiolinks is the data from the GDC, as far as I can tell. The GDC TCGA data on Xena is here:

As to why the legacy TCGA data is different from the TCGA data from the GDC, I recommend contacting the GDC:

written 22 months ago by mary20

The legacy TCGA data came from the hg19 version, and the TCGA data from the GDC now use the hg38 version. Therefore, there will be some differences.

written 20 months ago by Shixiang60