Firebrowse Data vs GDC Portal Data
3.7 years ago
jmw730 • 0

Hello All,

I noticed that there is a difference between the Breast Cancer data pulled from the GDC Portal and the FireBrowse data. What is the difference between these two?

Also, I understand that the "minus_germline" version of the FireBrowse data is created by comparing the tumor data to an estimated standard blood-derived normal sample. Is it also necessary to compare each tumor sample to the blood-derived normal sample from the same patient, on top of the standard comparison this website has already done?

My goal is to make sure that no private, naturally occurring copy-number changes from the patient's data remain, so that we know which changes are really tumor-related.

Any comments are appreciated!

Thank you, Jameelah

R gdc firebrowse tcga cnv • 1.5k views
3.7 years ago

You are evidently referring to the copy number data that has been processed by the Broad Institute and put on their FireHose / FireBrowse server. That data is taken from the Genomic Data Commons (GDC) and represents segmented copy number data. The segmentation algorithm is Circular Binary Segmentation (CBS) and the tool used is DNAcopy, as far as I am aware.
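
To make "segmented copy number data" concrete, here is a minimal sketch of reading such a table and labelling each segment. It assumes a tab-separated layout with the columns Sample, Chromosome, Start, End, Num_Probes and Segment_Mean, with Segment_Mean being a log2 ratio; check the header of your own file. The sample name, the values and the +/-0.3 cutoffs are made up for illustration and are not official thresholds.

```python
import csv
import io

# Hypothetical excerpt of a segment file (tab-separated).
# Column names are an assumption; the values are invented.
SEG_TEXT = (
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "SAMPLE-01\t1\t1000\t500000\t250\t0.02\n"
    "SAMPLE-01\t1\t500001\t900000\t120\t0.85\n"
    "SAMPLE-01\t2\t2000\t750000\t300\t-0.60\n"
)

GAIN_CUTOFF = 0.3   # assumed threshold, choose your own
LOSS_CUTOFF = -0.3  # assumed threshold, choose your own


def call_segment(segment_mean, gain=GAIN_CUTOFF, loss=LOSS_CUTOFF):
    """Label a log2-ratio segment mean as gain, loss, or neutral."""
    if segment_mean >= gain:
        return "gain"
    if segment_mean <= loss:
        return "loss"
    return "neutral"


def read_segments(handle):
    """Parse a tab-separated segment table into a list of dicts."""
    segments = []
    for row in csv.DictReader(handle, delimiter="\t"):
        segments.append({
            "sample": row["Sample"],
            "chrom": row["Chromosome"],
            "start": int(row["Start"]),
            "end": int(row["End"]),
            "num_probes": int(row["Num_Probes"]),
            "segment_mean": float(row["Segment_Mean"]),
        })
    return segments


segments = read_segments(io.StringIO(SEG_TEXT))
calls = [call_segment(s["segment_mean"]) for s in segments]
for seg, call in zip(segments, calls):
    print(seg["sample"], seg["chrom"], seg["start"], seg["end"], call)
```

For a real file, replace io.StringIO(SEG_TEXT) with an open file handle.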

It makes sense that it differs from what is hosted on the GDC, as there is no synchronisation between the two sites (the GDC updates its data over time), and the Broad Institute may have applied additional filtering.
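
If you want to check against each patient's own blood-derived normal yourself, one possible approach is to flag altered tumor segments that overlap an altered segment of the same direction in the matched normal. The sketch below is only an illustration under stated assumptions: the function names, the dict keys, the 0.3 cutoff and the toy coordinates are all hypothetical, and this is not a description of how the Broad or the GDC filter their data. It also ignores how large the overlap is, which you would probably want to consider in practice.

```python
def overlaps(a, b):
    """True if two segments on the same chromosome share at least one base."""
    return (a["chrom"] == b["chrom"]
            and a["start"] <= b["end"]
            and b["start"] <= a["end"])


def same_direction(a, b):
    """True if both segments are gains or both are losses."""
    return (a["segment_mean"] > 0) == (b["segment_mean"] > 0)


def flag_germline(tumor_segments, normal_segments, cutoff=0.3):
    """Split altered tumor segments into likely somatic and possibly germline.

    cutoff is an assumed absolute log2-ratio threshold, not an official value.
    """
    normal_altered = [n for n in normal_segments
                      if abs(n["segment_mean"]) >= cutoff]
    somatic, germline = [], []
    for seg in tumor_segments:
        if abs(seg["segment_mean"]) < cutoff:
            continue  # copy-neutral in the tumor, nothing to classify
        if any(overlaps(seg, n) and same_direction(seg, n)
               for n in normal_altered):
            germline.append(seg)
        else:
            somatic.append(seg)
    return somatic, germline


# Toy data for one hypothetical patient (values invented).
tumor = [
    {"chrom": "1", "start": 1000, "end": 5000, "segment_mean": 0.8},
    {"chrom": "1", "start": 6000, "end": 9000, "segment_mean": -0.7},
    {"chrom": "2", "start": 100, "end": 900, "segment_mean": 0.05},
]
normal = [
    {"chrom": "1", "start": 4000, "end": 4500, "segment_mean": 0.9},
    {"chrom": "1", "start": 6000, "end": 9000, "segment_mean": 0.01},
]

somatic, germline = flag_germline(tumor, normal)
```

With this toy input, the chromosome 1 gain is flagged as possibly germline because the matched normal carries an overlapping gain, while the chromosome 1 loss is kept as likely somatic.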

If you want to process the Broad's FireBrowse data, please go through the following steps, starting here: C: How to extract the list of genes from TCGA CNV data



