Question: Help finding specific breast cancer datasets
15 months ago, ogola89 wrote:

Hi All,

I am pretty new to bioinformatic analysis and so I apologize if this seems like a simple/redundant question. I am wondering whether specific datasets are available - or how to filter existing datasets to find specific data.

I am looking for RNA-seq data for breast cancer that has metastasized to the bone, i.e. RNA-seq of the secondary tumour. I would also like time-to-recurrence data, so that I can determine how long the tumour took to grow out at the secondary site.

Please help me work out how to find data this specific.

Thank you.

bone breast cancer metastasis • 375 views

Hi Kevin,

Thank you so much for this answer, I found some of the datasets I was looking for!

Written 10 months ago by ogola89

15 months ago, Kevin Blighe wrote:

TCGA has 17 [breast] primary tumour samples that eventually metastasised to bone; however, the RNA-seq is of the primary tumour. In total, only 7 breast cancer samples in TCGA are actual metastatic samples, and none of these are bone mets, from what I can see. I got this information from my 2 answers, here: C: TCGA metastatic samples

Then there is this study that looked at bone mets, but it is microarray data: Latent bone metastasis in breast cancer tied to Src-dependent survival signals (GSE14020)

Probably your best bet is to browse the curated datasets at the relatively new Human Cancer Metastasis Database. Go to Browse, filter for 'breast cancer', and then look for 'bone' in the Metastasis Site column.

Kevin

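If the browse table is exported to CSV, the same filter can be applied offline. Below is a minimal sketch, not the database's own tooling. Only the "Metastasis Site" column name comes from the answer above; the "Cancer Type" and "Dataset ID" column names and the example rows are made-up placeholders, so adjust them to whatever the real export contains.

```python
import csv
import io


def filter_bone_mets(rows, cancer_col="Cancer Type", site_col="Metastasis Site"):
    """Keep rows for breast cancer whose metastasis sites include bone.

    cancer_col is an assumed column name; site_col follows the
    'Metastasis Site' column mentioned in the answer.
    """
    matches = []
    for row in rows:
        is_breast = "breast" in row[cancer_col].lower()
        # A site field may hold several comma-separated sites.
        sites = [s.strip().lower() for s in row[site_col].split(",")]
        if is_breast and "bone" in sites:
            matches.append(row)
    return matches


# Placeholder rows for illustration only; these are not real datasets.
example_csv = """Dataset ID,Cancer Type,Metastasis Site
EXAMPLE_1,Breast cancer,bone
EXAMPLE_2,Breast cancer,"bone,liver,lung,other"
EXAMPLE_3,Breast cancer,lung
EXAMPLE_4,Prostate cancer,bone
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
hits = filter_bone_mets(rows)
print([row["Dataset ID"] for row in hits])  # ['EXAMPLE_1', 'EXAMPLE_2']
```

For a real export, replace the `io.StringIO(...)` object with an opened file handle.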

Hi Kevin,

Do you happen to have an R/Bioconductor or even a Python script for querying and downloading TCGA files? I am not finding it easy to download the relevant data and view it as a data frame.

Thanks!

Written 10 months ago by ogola89

Is it now resolved?

Written 10 months ago by Kevin Blighe

What do you mean by "it"?

I found the datasets and have an Excel sheet of their IDs, but I was wondering whether you had a script that could help me automate finding and downloading the datasets through multiple queries to the TCGA/GDC portal.

Written 10 months ago by ogola89
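For the TCGA side of that request, one option is the public GDC API. Below is a minimal sketch under some assumptions, not a script from anyone in this thread: it assumes the IDs are TCGA sample barcodes of the form TCGA-XX-XXXX-01A, trims them to case IDs, and asks the `files` endpoint for RNA-seq gene expression quantification files. The helper names (`case_id_from_barcode`, `build_gdc_query`, `search_gdc_files`) are made up for illustration, and the field names and values should be checked against the current GDC documentation.

```python
import json
import urllib.request

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def case_id_from_barcode(barcode):
    """Trim a sample barcode such as TCGA-A2-A3XU-01A to TCGA-A2-A3XU."""
    return "-".join(barcode.split("-")[:3])


def build_gdc_query(sample_barcodes, size=100):
    """Build the JSON payload for a GDC file search (no network access)."""
    case_ids = sorted({case_id_from_barcode(b) for b in sample_barcodes})
    filters = {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.submitter_id", "value": case_ids}},
            {"op": "in",
             "content": {"field": "files.experimental_strategy",
                         "value": ["RNA-Seq"]}},
            {"op": "in",
             "content": {"field": "files.data_type",
                         "value": ["Gene Expression Quantification"]}},
        ],
    }
    return {
        "filters": filters,
        "fields": "file_id,file_name,cases.submitter_id",
        "format": "JSON",
        "size": str(size),
    }


def search_gdc_files(sample_barcodes):
    """POST the query to the GDC API and return the list of matching files."""
    payload = json.dumps(build_gdc_query(sample_barcodes)).encode("utf-8")
    request = urllib.request.Request(
        GDC_FILES_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]["hits"]


def download_url(file_id):
    """URL from which an open-access file can be fetched by its file_id."""
    return "https://api.gdc.cancer.gov/data/" + file_id
```

Calling `search_gdc_files(["TCGA-A2-A3XU-01A"])` needs network access and returns one dictionary per matching file; each `file_id` can then be passed to `download_url`. Building the payload in a separate function makes it possible to inspect the query before sending anything.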

Can you elaborate on which IDs you have, specifically?

Written 10 months ago by Kevin Blighe

Hi Kevin, I downloaded all the data as CSV, filtered for primary 'breast' and secondary 'bone' or 'bone,liver,lung,other', and got 107 matches: microarray samples from GEO (GSM accessions) and samples from other platforms in TCGA.

GSM352100 GSM352103 GSM352105 GSM352109 GSM352117 GSM352119 GSM352123 GSM352124 GSM352126 GSM352131 GSM352144 GSM352149 GSM352151 GSM352154 GSM352155 GSM352159 GSM352163 GSM352167 GSM352100 GSM352103 GSM352105 GSM352109 GSM352117 GSM352119 GSM352123 GSM352124 GSM352126 GSM352131 GSM352144 GSM352149 GSM352151 GSM352154 GSM352155 GSM352159 GSM352163 GSM352167 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1124888 GSM1124890 GSM1124904 GSM1124927 GSM1124953 GSM1312932 GSM1312938 GSM1312944 GSM1312946 GSM1312948 GSM1312953 GSM1312955 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 GSM1362546 GSM1362582 GSM1362588 GSM1362605 GSM1362631 TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-AR-A5QQ-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-AR-A5QQ-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A TCGA-A2-A3XU-01A TCGA-AC-A2FE-01A TCGA-AR-A5QQ-01A TCGA-B6-A3ZX-01A TCGA-HN-A2OB-01A

These are all of the sample IDs.

Written 10 months ago by ogola89
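The list above repeats many IDs, so it helps to deduplicate it and separate the GEO accessions from the TCGA barcodes before querying either resource. Here is a small sketch in plain Python; the function name `split_sample_ids` is made up for illustration, and the input string is a short excerpt of the list above.

```python
def split_sample_ids(raw_ids):
    """Deduplicate whitespace-separated IDs and split them by source."""
    # dict.fromkeys drops repeats while keeping first-seen order.
    unique_ids = list(dict.fromkeys(raw_ids.split()))
    geo_ids = [i for i in unique_ids if i.startswith("GSM")]
    tcga_ids = [i for i in unique_ids if i.startswith("TCGA-")]
    return geo_ids, tcga_ids


# Short excerpt of the IDs posted above, including repeats.
raw = (
    "GSM352100 GSM352103 GSM352100 "
    "TCGA-A2-A3XU-01A TCGA-AR-A5QQ-01A TCGA-A2-A3XU-01A"
)
geo_ids, tcga_ids = split_sample_ids(raw)
print(geo_ids)   # ['GSM352100', 'GSM352103']
print(tcga_ids)  # ['TCGA-A2-A3XU-01A', 'TCGA-AR-A5QQ-01A']
```

The GSM accessions can then be looked up in GEO and the TCGA barcodes in the GDC, each with its own query.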

You mentioned RNA-seq in the original post. Are you OK with microarray data as well?

Written 10 months ago by RamRS

Yes, I am okay with microarray data, as the datasets I have to choose from are very limited.

Written 10 months ago by ogola89