I am looking to download a breast cancer dataset (microarray or RNA-seq) that includes classification by traditional methods such as IHC/FISH, so that I can compare it with my genetic-fingerprint-based subtypes. I am not looking for datasets with PAM50 subtypes. Any recommendations? Thanks.
TCGA has that information (where available) in the clinical information biotab files. For an idea of how to get it for breast cancer, take a look at my answer here: How to download triple negative breast cancer RNA-seq fpkm data from GDC.
Edit: Following the first part of the short tutorial in my linked thread, I have just downloaded the breast cancer TCGA data and can already see the following data:
To download other TCGA data, you can stay on the GDC Legacy Archive and use the filter checkboxes on the left-hand side to select whatever you need. A lot of the data is open access.
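Once you have the clinical biotab file, turning the IHC/FISH fields into receptor-based subtypes takes only a few lines. Below is a minimal Python sketch, not a definitive method. The column names (`bcr_patient_barcode`, `er_status_by_ihc`, `pr_status_by_ihc`, `her2_status_by_ihc`, `her2_fish_status`) are what I would expect in the BRCA patient biotab, but treat them as assumptions and check them against the header of your own file. The subtype labels, the rule that a definitive FISH result overrides IHC for HER2, and the `skip_rows` parameter are illustrative choices of mine, and the four patients are made-up toy rows.

```python
import csv
import io

# Assumed column names: verify them against the header of your own biotab file.
ID_COL = "bcr_patient_barcode"
ER_COL = "er_status_by_ihc"
PR_COL = "pr_status_by_ihc"
HER2_IHC_COL = "her2_status_by_ihc"
HER2_FISH_COL = "her2_fish_status"

DEFINITIVE = ("Positive", "Negative")


def her2_call(row):
    """Return 'Positive', 'Negative' or None; a definitive FISH result overrides IHC."""
    fish = row.get(HER2_FISH_COL, "")
    if fish in DEFINITIVE:
        return fish
    ihc = row.get(HER2_IHC_COL, "")
    return ihc if ihc in DEFINITIVE else None


def receptor_subtype(row):
    """Illustrative receptor-based label; 'Unknown' if any status is not definitive."""
    er = row.get(ER_COL, "")
    pr = row.get(PR_COL, "")
    her2 = her2_call(row)
    if er not in DEFINITIVE or pr not in DEFINITIVE or her2 is None:
        return "Unknown"
    hr_positive = er == "Positive" or pr == "Positive"
    if hr_positive:
        return "HR+/HER2+" if her2 == "Positive" else "HR+/HER2-"
    return "HR-/HER2+" if her2 == "Positive" else "Triple negative"


def read_subtypes(handle, skip_rows=0):
    """Read a tab-separated clinical table and map patient ID to subtype.

    skip_rows drops extra descriptive rows directly below the column names,
    if your file has any (hypothetical parameter, set it after inspecting the file).
    """
    rows = list(csv.DictReader(handle, delimiter="\t"))[skip_rows:]
    return {row[ID_COL]: receptor_subtype(row) for row in rows}


# Toy data only: these patients and values are invented for the example.
toy_rows = [
    [ID_COL, ER_COL, PR_COL, HER2_IHC_COL, HER2_FISH_COL],
    ["TOY-0001", "Positive", "Positive", "Negative", "[Not Evaluated]"],
    ["TOY-0002", "Negative", "Negative", "Equivocal", "Positive"],
    ["TOY-0003", "Negative", "Negative", "Negative", "[Not Available]"],
    ["TOY-0004", "Positive", "Negative", "[Not Available]", "[Not Available]"],
]
toy_text = "\n".join("\t".join(fields) for fields in toy_rows) + "\n"

subtypes = read_subtypes(io.StringIO(toy_text))
for patient, label in subtypes.items():
    print(patient, label)
```

For a real file, pass an open file handle instead of the `io.StringIO` object. Patients with equivocal or missing results end up as "Unknown" here, so it is worth counting those before comparing against your own subtypes.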
METABRIC samples were also profiled by IHC and FISH, I believe, but it is a more 'locked down' dataset, and (from what I understand) they are careful about which groups they grant access to. Take a look: