Question

How to access TCGA samples that were treated with a specific drug?

6 months ago

Qroid ▴ 40

I'm new to working with TCGA, and I'd like to compare RNA-seq expression of a certain gene between responders and non-responders to a given treatment. My understanding is that there are a number of tools for accessing TCGA data. I'd prefer to use the R package TCGAbiolinks, for example by following something like this vignette.

My issue is that I don't know how to explore the metadata to find patient samples that received a certain treatment, along with their drug response. Basically, I don't know which samples to download! I see TCGAbiolinks vignettes for searching the GDC database, but I don't see how to search by a specific drug. I've also tried exploring https://portal.gdc.cancer.gov/, but again, I don't know how to search by a specific drug.

I'm interested in repeating this procedure for a few drug treatments (e.g. taxanes).

TCGA RNA-seq • 990 views

ADD COMMENT • link 6 months ago by Qroid ▴ 40

In the GDC portal, turn on the "therapeutic agents" toggle under "Clinical Data Analysis".

[Screenshot: drug]

For breast cancer, for example, you can see:

[Screenshot: drug]

Unfortunately, the drug info does not appear to be searchable.
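
Outside the browser, the GDC API accepts JSON filters, so a query by agent can at least be sketched. The snippet below only builds the request parameters for the `cases` endpoint and does not send anything. The field name `diagnoses.treatments.therapeutic_agents`, the project ID `TCGA-BRCA`, and the agent spellings are assumptions that should be checked against the GDC data dictionary and the values shown in the Therapeutic Agents tab.

```python
import json

# Assumed GDC field names; verify against the GDC data dictionary.
AGENT_FIELD = "diagnoses.treatments.therapeutic_agents"
PROJECT_FIELD = "project.project_id"


def build_case_filter(project_id, agents):
    """Return a GDC-style filter: cases in one project with any listed agent."""
    return {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": PROJECT_FIELD, "value": [project_id]}},
            {"op": "in", "content": {"field": AGENT_FIELD, "value": list(agents)}},
        ],
    }


def build_request_params(project_id, agents, size=100):
    """Assemble query parameters for GET https://api.gdc.cancer.gov/cases."""
    return {
        "filters": json.dumps(build_case_filter(project_id, agents)),
        "fields": ",".join(["submitter_id", AGENT_FIELD]),
        "format": "JSON",
        "size": str(size),
    }


# Example: two taxanes in the breast cancer project (spellings are assumptions).
params = build_request_params("TCGA-BRCA", ["Paclitaxel", "Docetaxel"])
print(params["filters"])
```

The resulting dictionary could be passed as query parameters to any HTTP client. If the field names hold, the returned `submitter_id` values would be the patient barcodes to match against RNA-seq samples.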

ADD REPLY • link 6 months ago by GenoMax 147k

Thank you! Naive question, but does that list include all treatments in TCGA? I.e., is anything missing from it, or is that everything?

Also are you aware of a way to do this in R, e.g. using TCGAbiolinks?

ADD REPLY • link 6 months ago by Qroid ▴ 40

No, that was only for breast cancer. You could try selecting all data to see whether all treatments in the set show up.
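
One way to run the same check outside the browser is to tally the agents across all case records, whatever their source (an API response or a downloaded clinical table). The sketch below uses a few made-up records; the nested layout (`diagnoses`, then `treatments`, then `therapeutic_agents`) is an assumption modelled on GDC-style JSON, and the case IDs are hypothetical.

```python
from collections import Counter

# Hypothetical records in an assumed GDC-style nested layout.
cases = [
    {"submitter_id": "CASE-A", "diagnoses": [
        {"treatments": [{"therapeutic_agents": "Paclitaxel"},
                        {"therapeutic_agents": "Tamoxifen"}]}]},
    {"submitter_id": "CASE-B", "diagnoses": [
        {"treatments": [{"therapeutic_agents": "Paclitaxel"}]}]},
    {"submitter_id": "CASE-C", "diagnoses": [
        {"treatments": [{"therapeutic_agents": None}]}]},
]


def agents_for_case(case):
    """Collect the distinct agents recorded for one case; None becomes 'missing'."""
    found = set()
    for diagnosis in case.get("diagnoses", []):
        for treatment in diagnosis.get("treatments", []):
            found.add(treatment.get("therapeutic_agents") or "missing")
    return found


def tally_agents(case_records):
    """Count how many cases list each agent (a case counts once per agent)."""
    counts = Counter()
    for case in case_records:
        counts.update(agents_for_case(case))
    return counts


agent_counts = tally_agents(cases)
for agent, n in agent_counts.most_common():
    print(agent, n)
```

Counting each case once per agent keeps patients with repeated courses of the same drug from inflating the totals.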

No idea about how to do this in R.

Zhenyu Zhang participates here and seems to have insights into TCGA data.

ADD REPLY • link 6 months ago by GenoMax 147k

Sorry, I should have been more specific. By "that list" I mean what's populated in the Therapeutic Agents tab when no filters are applied. I see 96 agents including "missing".

I want to reproduce a result where authors claimed to split TCGA samples by anti-PD-L1 response, but I'm not immediately seeing anti-PD-L1 (e.g. Atezolizumab, Avelumab, Durvalumab) drugs in that tab. I'm new to TCGA and anti-PD-L1 drugs, though, so I'm not sure where my error is.

ADD REPLY • link 6 months ago by Qroid ▴ 40

These TCGA samples were collected 10 to 20 years ago, so there is no way they were treated with modern immunotherapy.

Are you sure that paper was talking about real treatment, rather than predicted response based on molecular signatures?

ADD REPLY • link 6 months ago by Zhenyu Zhang ★ 1.2k

Thank you. Maybe it's best to confirm with the authors how they're getting responder/non-responder labels. I'll do that now.

ADD REPLY • link 6 months ago by Qroid ▴ 40

It is possible that these drugs were not used directly in TCGA. The authors could have looked for mutations known to be targeted by these drugs and then checked for those mutations in TCGA samples. For example, https://jeccr.biomedcentral.com/articles/10.1186/s13046-022-02332-2#MOESM2 says the following:

by transcriptomic analysis of The Cancer Genome Atlas (TCGA) dataset we found that DDR mutant NSCLC displayed high STING pathway gene expression.

ADD REPLY • link 6 months ago by GenoMax 147k

Thanks. Really appreciate your help with this. This site is such a great resource.

The methods aren't totally clear on how they're getting the PD-L1 responder/non-responder labels, although the legend makes it sound like the data are from TCGA patients who received anti-PD-L1. I'm specifically looking at the middle panel in A), where they've split by anti-PD-L1 response.

[Image: figure legend]

(Relevant Methods section) RNA-seq analysis of patient treatment and outcome data: Kaplan-Meier curves were generated using TCGA RNA-seq data through the cBio portal, splitting the cohorts according to an ITGB6 mRNA expression threshold of 0.25 standard deviation above and below the mean for the ITGB6 high and ITGB6 low cohorts, respectively. ITGB6 expression and ICB response were evaluated with the Kaplan-Meier Plotter tool and the ITGB6 expression levels between ICB responders and non-responders were quantified using the ROC Plotter tool [19, 20]. The data was extracted using Python and the figures were generated using Prism.
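
For reference, the expression split described in that Methods section can be sketched as follows. This is one reading of the text, not the authors' code: it assumes "high" means above mean + 0.25 SD, "low" means below mean - 0.25 SD, and samples in between are dropped. The sample SD is used because the paper does not say which SD was meant, and the sample IDs and expression values are hypothetical.

```python
from statistics import mean, stdev


def split_by_expression(expression, k=0.25):
    """Label samples 'high' (> mean + k*SD) or 'low' (< mean - k*SD).

    Samples between the two cut-offs are left out of both groups.
    """
    values = list(expression.values())
    mu = mean(values)
    sd = stdev(values)  # sample SD; an assumption, the paper does not specify
    upper, lower = mu + k * sd, mu - k * sd
    groups = {"high": [], "low": []}
    for sample, value in expression.items():
        if value > upper:
            groups["high"].append(sample)
        elif value < lower:
            groups["low"].append(sample)
    return groups


# Hypothetical expression values for five samples.
expression = {"S1": 1.0, "S2": 2.0, "S3": 3.0, "S4": 4.0, "S5": 5.0}
groups = split_by_expression(expression)
print(groups)
```

Note that this only reproduces the high/low expression cohorts. The responder/non-responder labels come from the Kaplan-Meier Plotter and ROC Plotter tools cited in the Methods, not from this split.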

ADD REPLY • link 6 months ago by Qroid ▴ 40