Gene expression for a specific gene across multiple cancer types in TCGA datasets
14 months ago
pt.taklifi ▴ 60

Hello everyone, I am trying to make a box plot of the expression of a gene across multiple cancer types (BRCA, COAD and PRAD). I know the RTCGA package in R can produce box plots like this:

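library(RTCGA)         # expressionsTCGA(), theme_RTCGA()
library(RTCGA.rnaseq)  # BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq
library(dplyr)
library(ggplot2)

expressionsTCGA(BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq,
                extract.cols = NULL) %>%
  rename(cohort = dataset,
         VENTX = `VENTX|27287`) %>%  # backticks are needed because of the "|"
  filter(substr(bcr_patient_barcode, 14, 15) == "01") %>%  # cancer samples
  ggplot(aes(y = log1p(VENTX),
             x = reorder(cohort, log1p(VENTX), median),
             fill = cohort)) +
  geom_boxplot() +
  theme_RTCGA() +
  scale_fill_brewer(palette = "Dark2")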


But I'm not sure how to specify the gene of interest. For example, how can I get the expression of the "KLK2" gene across all the cancer types I mentioned above?

gene_expression TCGA RTCGA • 508 views
14 months ago

I think the TCGA dataset uses Ensembl IDs as the identifier, and the gene symbols should also be in the expression matrix (I checked this for the PRAD dataset). So you just have to subset the expression matrix by the Ensembl ID or the gene name of your gene of interest.

exp.mat[exp.mat$GeneSymbols == "KLK2", ]  # rows are genes, columns are samples
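
For the RTCGA route used in the question, the columns seem to be named in a `SYMBOL|EntrezID` style (as in `VENTX|27287`), so one option is to look the column name up by symbol instead of typing the ID by hand. This is only a rough sketch: `find_gene_column` is a made-up helper, the toy column names are invented, and the plotting part assumes the `RTCGA`, `RTCGA.rnaseq`, `dplyr` and `ggplot2` packages are installed and that `PRAD.rnaseq` really has a column starting with `KLK2|`.

```r
# Hypothetical helper (not part of RTCGA): return the column names that
# start with "<symbol>|", i.e. the "SYMBOL|EntrezID" style from the question.
find_gene_column <- function(col_names, symbol) {
  col_names[startsWith(col_names, paste0(symbol, "|"))]
}

# Toy check with invented column names, just to show the matching rule:
# "GENEAB|2" must not match a search for "GENEA".
toy_cols <- c("bcr_patient_barcode", "GENEA|1", "GENEAB|2")
toy_hit <- find_gene_column(toy_cols, "GENEA")
print(toy_hit)

# The real data are only touched if all the assumed packages are available.
pkgs <- c("RTCGA", "RTCGA.rnaseq", "dplyr", "ggplot2")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  library(RTCGA)
  library(RTCGA.rnaseq)
  library(dplyr)
  library(ggplot2)

  # Assumption: exactly one column matches; stop early otherwise.
  klk2_col <- find_gene_column(colnames(PRAD.rnaseq), "KLK2")
  stopifnot(length(klk2_col) == 1)

  p <- expressionsTCGA(BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq,
                       extract.cols = klk2_col) %>%
    rename(cohort = dataset,
           KLK2 = all_of(klk2_col)) %>%
    filter(substr(bcr_patient_barcode, 14, 15) == "01") %>%  # cancer samples
    ggplot(aes(y = log1p(KLK2),
               x = reorder(cohort, log1p(KLK2), FUN = median),
               fill = cohort)) +
    geom_boxplot() +
    theme_RTCGA() +
    scale_fill_brewer(palette = "Dark2")
  print(p)
}
```

Passing the matched name to `extract.cols` should also avoid pulling every gene column into memory, which `extract.cols = NULL` does.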


FYI, the little hand icon at the bottom lets you grab and drag your posts, so you could simply have dragged the comment into the answer field without reposting. Just for the future ;-)


Noted, thanks :)