Enrichment analysis on TCGA data
5.2 years ago

Hi,

I have a list of differentially expressed genes between tumor and normal samples, determined from TCGA data (BRCA). I am trying to cluster the genes into related groups and determine gene annotations for each cluster. The clustering is no problem, but annotating the clusters is proving difficult. I have tried the TCGAbiolinks function TCGAanalyze_EAcomplete, but I do not know how to make sense of its output. I have followed the TCGAbiolinks vignette and made the EA barplot (I could only get this to work for the dataset provided, not for a different dataset downloaded from the GDC), but I don't know how to access the genes that make up the different bars in the plot. I would like to find, for instance, a set of genes that are related to T-cell activity.

Thank you, David

Hi Ben,

If the query is not related to the main topic discussed, please make a separate post to discuss.

Tip: Also check previous posts before posting a question that may already have been answered. Example: "TCGA: how to download matched normal and tumor samples from TCGA website".

5.2 years ago
EagleEye 7.3k

Method 1:

1) Use GeneSCF to extract the complete Gene Ontology in a simple tab-separated text format.

./prepare_database -db=GO_all -org=goa_human


2) Extract the genes related to your keyword 'T cell'

grep "T cell" geneSCF-master-source-v1.1-p2/class/lib/db/goa_human/GO_all_sym.txt | cut -f2 | sed 's/.$//' | sed 's/,/\n/g' | awk '!x[$0]++'


3) Overlap those genes with your list of genes
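
A minimal sketch of the overlap in step 3, assuming you saved the output of step 2 to one file and your own gene list to another, with one gene symbol per line in each. The function name `overlap_genes` and the file names in the example are placeholders, not part of GeneSCF.

```shell
#!/usr/bin/env bash
# Print the genes that occur in both files (one gene symbol per line in each).
# overlap_genes is a hypothetical helper, not a GeneSCF command.
overlap_genes() {
    local term_genes="$1"   # e.g. genes extracted with the grep pipeline in step 2
    local my_genes="$2"     # e.g. your differentially expressed genes
    # -F: treat patterns as fixed strings, -x: match whole lines only,
    # -f: read the patterns (gene symbols) from the first file
    grep -Fxf "$term_genes" "$my_genes" | sort -u
}

# Example with hypothetical file names:
# overlap_genes tcell_genes.txt my_de_genes.txt > tcell_de_genes.txt
```

The match is case-sensitive and exact, so both files need to use the same identifier type (here gene symbols) for the overlap to be meaningful.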

Method 2:

If you are confident that the genes you have are related to 'T cell', try a simple gene enrichment analysis with GeneSCF and check whether the 'T cell activity' terms pop up.

Hi EagleEye,

Thank you very much for your response. That seems to be what I am trying to do; however, I would like to keep my analysis in R. Is there a tool that works in R and does a similar task?

Thank you, David

I do not know if this task can be done easily in R. Alternatively, you can import the tab-separated text file obtained with GeneSCF ('geneSCF-master-source-v1.1-p2/class/lib/db/goa_human/GO_all_sym.txt') into R and match it against your gene list.