Are there certain mutations ubiquitous in ALL cancers?
8.4 years ago
gooshiroy ▴ 20

I am trying to test an algorithm for analyzing multivariate data sets, and I would like to test it on cancer gene data. Are there any genes that are known to be mutated in ALL or MOST cancers (I need at least 10)? I understand this might be a tricky question, as cancers exhibit a wide array of mutations and they are context-specific. The only ones I can think of off the top of my head that are found in all or most cancers are RAC and P53. That being said, is there any way I can find those specific mutations for specific cancers in TCGA? For example, suppose I wanted a prostate cancer RAC sequence. I understand there could be 100 in the literature. Is there any way I can pick any one of them from TCGA? I'm having a hard time finding out how.

Alternatively: I would like to see if it is possible to select multiple diseases from TCGA and filter by common mutational data.

gene mutations cancer • 2.5k views
8.4 years ago

MSKCC's cBioPortal is a good place to start exploring for such mutations that are agnostic of tissue of origin. At this link you will find a (shameless plug) list of genes recurrently altered across 12 different cancer types. Use that to pick your candidate genes like TP53, PIK3CA, PTEN, KRAS, etc. that you can then feed into cbioportal.org. You'll first get a cross-cancer summary. But then click on the "Mutations" tab to see the mutations plotted across genes. Mouse over recurrent mutations like KRAS G12 or PIK3CA H1047 to see their distribution across cancer types. Be sure to click the "Customize" button to reduce the y-axis limit, which will help reveal lower frequency hotspot mutations like KRAS G13 and PIK3CA R88.

Here's what I see reported for KRAS G13 mutations:

Cancer type                        Count
Colorectal Adenocarcinoma          22
Lung Adenocarcinoma                17
Mixed Cancer Types                 15
Stomach Adenocarcinoma             10
Endometrial Carcinoma              9
Multiple Myeloma                   9
Cervical Squamous Cell Carcinoma   2
Acute Myeloid Leukemia             1
Acute Lymphoid Leukemia            1
Pancreatic Adenocarcinoma          1

Here is another paper where we systematically looked for recurrent somatic SNVs across ~11k tumors.
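If you download mutation calls instead of reading them off the plot, a per-cancer-type tally like the KRAS G13 table can be reproduced in a few lines. This is only a minimal sketch: the record fields (`gene`, `protein_change`, `cancer_type`) and the example rows are made up for illustration and are not cBioPortal's actual export format or data.

```python
import re
from collections import Counter

# Matches protein changes such as "G13D" or "p.G13D":
# reference residue, position, altered residue.
PROTEIN_CHANGE = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z*])$")


def residue_position(protein_change):
    """Return the residue number from a change like 'p.G13D', or None if unparseable."""
    match = PROTEIN_CHANGE.match(protein_change)
    return int(match.group(2)) if match else None


def hotspot_counts(records, gene, position):
    """Count records per cancer type that hit `gene` at residue `position`."""
    counts = Counter()
    for rec in records:
        if rec["gene"] == gene and residue_position(rec["protein_change"]) == position:
            counts[rec["cancer_type"]] += 1
    return counts


# Made-up example rows (hypothetical field names), NOT real cBioPortal data.
records = [
    {"gene": "KRAS", "protein_change": "p.G13D", "cancer_type": "Colorectal Adenocarcinoma"},
    {"gene": "KRAS", "protein_change": "G13C", "cancer_type": "Lung Adenocarcinoma"},
    {"gene": "KRAS", "protein_change": "p.G13D", "cancer_type": "Colorectal Adenocarcinoma"},
    {"gene": "KRAS", "protein_change": "p.G12V", "cancer_type": "Pancreatic Adenocarcinoma"},
    {"gene": "PIK3CA", "protein_change": "p.H1047R", "cancer_type": "Endometrial Carcinoma"},
]

g13 = hotspot_counts(records, "KRAS", 13)
for cancer_type, count in g13.most_common():
    print(f"{cancer_type:<34}{count}")
```

Grouping by residue number rather than by exact amino acid change is what lets G13D and G13C count toward the same hotspot.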


This is phenomenal! Thank you so much!

EDIT: I was wondering, is there any way I could get the actual sequence for certain cases?


Ensembl's Variant Effect Predictor (VEP) has a plugin that generates a FASTA file containing mutated mRNA sequence. It will only do this for one mutation at a time. If this is what you want, you can find what you need here and here.
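To make the idea concrete, here is a minimal sketch of what "one mutation applied to a reference sequence, written out as FASTA" amounts to. It is not the VEP plugin and does not call VEP; the function names are hypothetical, and the short peptide is just an illustrative stand-in for a real reference sequence. The same function works on any sequence string, nucleotide or protein.

```python
def apply_substitution(sequence, position, ref, alt):
    """Return `sequence` with the 1-based `position` changed from `ref` to `alt`."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != ref:
        # Guards against applying a variant to the wrong transcript or numbering.
        raise ValueError(f"expected {ref} at {position}, found {sequence[index]}")
    return sequence[:index] + alt + sequence[index + 1:]


def to_fasta(header, sequence, width=60):
    """Format one record as FASTA text, wrapping the sequence at `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{header}\n" + "\n".join(lines) + "\n"


# Illustrative 20-residue peptide with glycines at positions 12 and 13.
toy_peptide = "MTEYKLVVVGAGGVGKSALT"
mutated = apply_substitution(toy_peptide, 13, "G", "D")
print(to_fasta("toy_G13D", mutated), end="")
```

The reference check is the important part: it fails loudly if the position and the sequence do not agree, which is the usual symptom of mixing transcripts.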

8.4 years ago
Emily 23k

BRAF is mutated in roughly half of melanomas, and V600E accounts for about 80% of those BRAF mutations.

8.4 years ago
Collin ▴ 1000

If you are interested in specific hotspot missense mutations, you may want to query MuPIT (shameless plug) with your gene of interest. It lists amino acid residues that have a statistically significantly higher local mutation density in 3D protein structures (i.e. missense hotspots), categorized by 31 TCGA cancer types. This is probably a more objective way to indicate whether mutations at a certain residue are cancer drivers than eyeballing recurrent codon positions in cBioPortal. From my work developing the statistical algorithm for detecting missense hotspots, there really are no driver missense mutations that appear in absolutely all cancer types, which speaks to the heterogeneity of cancer, although a few well-known genes have mutations that appear in the majority of cancer types available in TCGA.

You can follow Cyriac's suggestion of identifying relevant cancer driver genes. Some of the most prominent for hotspot missense mutations across cancer types, which may fit what you are looking for, are FBXW7 (residues 465, 479, 505, etc.), PIK3CA (1047, 542, 545, etc.), KRAS (12, 13, and 61), HRAS (12, 13, and 61), BRAF (600, etc.), and TP53 (many residues).

You can query your gene by using the following format for the URL: http://mupit.icm.jhu.edu/?gene=GENENAME, for example KRAS would be http://mupit.icm.jhu.edu/?gene=KRAS. Looking at the "TCGA 3D Mutation Hot Regions" column, you can see which cancer types have hotspots. Clicking the "+" button will show you how many hotspot regions there are. Hovering over a hotspot region, for example "hr_1", will pop up a tooltip telling you which residues have statistically significant hotspot mutations. To show the hotspot region on the protein structure, click on the hotspot region (e.g. "hr_1"). You can switch between protein structures by clicking on a different PDB ID in the left column.

One inconvenience, though, is that protein structures sometimes use a different residue numbering convention than the one typically used for the gene. For example, some protein structures of BRAF may list a hotspot at residue 599 rather than 600, but it is actually the same residue. The benefit of looking at hotspots on protein structures is that it may give you a clue about the functional effect of the missense mutation. You could then follow Cyriac's suggestion of using cBioPortal to obtain the mutation data for the relevant hotspot residues.
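If you want to check several genes, the URL pattern above is easy to generate in a script. This is a minimal sketch that only builds the links, it does not fetch or parse the pages. The `to_gene_numbering` helper and its `offset` argument are hypothetical: the right offset depends on the particular PDB structure and has to be looked up for each one.

```python
from urllib.parse import urlencode

MUPIT_BASE = "http://mupit.icm.jhu.edu/"


def mupit_url(gene):
    """Build the MuPIT query URL for a gene symbol, following ?gene=GENENAME."""
    return f"{MUPIT_BASE}?{urlencode({'gene': gene})}"


def to_gene_numbering(structure_residue, offset):
    """Convert a structure's residue number to gene numbering.

    `offset` is an assumption supplied by the user for a given structure,
    e.g. 1 when a structure labels BRAF residue 600 as 599.
    """
    return structure_residue + offset


genes = ["FBXW7", "PIK3CA", "KRAS", "HRAS", "BRAF", "TP53"]
urls = {gene: mupit_url(gene) for gene in genes}

for gene, url in urls.items():
    print(gene, url)
```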


Everyone in this question has given great answers, and you are definitely no exception. I like this link!

8.4 years ago

For a quick look, you can check the ICGC data portal instead of TCGA. Just click on 'Genes' and you will get the list of the most frequently mutated genes across all ICGC samples. Unsurprisingly, the most commonly mutated gene in cancer is TTN, which is a well-known false positive: it is an extremely long gene, so it accumulates many passenger mutations. I recommend clicking the "Curated Gene Set" filter to remove these false positives.
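The same ranking and filtering can be sketched offline if you have a list of (sample, gene) mutation calls. This is only an illustration of the idea: the sample IDs and calls are made up, and the `curated` set is a hypothetical stand-in for whatever curated driver list you trust, not the portal's actual "Curated Gene Set".

```python
from collections import defaultdict


def rank_genes(mutations, curated=None):
    """Rank genes by the number of distinct samples in which they are mutated.

    If `curated` is given, only genes in that set are kept.
    """
    samples_by_gene = defaultdict(set)
    for sample_id, gene in mutations:
        if curated is None or gene in curated:
            samples_by_gene[gene].add(sample_id)
    ranked = [(gene, len(samples)) for gene, samples in samples_by_gene.items()]
    # Most frequently mutated first; ties broken alphabetically.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Made-up calls, NOT real ICGC data. TTN shows up everywhere, as long genes tend to.
mutations = [
    ("S1", "TTN"), ("S1", "TP53"), ("S1", "TTN"),
    ("S2", "TTN"), ("S2", "KRAS"),
    ("S3", "TTN"), ("S3", "TP53"),
    ("S4", "TTN"), ("S4", "TP53"), ("S4", "KRAS"),
]
curated = {"TP53", "KRAS", "PIK3CA"}  # hypothetical curated driver list

unfiltered = rank_genes(mutations)
filtered = rank_genes(mutations, curated)
print(unfiltered)
print(filtered)
```

Counting distinct samples rather than raw mutation calls keeps a gene with several hits in one tumor from being over-weighted.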


This link is very, very helpful. Thank you.
