Question: Are there certain mutations ubiquitous in ALL cancers?
4.2 years ago by
United States
gooshiroy20 wrote:

I am trying to test an algorithm for analyzing multivariate data sets, and I would like to test it on cancer gene data. Are there any genes that are known to have mutations in ALL or MOST cancers (I need at least 10)? I understand this might be a tricky question, as cancers exhibit a wide array of mutations and they are context specific. The only ones I can think of off the top of my head that are found in all or most cancers are RAC and P53. That being said, is there any way I can find those specific mutations for specific cancers in TCGA? For example, suppose I wanted a prostate cancer RAC sequence. I understand there could be 100 in the literature. Is there any way I can pick any one from TCGA? I'm having a hard time finding out how.

Alternatively: I would like to see if it is possible to select multiple diseases from TCGA and filter by common mutational data.

cancer mutations gene • 1.3k views
modified 4.2 years ago by Giovanni M Dall'Olio26k • written 4.2 years ago by gooshiroy20
4.2 years ago by
Cyriac Kandoth5.5k
Memorial Sloan Kettering, New York, USA
Cyriac Kandoth5.5k wrote:

MSKCC's cBioPortal is a good place to start exploring for mutations that are agnostic of tissue of origin. At this link you will find a (shameless plug) list of genes recurrently altered across 12 different cancer types. Use that to pick candidate genes like TP53, PIK3CA, PTEN, KRAS, etc. that you can then feed into a cBioPortal query. You'll first get a cross-cancer summary, but then click on the "Mutations" tab to see the mutations plotted across genes. Mouse over recurrent mutations like KRAS G12 or PIK3CA H1047 to see their distribution across cancer types. Be sure to click the "Customize" button to reduce the y-axis limit, which will help reveal lower-frequency hotspot mutations like KRAS G13 and PIK3CA R88.

Here's what I see reported for KRAS G13 mutations:

Cancer type                         Count
Colorectal Adenocarcinoma           22
Lung Adenocarcinoma                 17
Mixed Cancer Types                  15
Stomach Adenocarcinoma              10
Endometrial Carcinoma               9
Multiple Myeloma                    9
Cervical Squamous Cell Carcinoma    2
Acute Myeloid Leukemia              1
Acute Lymphoid Leukemia             1
Pancreatic Adenocarcinoma           1
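
A per-cancer-type tally like the KRAS G13 one can also be computed from any downloaded mutation table. The following is only a minimal sketch, assuming each record is a (gene, protein change, cancer type) tuple. The record layout, the function name `count_by_cancer_type`, and the sample rows are hypothetical; they are not cBioPortal's export format and not real counts.

```python
import re
from collections import Counter

# Hypothetical records: (gene, protein change, cancer type).
# These rows are made up for illustration; they are not real data.
RECORDS = [
    ("KRAS", "G13D", "Colorectal Adenocarcinoma"),
    ("KRAS", "G13C", "Lung Adenocarcinoma"),
    ("KRAS", "G13D", "Colorectal Adenocarcinoma"),
    ("KRAS", "G12D", "Pancreatic Adenocarcinoma"),
    ("PIK3CA", "H1047R", "Endometrial Carcinoma"),
]

# Matches simple missense notation such as "G13D" or "p.G13D".
_CHANGE = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def count_by_cancer_type(records, gene, residue):
    """Count mutations at one residue of one gene, grouped by cancer type."""
    counts = Counter()
    for rec_gene, change, cancer_type in records:
        match = _CHANGE.match(change)
        if match is None or rec_gene != gene:
            continue  # skip other genes and anything that is not a simple missense change
        if int(match.group(2)) == residue:
            counts[cancer_type] += 1
    return counts


if __name__ == "__main__":
    for cancer_type, n in count_by_cancer_type(RECORDS, "KRAS", 13).most_common():
        print(f"{cancer_type}\t{n}")
```

Grouping on the residue number rather than the full protein change keeps G13D and G13C in the same hotspot, which matches how the table above is reported.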

Here's another paper where we systematically looked for recurrent somatic SNVs across ~11k tumors.

modified 4.2 years ago • written 4.2 years ago by Cyriac Kandoth5.5k

This is phenomenal! Thank you so much!

EDIT: I was wondering, is there any way I could get the actual sequence for certain cases?

modified 11 weeks ago by RamRS25k • written 4.2 years ago by gooshiroy20

Ensembl's Variant Effect Predictor (VEP) has a plugin that generates a FASTA file containing mutated mRNA sequence. It will only do this for one mutation at a time. If this is what you want, you can find what you need here and here.
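
To make the "one mutation at a time" idea concrete, here is a minimal sketch of applying a single substitution to a sequence and writing it out as FASTA. This is not the VEP plugin and does not reproduce its behavior; the function names `apply_substitution` and `to_fasta`, the toy sequence, and the header text are all assumptions made for illustration.

```python
def apply_substitution(sequence, position, ref, alt):
    """Return `sequence` with the 1-based `position` changed from `ref` to `alt`."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != ref:
        # Guard against off-by-one errors or the wrong transcript.
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + alt + sequence[position:]


def to_fasta(header, sequence, width=60):
    """Format one record as FASTA, wrapping the sequence at `width` characters."""
    lines = [sequence[i : i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Toy sequence, assumed for illustration only; not fetched from any database.
TOY_SEQ = "ATGACTGAATATAAACTTGTG"

if __name__ == "__main__":
    mutated = apply_substitution(TOY_SEQ, 4, "A", "G")
    print(to_fasta("toy_transcript c.4A>G", mutated), end="")
```

Checking the reference base before substituting is the useful part: it fails loudly if the coordinates and the sequence do not belong together.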

modified 4.2 years ago • written 4.2 years ago by Cyriac Kandoth5.5k
4.2 years ago by
Emily_Ensembl20k wrote:

BRAF is mutated in roughly half of melanomas, and V600E accounts for about 80% of those BRAF mutations.

written 4.2 years ago by Emily_Ensembl20k
4.2 years ago by
United States
Collin700 wrote:

If you are interested in specific hotspot missense mutations, you may want to query MuPIT (shameless plug) with your gene of interest. It lists amino acid residues that have a statistically significantly higher local mutation density in 3D protein structures (i.e. missense hotspots), categorized by 31 TCGA cancer types. This is probably a more objective way to indicate whether mutations at a certain residue are cancer drivers than eyeballing recurrent codon positions in cBioPortal. From my work developing the statistical algorithm for detecting missense hotspots, there really are no driver missense mutations that appear in absolutely all cancer types, which speaks to the heterogeneity of cancer, although a few well-known genes have mutations that appear in the majority of cancer types available in TCGA.
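
The "all versus majority of cancer types" distinction can be checked directly once you have a list of hotspots per cancer type. The sketch below is a minimal illustration, not the hotspot detection algorithm itself; the input layout, the function names, and the example data (placeholder cancer type labels and arbitrary hotspot assignments) are hypothetical.

```python
def hotspot_coverage(hotspots_by_cancer):
    """Map each hotspot to the fraction of cancer types in which it appears."""
    n_types = len(hotspots_by_cancer)
    counts = {}
    for hotspots in hotspots_by_cancer.values():
        for hotspot in set(hotspots):  # count each hotspot once per cancer type
            counts[hotspot] = counts.get(hotspot, 0) + 1
    return {hotspot: n / n_types for hotspot, n in counts.items()}


def widespread(hotspots_by_cancer, min_fraction=0.5):
    """Return hotspots present in at least `min_fraction` of the cancer types."""
    coverage = hotspot_coverage(hotspots_by_cancer)
    return sorted(h for h, frac in coverage.items() if frac >= min_fraction)


# Made-up example: keys are placeholder cancer types, values are (gene, residue) hotspots.
EXAMPLE = {
    "type_A": [("KRAS", 12), ("TP53", 273)],
    "type_B": [("KRAS", 12), ("BRAF", 600)],
    "type_C": [("TP53", 273), ("KRAS", 12)],
    "type_D": [("PIK3CA", 1047)],
}

if __name__ == "__main__":
    print(widespread(EXAMPLE, min_fraction=0.5))  # hotspots in at least half of the types
    print(widespread(EXAMPLE, min_fraction=1.0))  # hotspots in every type
```

With a threshold of 1.0 the toy data returns nothing, which mirrors the point above: requiring presence in every cancer type is usually too strict, so a majority threshold is the more practical filter.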

You can follow Cyriac's suggestion of identifying relevant cancer driver genes. Some of the most prominent genes for hotspot missense mutations across cancer types, which may fit what you are looking for, are FBXW7 (residues 465, 479, 505, etc.), PIK3CA (1047, 542, 545, etc.), KRAS (12, 13, and 61), HRAS (12, 13, and 61), BRAF (600, etc.), and TP53 (many residues). You can query your gene in MuPIT by putting the gene name in the URL, for example KRAS.

Looking at the "TCGA 3D Mutation Hot Regions" column, you can see which cancer types have hotspots. Clicking the "+" button will show you how many hotspot regions there are. Hovering over a hotspot region, for example "hr_1", will pop up a tooltip telling you which residues have statistically significant hotspot mutations. To show the hotspot region on the protein structure, click on the hotspot region (e.g. "hr_1"). You can switch between protein structures by clicking on a different PDB ID in the left column.

One inconvenience, though, is that protein structures sometimes use different residue numbering conventions than those typically used for the gene. For example, some protein structures of BRAF may list a hotspot at residue 599 rather than 600, but it is actually the same residue. The benefit of looking at hotspots on protein structures is that it may give you a clue about the functional effect of the missense mutation. You could then follow Cyriac's suggestion of using cBioPortal to obtain the mutation data for the relevant hotspot residues.
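
If you collect hotspot residues from several structures, it helps to convert them to the gene's numbering before comparing them with cBioPortal data. The sketch below assumes a constant offset per structure, which is a simplification (real structures can have gaps or insertions). The function name `to_gene_numbering` and the structure labels are hypothetical placeholders, not real PDB IDs.

```python
def to_gene_numbering(structure_residues, offset):
    """Shift residue numbers from a structure's numbering to the gene's numbering."""
    return [residue + offset for residue in structure_residues]


# Hypothetical offsets per structure; the keys are placeholders, not real PDB IDs.
# An offset of 1 means the structure numbers each residue one lower than the gene does.
OFFSETS = {"structure_A": 0, "structure_B": 1}

if __name__ == "__main__":
    # A hotspot listed at 599 in "structure_B" corresponds to residue 600 in the gene.
    print(to_gene_numbering([599], OFFSETS["structure_B"]))
```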

modified 11 weeks ago by RamRS25k • written 4.2 years ago by Collin700

Everyone in this thread has given great answers, and you are definitely no exception. I like this link!

modified 11 weeks ago by RamRS25k • written 4.2 years ago by gooshiroy20
4.2 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

For a quick look, you can check the ICGC data portal instead of TCGA. Just click on 'Genes' and you will get the list of the most frequently mutated genes across all ICGC samples. The most commonly mutated gene in that list is TTN, which is a well-known false positive: it is one of the longest human genes, so it accumulates many passenger mutations. I recommend clicking the "Curated Gene Set" filter to remove such false positives.
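
The same ranking and filtering can be done offline on any per-sample mutation list. This is a minimal sketch, not the ICGC portal's implementation; the function name `rank_genes`, the sample IDs, and the tiny curated set are hypothetical and chosen only to show the effect of the filter.

```python
from collections import Counter


def rank_genes(sample_mutations, curated=None):
    """Rank genes by the number of samples in which they are mutated.

    If `curated` is given, only genes in that set are kept.
    """
    counts = Counter()
    for genes in sample_mutations.values():
        counts.update(set(genes))  # count each gene once per sample
    if curated is not None:
        counts = Counter({g: n for g, n in counts.items() if g in curated})
    # Sort by descending count, then by gene name so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up samples for illustration only.
SAMPLES = {
    "s1": ["TTN", "TP53"],
    "s2": ["TTN", "KRAS"],
    "s3": ["TTN", "TP53"],
    "s4": ["TTN"],
}
CURATED = {"TP53", "KRAS"}  # stand-in for a curated cancer gene set

if __name__ == "__main__":
    print(rank_genes(SAMPLES))           # TTN ranks first when nothing is filtered
    print(rank_genes(SAMPLES, CURATED))  # the curated filter removes it
```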

modified 4.2 years ago • written 4.2 years ago by Giovanni M Dall'Olio26k

This link is very, very helpful. Thank you.

modified 11 weeks ago by RamRS25k • written 4.2 years ago by gooshiroy20