Question: TCGA driver mutation data
24 days ago by
aksam10 wrote:

I would like to download driver mutation data for TCGA patients (in particular lung cancer, but ideally 'pan-cancer').

For example, I would like to be able to find the proportion of patients with adenocarcinoma of the lung who have driver mutations in KRAS, EGFR, TP53, etc.

I came across this paper, 'Comprehensive Characterization of Cancer Driver Genes and Mutations', where they produced a database of 9423 exomes annotated with various putative drivers. Does anyone know how to access their dataset? I can't seem to find any instructions.

If not, which source would you recommend for getting such data (i.e. driver mutation data), and why?

Thanks in advance

24 days ago by
Hamid Ghaedi710 wrote:

A number of different algorithms try to identify driver mutations in cancer mutation data, using diverse approaches. Check out a recent one: MutPanning (v2.0) from the Dana-Farber Cancer Institute and the Broad Institute. Here is the paper. The paper also includes a benchmarking comparison of sorts.


My understanding is that the question refers to driver mutations. MutPanning is a gene-based method. Identifying which specific mutations within those genes are actually driver mutations is a much harder task, as driver genes contain a mixture of passenger and driver mutations.

written 23 days ago by Collin860

Many clinical interpretation guidelines clearly state that a missense mutation in a known disease gene is not, in and of itself, sufficient evidence for the mutation to be labeled oncogenic/pathogenic.

written 23 days ago by Collin860

Thank you for this. It may be useful as a complementary resource to Collin's - will take a look

modified 19 days ago • written 19 days ago by aksam10
23 days ago by
United States
Collin860 wrote:

I'm one of the first authors of that paper.

The data is available on the Genomic Data Commons website page for our paper. Please see the file described as "Mutation Scores and tool aggregation" (Mutation.CTAT.3D.Scores.txt). It contains scores for all missense mutations (~750k mutations).

To get the filtered dataset, you only need to filter based on the flag column for each of CTAT-population ("New_Linear (functional) flag"), CTAT-cancer ("New_Linear (cancer-focused) flag"), and structural clustering ("New_3D mutational hotspot flag"). By convention, a value of "1" indicates a flag for a potential driver mutation according to that approach. The figure of 3,437 mutations comes from keeping any mutation on which at least two of the three approaches agree. The raw scores for CTAT-cancer and CTAT-population are found in columns "eigenscore (cancer)" and "eigenscore (functional)", respectively.
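
A minimal sketch of that "at least two of three flags" filter, in case it helps. The three flag column names are the ones quoted above; everything else is an assumption on my part: that the file is tab-delimited, that flags are stored as the literal string "1", and the `gene` / `protein_change` columns and rows in the toy table, which are made up purely for illustration.

```python
import csv
import io

# Flag column names as quoted in the answer above.
FLAG_COLUMNS = (
    "New_Linear (functional) flag",
    "New_Linear (cancer-focused) flag",
    "New_3D mutational hotspot flag",
)


def select_consensus_drivers(handle, min_votes=2, delimiter="\t"):
    """Return rows flagged "1" by at least `min_votes` of the three approaches.

    Assumes a delimited text file with a header row (tab-delimited by default).
    """
    selected = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        # Count how many approaches flag this mutation as a potential driver.
        votes = sum(
            1 for col in FLAG_COLUMNS if (row.get(col) or "").strip() == "1"
        )
        if votes >= min_votes:
            selected.append(row)
    return selected


# Toy table with hypothetical identifier columns and made-up rows.
header = "\t".join(("gene", "protein_change") + FLAG_COLUMNS)
example = "\n".join(
    [
        header,
        "GENE_A\tp.A1B\t1\t1\t0",  # two approaches agree
        "GENE_B\tp.C2D\t1\t0\t0",  # only one approach
        "GENE_C\tp.E3F\t1\t1\t1",  # all three agree
        "GENE_D\tp.G4H\t0\t0\t0",  # not flagged
    ]
)

consensus = select_consensus_drivers(io.StringIO(example))
print([row["gene"] for row in consensus])
```

For the real file you would pass an open file handle instead of the `io.StringIO` toy table, e.g. `open("Mutation.CTAT.3D.Scores.txt", newline="")`, and check the header first in case the delimiter or flag encoding differs from what is assumed here.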


For loss-of-function mutations in tumor suppressors, you might look at the genes annotated as tumor suppressors in Table S1. Most variant annotation databases regard frameshift indels, nonsense mutations, essential splice site mutations, and stop-loss or start-loss mutations as likely oncogenic in tumor suppressor genes.
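
A rough sketch of that rule, under some assumptions: the mutation calls are MAF-style records with `Hugo_Symbol` and `Variant_Classification` fields, the mutation types above map onto the MAF classification terms listed in `LOF_CLASSES`, and the tumor suppressor set is supplied by you from Table S1. The set and the records below are placeholders for illustration, not data from the paper.

```python
# MAF Variant_Classification terms assumed to correspond to the mutation
# types listed above (frameshift, nonsense, splice site, stop loss, start loss).
LOF_CLASSES = {
    "Frame_Shift_Del",
    "Frame_Shift_Ins",
    "Nonsense_Mutation",
    "Splice_Site",
    "Nonstop_Mutation",
    "Translation_Start_Site",
}


def likely_lof_in_tumor_suppressors(mutations, tumor_suppressors):
    """Keep loss-of-function-type calls that fall in a tumor suppressor gene."""
    return [
        m
        for m in mutations
        if m["Hugo_Symbol"] in tumor_suppressors
        and m["Variant_Classification"] in LOF_CLASSES
    ]


# Placeholder gene set; the real list should come from Table S1.
tumor_suppressors = {"TP53"}

# Made-up records for illustration only.
mutations = [
    {"Hugo_Symbol": "TP53", "Variant_Classification": "Nonsense_Mutation"},
    {"Hugo_Symbol": "TP53", "Variant_Classification": "Missense_Mutation"},
    {"Hugo_Symbol": "KRAS", "Variant_Classification": "Frame_Shift_Del"},
]

lof_calls = likely_lof_in_tumor_suppressors(mutations, tumor_suppressors)
print(lof_calls)
```

Missense mutations are deliberately left out of this rule; those are what the flag columns in the scores file are meant to address.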

written 23 days ago by Collin860

Lastly, if you also want to predict driver missense mutations in new tumor samples outside of TCGA, you could try CHASMplus. Its results were highly consistent with our results from the TCGA PanCanAtlas study, but it substantially simplifies the scoring process (available via OpenCRAVAT).

modified 23 days ago • written 23 days ago by Collin860

This is great - I didn't know that resource page for TCGA existed. This resource/paper is very useful because, as you say, the leap from variants within genes to annotation of 'driver' is difficult - thank you!

written 19 days ago by aksam10

Glad to help. Hopefully this can also help anybody else who has the same question.

written 19 days ago by Collin860

Insightful, Collin. Thanks.

written 19 days ago by Hamid Ghaedi710