Question: Driver mutation detection benchmark
banerjeeshayantan (170) wrote, 14 months ago:

Many studies have developed new methods to find candidate cancer driver genes. I have developed a new driver mutation detection algorithm. Is there a way to test my tool on benchmarking datasets and compare it against the mutation detection algorithms already out there? I am interested only in gold-standard driver mutation datasets (not driver gene datasets). Can you point me to any research articles or give me some idea of how to test my new tool?

modified 14 months ago by Charles Warden (7.9k) • written 14 months ago by banerjeeshayantan (170)

Based on what kind of data? Expression, histone marks, open chromatin?

written 14 months ago by ATpoint (39k)

Based on point mutations, indels, CNAs, etc. The model that I have developed is based on the COSMIC mutation data, but the driver/passenger labels for each mutation in COSMIC are themselves based on the predictions of a tool (namely FATHMM). I want to test my model on known cancer driver mutations.

written 14 months ago by banerjeeshayantan (170)

I thought that we had already identified the driver genes behind the majority of cancers ... (?)

written 14 months ago by Kevin Blighe (65k)

Are there databases that list the mutations (driver/passenger) in each of these known cancer driver genes? If so, are there studies that have taken these known driver/passenger mutations and reported how accurately their methods identify them? I want to compare my model against theirs.

written 14 months ago by banerjeeshayantan (170)
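
For what it's worth, once an independently labelled set of driver/passenger mutations is available, comparing tools on it could look roughly like the sketch below. It assumes each tool's output can be reduced to a yes/no driver call per mutation. The mutation IDs, labels, and tool names are made-up placeholders, not real benchmark data.

```python
from math import sqrt

def confusion_counts(truth, predicted):
    """Return (TP, FP, TN, FN) for binary driver calls.

    truth and predicted map a mutation ID to True (driver) or False (passenger).
    A mutation missing from `predicted` is treated as a passenger call.
    """
    tp = fp = tn = fn = 0
    for mutation, is_driver in truth.items():
        called = predicted.get(mutation, False)
        if is_driver and called:
            tp += 1
        elif is_driver:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def summarize(truth, predicted):
    """Sensitivity, specificity, precision and Matthews correlation coefficient."""
    tp, fp, tn, fn = confusion_counts(truth, predicted)
    nan = float("nan")
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "precision": tp / (tp + fp) if tp + fp else nan,
        "mcc": (tp * tn - fp * fn) / denom if denom else nan,
    }

# Hypothetical benchmark labels (placeholders, not real variants).
truth = {"mut_A": True, "mut_B": True, "mut_C": False,
         "mut_D": False, "mut_E": True, "mut_F": False}

# Hypothetical calls from two tools on the same mutations.
my_tool = {"mut_A": True, "mut_B": True, "mut_C": False,
           "mut_D": True, "mut_E": False, "mut_F": False}
other_tool = {"mut_A": True}  # every other mutation defaults to passenger

for name, calls in [("my_tool", my_tool), ("other_tool", other_tool)]:
    metrics = summarize(truth, calls)
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Reporting several metrics side by side, rather than accuracy alone, is a design choice that matters here because passengers usually far outnumber drivers. The benchmark labels should also not come from the same predictor that produced the training labels.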

Perhaps, in this regard, one ought to consider the definition of a driver gene. I have yet to see a clear definition from a statistical standpoint, or anything that allows us to quantify or qualify a driver gene. Instead, the term 'driver' is used loosely to describe a gene that may be involved in cancer progression or promote tumourigenesis. Driver genes like TP53 are well known and documented and have clear roles in cancer progression. For most others, we have vast amounts of published data that show their heightened expression in tumours; however, functional studies are required to prove each. Thus, even if you have developed a prediction algorithm, it is still in silico and will require functional validation, i.e., in the wet lab.

written 14 months ago by Kevin Blighe (65k)

Thanks for your reply. So how do I find driver mutations/genes validated functionally inside a wet lab? Are there any resources?

written 14 months ago by banerjeeshayantan (170)

I don't think there is a standardised database that stores this information. You would need to read papers and collect the information manually.

written 14 months ago by ATpoint (39k)
Charles Warden (7.9k), Duarte, CA, wrote, 14 months ago:

While I think some information carries greater confidence than other information, I'm not sure "gold standard" is the best term.

My opinion is that having access to specialized knowledge is probably important. For example, for cancer, here are a couple of gene-specific resources:

IARC TP53 Database: http://p53.iarc.fr/

BRCA Exchange: https://brcaexchange.org/

While not cancer related, there is also the CFTR2 reference for cystic fibrosis (which I learned about from Biostars). I would tend to emphasize ClinVar (which has a star system for confidence), although that may not be perfect for all diseases. There is also the COSMIC database, but absence from COSMIC doesn't mean a variant doesn't cause cancer (and I think the number of times any given variant is observed is often low, even though I very much appreciate data sharing as a way to maximize the information available for decision making).

written 14 months ago by Charles Warden (7.9k)
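
As a rough illustration of using the ClinVar star system mentioned above, the sketch below keeps only records at or above a chosen star level. It assumes a tab-delimited export with `Name`, `GeneSymbol`, `ClinicalSignificance`, and `ReviewStatus` columns, and a status-to-stars mapping written from memory of ClinVar's review-status documentation; check the exact status strings against the current ClinVar files before relying on them. The sample rows are placeholders, not real variants, and pathogenic in ClinVar is not necessarily the same thing as being a somatic driver.

```python
import csv
import io

# Assumed mapping from ClinVar review-status text to star count.
# The wording of these strings may differ in current ClinVar releases.
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, single submitter": 1,
    "criteria provided, conflicting interpretations": 1,
    "no assertion criteria provided": 0,
}

def high_confidence(rows, min_stars=2, significance=("Pathogenic",)):
    """Keep rows with at least `min_stars` stars and an accepted significance.

    Unrecognised review-status strings are treated as zero stars.
    """
    kept = []
    for row in rows:
        stars = REVIEW_STARS.get(row["ReviewStatus"].strip().lower(), 0)
        if stars >= min_stars and row["ClinicalSignificance"] in significance:
            kept.append(row)
    return kept

# Made-up sample in the assumed column layout (placeholders only).
SAMPLE = (
    "Name\tGeneSymbol\tClinicalSignificance\tReviewStatus\n"
    "variant_1\tGENE_A\tPathogenic\treviewed by expert panel\n"
    "variant_2\tGENE_A\tPathogenic\tcriteria provided, single submitter\n"
    "variant_3\tGENE_B\tBenign\tpractice guideline\n"
    "variant_4\tGENE_B\tPathogenic\tcriteria provided, multiple submitters, no conflicts\n"
    "variant_5\tGENE_C\tPathogenic\tno assertion criteria provided\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
selected = high_confidence(rows)
print([r["Name"] for r in selected])  # prints ['variant_1', 'variant_4']
```

Lowering `min_stars` trades confidence for coverage, which may be necessary for genes with few expert-reviewed records.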