Question

Classifying normal vs cancer tissues from mutational data

1

4.8 years ago

moustafa_abohawya ▴ 30

Hello all,

I am new to this kind of analysis. I have a tissue sample that should be cancerous, but we need to confirm whether it is actually tumor tissue or just normal adjacent tissue.

I have access to mutational data: SNPs, indels and deletions, along with their frequencies in my sample.

Is there a tool that can take this as input and output the probabilities of this tissue being cancer or normal, and maybe assign it to a specific type of cancer?

I might also have access to RNA-seq data if needed.

Thanks so much anyway

SNP genome cancer classification RNA-Seq • 1.1k views

1

Not an answer: it is incredibly difficult to distinguish normal tissue from cancer. As you can see in papers by e.g. Martincorena https://www.sanger.ac.uk/people/directory/martincorena-inigo many healthy human tissues carry serious cancer driver mutations that will never develop into cancer. Moreover, some truly cancerous tissues do not require any action (e.g. https://nationalinterest.org/blog/buzz/study-1-4-cancers-detected-men-were-overdiagnosed-2012-australia-118011 ), so this approach is quite dangerous.

This can be seen even more clearly in RNA-seq data, e.g. in the Gaddy Getz paper https://science.sciencemag.org/content/364/6444/eaaw0726/tab-article-info or the Muyas paper https://www.biorxiv.org/content/10.1101/687822v2

Once you know that the tissue is cancerous, you can apply methods similar to https://www.nature.com/articles/nature14221 using mutational profiles. Methylation data is also a good predictor (though I believe it is more expensive).

ADD REPLY • link 4.8 years ago by German.M.Demidov ★ 2.9k

1

This issue should be resolved before any sequencing data is generated, typically by pathologists. Since you say 'this tissue should be cancerous', you should have that checked by going back to where the tissue was processed and asking the people who processed it.

ADD REPLY • link 4.8 years ago by bruce.moran ▴ 970

0

Even that may not be enough: for paragangliomas, for example, there is no test that can distinguish a malignant tumor from a benign one.

ADD REPLY • link 4.8 years ago by German.M.Demidov ★ 2.9k

2

Yes, this is also the case with some GI polyps, for example, but the point holds that it is not really appropriate to generate sequencing data from any kind of pre-tumourous or tumourous material without first having a pathologist review it. Otherwise you waste time and resources trying to answer this kind of question afterwards.

In terms of determining tumour type from mutation profile, you could also screen for known pathogenic mutations in, for example, the COSMIC Cancer Gene Census list. But this provides no certainty that the tissue is tumour, only that some known variants appear in the tissue.
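
As a rough illustration of that kind of screen, here is a minimal Python sketch. It assumes you have already exported the gene symbols from a census-style list and parsed your variant calls into simple records. The function name `screen_against_gene_list`, the record fields and the three-gene subset are hypothetical placeholders, not part of any COSMIC tool or file format.

```python
def screen_against_gene_list(variants, census_genes):
    """Return the variants whose gene symbol appears in the census gene set.

    Matching is case-insensitive on the gene symbol only; it says nothing
    about whether the specific change is a known pathogenic one.
    """
    census = {gene.upper() for gene in census_genes}
    return [v for v in variants if v["gene"].upper() in census]


# Illustrative subset only; a real screen would load the full exported list.
census_genes = ["TP53", "KRAS", "APC"]

# Toy variant records (hypothetical field names and values).
variants = [
    {"gene": "TP53", "change": "p.R175H", "vaf": 0.31},
    {"gene": "TTN", "change": "p.A100T", "vaf": 0.48},
    {"gene": "kras", "change": "p.G12D", "vaf": 0.12},
]

hits = screen_against_gene_list(variants, census_genes)
for v in hits:
    print(v["gene"], v["change"], v["vaf"])
```

A hit here only means a variant falls in a listed gene; as said above, it is not evidence on its own that the tissue is tumour.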

Could you tell us what the tissue is? And is this a hypothetical situation (e.g. for coursework/study) or have you actually been given this type of data in a work setting?

ADD REPLY • link 4.8 years ago by bruce.moran ▴ 970

score 1 · Answer 1 · 2020-01-30

Hi,

Is the cancer you suspect a known type? If yes, it should be somewhat easier, as you could check whether any of the known driver mutations have been detected.

More importantly, though: is the mutational data you have somatic, or was it obtained from single-sample variant calling? Detecting a known driver/pathogenic mutation in somatic mode (i.e. suspect tissue compared to a control/normal tissue) makes it more likely that the 'suspect' tissue is cancerous.

If the cancer type you suspect is not known, or if it is a type with low mutation burden (so less chance of picking up any mutation), or if you do not have a good tumour purity estimate (e.g. from immunohistochemistry or H&E staining), then any one or a combination of these would make it difficult to answer your question effectively.

Annotate your variant calls with a tool like VEP, using gnomAD as the annotation resource. Read the VEP documentation for how to use the gnomAD resource as a custom annotation source. If you see variants in cancer genes that are non-polymorphic (meaning not known in gnomAD, or known only at very low frequency, say < 10^-3 or 10^-5), that could be another indication that the 'suspect' tissue is cancerous. Look for stop-gain or frameshift mutations (that pass the gnomAD filter) in known tumour suppressor genes. That would be low-hanging fruit.
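
The last filtering step can be sketched in Python as below. This is only a sketch under assumptions: it supposes the VEP output has already been parsed into records holding a gene symbol, a gnomAD allele frequency (`None` when the variant is absent from gnomAD) and a list of consequence terms. The field names, the `flag_candidates` helper, the toy records and the short tumour suppressor list are all hypothetical. `stop_gained` and `frameshift_variant` are the Sequence Ontology consequence terms VEP reports.

```python
# Consequence terms treated as likely loss-of-function in this sketch.
LOF_CONSEQUENCES = {"stop_gained", "frameshift_variant"}


def is_rare(gnomad_af, max_af=1e-3):
    """True if the variant is absent from gnomAD (None) or below max_af."""
    return gnomad_af is None or gnomad_af < max_af


def flag_candidates(variants, tumour_suppressors, max_af=1e-3):
    """Keep rare stop-gain/frameshift variants in the given gene set."""
    tsg = set(tumour_suppressors)
    flagged = []
    for v in variants:
        in_tsg = v["gene"] in tsg
        rare = is_rare(v["gnomad_af"], max_af)
        lof = bool(LOF_CONSEQUENCES & set(v["consequences"]))
        if in_tsg and rare and lof:
            flagged.append(v)
    return flagged


# Toy records with made-up frequencies, for illustration only.
variants = [
    {"id": "v1", "gene": "TP53", "gnomad_af": None, "consequences": ["stop_gained"]},
    {"id": "v2", "gene": "APC", "gnomad_af": 2e-6, "consequences": ["frameshift_variant"]},
    {"id": "v3", "gene": "TP53", "gnomad_af": None, "consequences": ["missense_variant"]},
    {"id": "v4", "gene": "BRCA2", "gnomad_af": 0.02, "consequences": ["stop_gained"]},
    {"id": "v5", "gene": "TTN", "gnomad_af": None, "consequences": ["stop_gained"]},
]

flagged = flag_candidates(variants, ["TP53", "APC", "BRCA2"])
print([v["id"] for v in flagged])
```

The `max_af` cutoff is the knob to tighten (e.g. to `1e-5`) if too many variants survive; with single-sample calls, rare germline variants will also pass this filter, so somatic calls remain preferable.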