Question: Can germline vs somatic variants be distinguished by phasing when no control is available?
amjad (Finland) wrote 3.3 years ago:

I have cancer samples that lack a germline control, and I am interested in identifying somatic variants as accurately as possible. I filtered against a panel of normals and removed variants reported in public databases. However, I still have some suspicious cases that I want to resolve, and I wonder whether phasing can be used here.

Here is an IGV snapshot of an example where three nonsynonymous mutations are detected:

https://www.dropbox.com/s/235b3nl0ukbera0/igv_panel.png?dl=0

Can we tell with confidence from phasing which of these three are somatic? If so, is there a systematic way to do such an analysis?

 


What do you think Phasing means? I thought it meant determining the strand of heterozygous alleles, in which case your answer is "no, it doesn't have anything to do with somatic vs germline".

 

Reply written 3.3 years ago by karl.stamm
donfreed (Mountain View, CA) wrote 3.3 years ago:

Yes, read-backed phasing can be used to identify subclonal (somatic) mutations from bulk tissue sequencing data, as long as the subclonal mutations are close enough to clonal heterozygous (germline) variants to be covered by the same reads.

In your example, the middle mutation (orange) and the right mutation (green) are perfectly in phase, indicating that they are present in the same clonal population (probably germline heterozygous). However, the mutation on the left (blue) does not phase perfectly with the orange mutation, indicating the presence of at least two distinct clonal populations in the bulk tissue. The blue mutation probably arose somatically.

We wrote some code to automate the identification of these variants using simple heuristics: https://bitbucket.org/donald_freed/phase-mosaic
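
As a rough illustration of the reasoning above (independent of the phase-mosaic repository, whose heuristics are not reproduced here), the Python sketch below tallies the allele pairs seen on reads that cover two nearby sites. The read representation, the function names `haplotype_counts` and `backgrounds_of_alt`, and the `min_reads` threshold are all assumptions made for this example. The toy input uses the blue/orange read counts tallied in the replies further down; reads carrying the reference allele at both sites are not reported in the thread and are left out.

```python
from collections import Counter


def haplotype_counts(reads, site_a, site_b):
    """Count (allele at site_a, allele at site_b) pairs over reads covering both sites.

    `reads` is an iterable of dicts mapping a site name to "ref" or "alt".
    This representation is hypothetical; real input would come from a BAM file.
    """
    counts = Counter()
    for read in reads:
        if site_a in read and site_b in read:
            counts[(read[site_a], read[site_b])] += 1
    return counts


def backgrounds_of_alt(counts, min_reads=2):
    """Alleles at site A seen on reads carrying alt at site B.

    Only allele pairs supported by at least `min_reads` reads are kept
    (an arbitrary threshold chosen for illustration).
    """
    return {a for (a, b), n in counts.items() if b == "alt" and n >= min_reads}


# Toy reads: blue alt + orange alt on 5 reads, blue ref + orange alt on 2 reads.
reads = (
    [{"blue": "alt", "orange": "alt"}] * 5
    + [{"blue": "ref", "orange": "alt"}] * 2
)

counts = haplotype_counts(reads, "blue", "orange")
backgrounds = backgrounds_of_alt(counts)

# If the orange alt allele sits on both a blue-alt and a blue-ref background,
# the blue mutation is present on only some of the orange-carrying chromosomes.
imperfect_phase = len(backgrounds) > 1
print(dict(counts), backgrounds, imperfect_phase)
```

With these counts the orange alternate allele is seen on both backgrounds at the blue site, which is the pattern described as imperfect phasing. Whether two supporting reads are enough is a separate question of sequencing error and the false-positive rate one is willing to accept.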

Caveat:

Using phasing, it is impossible to distinguish somatic mutations from germline or mosaic copy-number alterations.


The blue variant shows up on the same read as the orange one five times. The orange shows up with the reference allele at the blue site twice, and the blue never shows up with the reference allele at the orange site. There isn't enough information to call it anything. Maybe if the reads were paired we would have more haplotype information.

Reply written 3.3 years ago by karl.stamm

With a sequencing error rate of 1%, we would expect two of the seven reads to support mosaicism by chance only about once every ~500 sites, so I would personally feel OK calling the site mosaic. Amjad will have to decide on an acceptable false-positive rate for his own work.

Edit: combinatorics...
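
The ~500 figure can be checked with a short binomial calculation. The sketch below assumes a deliberately simple model that the thread does not spell out: each of the seven reads independently shows a discordant allele through error with probability 1%, and the event of interest is two or more such reads. The function name `prob_at_least` is made up for this example.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed model: 7 reads, 1% per-read error, at least 2 discordant reads by chance.
p_chance = prob_at_least(2, 7, 0.01)
sites_per_false_call = 1 / p_chance

print(f"P(>=2 of 7 by error) = {p_chance:.4f}")
print(f"roughly one site in {sites_per_false_call:.0f}")
```

Under this model the probability comes out near 0.002, in line with the "once every ~500 sites" estimate. A stricter model that also requires the error to produce one specific allele would give a smaller probability.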

Reply written 3.3 years ago by donfreed (modified 3.3 years ago)

Thanks for the answer. Actually we are confident about the somatic origin of the blue one because we do have a related cancer sample that doesn't show that mutation. The confusion is about the other two and it's good to know that there is no way to confirm that.

Reply written 3.3 years ago by amjad