Question

sciClone package -> mosaic copy number correction of VAFs?

9.8 years ago

schmoo ▴ 30

Dear all,

I'm using the sciClone R package and got very exciting data with it! Now I am wondering if anyone has experience in assigning non-integer values to the copy number segments in which the measured VAFs are located. Given that I am confident I can measure an accurate copy number value of 1.5 (meaning 50% of cells carry the deletion), the VAFs of variants in this segment would need to be corrected differently than VAFs in segments with CN=1 or CN=3. As far as I understand, VAFs in segments with copy numbers close to integer values are assigned those integer values, while VAFs in segments with intermediate CN (e.g. 1.5) are dropped, although they would nicely contribute to specific mutation clusters, especially if those clusters consist of very few mutations. I'd be glad to hear some opinions on that!

Best regards,

Max

sciClone copy number variation tumor heterogeneity • 4.4k views

updated 2.4 years ago by Ram 43k • written 9.8 years ago by schmoo ▴ 30

Ram · Answer 1 · 2014-07-16

With this version of sciClone, we made the explicit decision not to correct VAFs for copy number, and instead to exclude anything that is in a non-CN-neutral region. Correction is a trickier problem than it sounds once you take heterogeneity into account.

Consider the following example: I have a founding clone population at a VAF of 50%, and two subclones at 36% and 18%. Now, I find a mutation with a VAF of 12% in a region of copy number 3. Is that two copies of the mutation, present in the 18% subclone, or a single copy of the mutation, present in the 36% subclone?

There are some tricks you can do with phasing the variants to help disambiguate some of these cases, and we're working on them for the next version, but in general, it's a difficult problem.

That said, if you are highly confident that your copy number calls are accurate, you can use each segment of CN alteration to generate pseudo-VAFs. So in your example, with a highly confident CN region of 1.5, you could add it into your input file as a point with VAF 25%, since you're assuming that it's a single-copy deletion present in half of the tumor cells (a heterozygous event in 50% of cells corresponds to a VAF of 25%). If the region is large, you could insert multiple points with that same VAF, based roughly on the observed mutation rate per megabase in your tumor. We show an example of doing this in the sciClone paper, due out very soon in PLoS Computational Biology.
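To make the arithmetic behind the pseudo-VAF idea concrete, here is a minimal Python sketch. It is not part of sciClone and does not reproduce its input format. The function names (`pseudo_vaf_percent`, `n_pseudo_points`) are hypothetical, and the calculation assumes a diploid baseline, a single-copy deletion, and a pure tumor sample.

```python
def pseudo_vaf_percent(mean_cn: float) -> float:
    """Pseudo-VAF (in percent) for a single-copy deletion segment.

    Assumptions (not from sciClone): diploid baseline (CN = 2),
    one copy lost in the affected cells, and 100% tumor purity.
    """
    # A single-copy loss in a fraction f of cells gives a mean CN of 2 - f.
    cell_fraction = 2.0 - mean_cn
    if not 0.0 <= cell_fraction <= 1.0:
        raise ValueError("mean_cn is not consistent with a single-copy loss")
    # Treat the deletion like a heterozygous variant: VAF = cell fraction / 2.
    return 100.0 * cell_fraction / 2.0


def n_pseudo_points(segment_length_mb: float, mutations_per_mb: float) -> int:
    """Rough number of pseudo-points to insert for a segment.

    Scales with the observed mutation rate per megabase; always at least one.
    """
    return max(1, round(segment_length_mb * mutations_per_mb))


if __name__ == "__main__":
    vaf = pseudo_vaf_percent(1.5)        # the CN = 1.5 example from the question
    count = n_pseudo_points(10.0, 2.0)   # hypothetical 10 Mb segment, 2 mutations/Mb
    print(f"insert {count} pseudo-points at VAF {vaf:.1f}%")
```

For the CN = 1.5 segment this gives a pseudo-VAF of 25%, matching the example above. The segment length and mutation rate in the usage lines are made-up illustration values, not data from the paper.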