Question: sciClone on exome sequencing data
10 months ago by
chloe.steen120 wrote:

I have the following questions regarding sciClone and whole-exome sequencing data:

  1. Can it be used for exome sequencing data (and has anyone managed to use it with WES and gotten decent results)?
  2. Should I make any special considerations when building the VAF file from my exome sequencing results, such as criteria for variant selection?
  3. Which tool gives the best copy number prediction on exome sequencing data? I have both normal and tumor samples.
ADD COMMENTlink modified 4 months ago by Chris Miller17k • written 10 months ago by chloe.steen120

I've used sciClone on my exome data and got reasonable results, although in my case I don't have a matched normal. I don't think any special considerations are required beyond sciClone's protocol. See this discussion, where Chris has also suggested CN prediction tools.

ADD REPLYlink written 10 months ago by venu3.2k

OK, thank you for your reply. And was sciClone fast to generate output? When I tried for one sample it looked like it got stuck on the first step. I did not get any output other than "checking input data ..." for several hours. I made the VAF file from the Mutect VCF file, and I used ASCAT for copy number prediction after preprocessing the exome files, but I think I would rather use a tool made for exome sequencing data, hence my question. It is just that sciClone also recommended ASCAT.

ADD REPLYlink written 10 months ago by chloe.steen120
10 months ago by
Chris Miller17k
Washington University in St. Louis, MO
Chris Miller17k wrote:
  • Yes, we frequently use sciClone on exome sequencing data. As long as you have a reasonable number of mutations in your sample, there should be no issues.

  • There are no "special considerations" necessary, except to only include the variant calls that you believe are somatic.

  • No, sciClone should be quite fast (minutes to an hour at most). If it's taking longer than that, examine your sample a bit more closely. If you have both tens of thousands of mutations and poorly defined clusters, it will take longer. You might consider reducing the number of variants (and increasing your confidence in their position) by increasing the minimum depth variable.

  • Copy number calls can be made using the algorithm of your choice - VarScan, cn.mops, many others. I don't recall ever recommending ASCAT, except possibly to say that allele-specific assignments of CN information may be incorporated into future versions. It may work fine, but I have no personal experience running it.

ADD COMMENTlink modified 10 months ago • written 10 months ago by Chris Miller17k
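
For readers preparing the VAF input by hand, here is a minimal Python sketch of building the table from somatic read counts and dropping shallow sites before clustering. The five-column layout (chromosome, position, reference reads, variant reads, VAF on a 0-100 scale) is how I recall the sciClone documentation describing its input, so treat it as an assumption and check it against your installed version. The `Variant` type, the `to_vaf_rows` helper, and the depth cutoff of 50 are hypothetical and purely illustrative.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """One somatic call with its supporting read counts (hypothetical type)."""
    chrom: str
    pos: int
    ref_reads: int
    var_reads: int


def to_vaf_rows(variants, min_depth=50):
    """Return (chrom, pos, ref_reads, var_reads, vaf_percent) rows for
    variants whose total depth is at least min_depth."""
    rows = []
    for v in variants:
        depth = v.ref_reads + v.var_reads
        if depth < min_depth:
            continue  # too shallow: the VAF estimate would be noisy
        vaf = 100.0 * v.var_reads / depth
        rows.append((v.chrom, v.pos, v.ref_reads, v.var_reads, round(vaf, 2)))
    return rows


calls = [
    Variant("1", 12345, 80, 20),  # depth 100, kept
    Variant("2", 67890, 9, 3),    # depth 12, dropped
]
rows = to_vaf_rows(calls, min_depth=50)
```

Filtering up front like this is only one option; raising sciClone's own minimum depth setting, as suggested above, achieves the same thing inside the tool.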

Thanks Chris for the very quick response! I will try as you said, and hopefully it will work after I go through my variant calls, and use another tool for copy number prediction.

ADD REPLYlink written 10 months ago by chloe.steen120

I thought I would just include as a final comment in my post where I read about using ASCAT for WES in the Sciclone paper:

As with other tools [6], [11], [22], [30], regions of CNA and LOH are provided as inputs after having been inferred from whole-exome sequencing (WES, e.g., via ASCAT [...]

ADD REPLYlink modified 9 months ago • written 9 months ago by chloe.steen120

Does sciClone work on targeted panel samples as well?

ADD REPLYlink written 4 months ago by always_learning680

Yes, but the more variants you have, the better you'll be able to define the clusters.

ADD REPLYlink written 4 months ago by Chris Miller17k
9 months ago by
chloe.steen120 wrote:

New question:

Does sciClone calculate/take into account tumor percentage when calculating VAF? Is there a way to give it as input?

The reason I ask is that my VAF plots are not scaled to tumor percentage. It looks as though the VAF of a subclone increases between primary and relapse, but once scaled to tumor percentage it might actually be the same VAF at the two timepoints, with one biopsy simply containing more tumor cells than the other.

I have information on tumor percentage for my samples, and I could of course upscale the VAFs myself, but I would rather have the nice sciClone plots, and I am also curious how sciClone handles tumor percentage.

ADD COMMENTlink written 9 months ago by chloe.steen120

The short answer is that it doesn't. There are several options for dealing with impure tumors:

  • zoom in by using the "xlim" and "ylim" params on sc.plot2d() to set the maximum VAF of the plot

  • make sure your copy number data is scaled appropriately (preferably during calling), or alter the copyNumberMargins parameter (otherwise you may not be excluding all CN events)

ADD REPLYlink modified 9 months ago • written 9 months ago by Chris Miller17k

I see that in the paper you published in Nature Communication you write the following:

Variant allele fractions of all tier 1 variants were corrected for purity by reducing the number of reference-supporting reads in proportion to the purity of the sample. This effectively scales up the VAFs in such a way that founding clone variants are near 50% VAF.

I couldn't find the code that actually did that, so I guess it was part of the pre-processing, but to be sure that I understand correctly, I thought I would give an example:

If you have 60 reference-supporting reads and a tumor purity of 70% (meaning 70% tumor and 30% normal), do you multiply the number of reference reads by 0.70 or by 0.30?

ADD REPLYlink written 4 months ago by chloe.steen120

If you want to account for purity, it's simple - just alter the VAFs (but leave the readcounts alone! They carry information about the amount of error). So if your observed VAF is 20% and your purity is 0.7, then your corrected VAF is 20/0.7 = 28.57%. Or if it's 25% VAF and 0.5 purity, you end up with 50%, just as you'd expect.

ADD REPLYlink written 4 months ago by Chris Miller17k
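
The correction described above is a single division, and it can be scripted as in the minimal Python sketch below. The function name `correct_vaf_for_purity` is hypothetical; the only thing taken from the thread is the rule itself (divide the observed VAF by the purity and leave the read counts untouched).

```python
def correct_vaf_for_purity(vaf_percent, purity):
    """Scale an observed VAF (0-100) by tumor purity (a fraction in (0, 1]).

    Only the VAF is adjusted; read counts are deliberately not touched.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be a fraction in (0, 1]")
    return vaf_percent / purity


# The two worked examples from the reply above.
print(round(correct_vaf_for_purity(20, 0.7), 2))
print(correct_vaf_for_purity(25, 0.5))
```

Note that nothing here caps the result at 50%, so noisy sites or sites in unflagged CN/LOH regions can still come out higher.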

Ok, that's what I did, but I was unsure whether it was the right thing to do because I am still getting VAFs larger than 50%. I was expecting VAFs of 50% at most.

Do you know what that can be due to? I am thinking that maybe there might be regions of LOH that were not included in my exclude.loh file. Do you think sciClone gives more accurate results if I exclude those variants with VAF > 50%?

ADD REPLYlink written 4 months ago by chloe.steen120

Some VAFs may be a bit above 50% due to sampling error, but if there are a large number of them in a cluster clearly above 50%, then yes, there is probably CN or LOH involved. Check your CN and LOH calls, and don't forget to scale them for purity as well.

ADD REPLYlink written 4 months ago by Chris Miller17k
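
One common way to scale copy number calls for purity is the linear mixing model sketched below in Python. This model is an assumption on my part, not something stated in the thread: it treats the observed copy number as a purity-weighted average of the tumor copy number and a diploid normal, and solves for the tumor value. The function name `tumor_copy_number` is hypothetical.

```python
def tumor_copy_number(observed_cn, purity, normal_cn=2.0):
    """Estimate tumor copy number from an observed (mixed) copy number.

    Assumed model: observed = purity * tumor + (1 - purity) * normal_cn.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be a fraction in (0, 1]")
    return (observed_cn - (1.0 - purity) * normal_cn) / purity


# An observed CN of 2.7 in a 70% pure sample corresponds to about 3 copies
# in the tumor cells under this model.
print(round(tumor_copy_number(2.7, 0.7), 2))
```

Under this model a true single-copy gain in an impure sample shows up as a much smaller shift, which is why unscaled calls can slip inside the margin treated as copy-number neutral.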

I still get some VAFs above 50%, and I think they occur in regions with no CN calls, which sciClone assumes to have a copy number of 2. One way to exclude them would be to add these regions to the exclude file. But since sciClone already detects those regions, wouldn't it be easier to add an option "excludeRegionsWithMissingCN" that can be set to TRUE when calling sciClone?

ADD REPLYlink written 3 months ago by chloe.steen120
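
To see which variants fall outside every copy number segment, a small lookup such as the Python sketch below may help. It is an illustration only: the `variants_without_cn` helper and the tuple layouts are hypothetical, segment coordinates are assumed to be inclusive, and the output is meant for inspecting the calls, not as a recommendation to drop those variants.

```python
def variants_without_cn(variants, segments):
    """Return the (chrom, pos) variants not covered by any CN segment.

    variants: iterable of (chrom, pos)
    segments: iterable of (chrom, start, end), with inclusive coordinates
    """
    by_chrom = {}
    for chrom, start, end in segments:
        by_chrom.setdefault(chrom, []).append((start, end))

    missing = []
    for chrom, pos in variants:
        covered = any(start <= pos <= end for start, end in by_chrom.get(chrom, []))
        if not covered:
            missing.append((chrom, pos))
    return missing


segments = [("1", 1, 1000), ("2", 500, 900)]
variants = [("1", 500), ("1", 2000), ("3", 10)]
uncovered = variants_without_cn(variants, segments)
```

The linear scan per variant is fine for exome-sized inputs; an interval tree would only matter for much larger call sets.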

CN regions leave symmetrical VAFs. So for every cluster at 66% (from 3x copy number), there's a corresponding region at 33% (from the strand that didn't get amplified). If you just exclude the points in that 66% cluster (because they're > 50%) then you leave the others behind and may infer a subclone where there is actually none.

The right answer is to go back and improve your CN calls. As with most algorithms, garbage in garbage out. It would be nice to add joint calling of CN and clonality, but that's a more difficult problem that I haven't had time to tackle. There are other packages out there that attempt to do so, but I can't offer specific recommendations, as I haven't benchmarked them.

ADD REPLYlink written 3 months ago by Chris Miller17k
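
The symmetry described above can be checked with a few lines of Python. The sketch assumes a variant present in every tumor cell and a diploid normal contaminant, which is a standard simplification and not something the thread spells out; the function name `expected_vaf` is hypothetical.

```python
def expected_vaf(mutated_copies, total_cn, purity=1.0, normal_cn=2):
    """Expected VAF (0-100) of a clonal variant carried on `mutated_copies`
    of the `total_cn` copies in each tumor cell, mixed with normal cells."""
    tumor_alleles = purity * total_cn
    normal_alleles = (1.0 - purity) * normal_cn
    return 100.0 * purity * mutated_copies / (tumor_alleles + normal_alleles)


# A 3-copy region in a pure tumor: variants on the amplified allele sit near
# 66.7%, variants on the other allele near 33.3%.
print(round(expected_vaf(2, 3), 1))
print(round(expected_vaf(1, 3), 1))
```

Dropping only the points above 50% would leave the companion cluster near 33% behind, which is exactly the spurious "subclone" the reply warns about.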

So sciClone can handle CN gain regions internally as long as one feeds it a segmentation file output by programs such as VarScan or cn.mops, but LOH regions need to be specified explicitly in a separate LOH.exclude file?

ADD REPLYlink written 3 months ago by tangming20051.8k

Yes, sciClone will exclude CN regions given to it in that file. If there are LOH regions that are copy number neutral (CN2), then those will need to be excluded separately.

ADD REPLYlink written 3 months ago by Chris Miller17k