Help with "sciClone" process and output
5.6 years ago
venu 6.8k

Dear all,

I need to infer subclonality from the exome data of a cancer cell line, and I am using sciClone for this. From previous threads I learned that the tool can work with VAF information alone (without copy number information). However, in the intermediate file created during the workflow, I can see that it has assigned 'copy number 2' to every variant. I would like to know:

• Based on what information did it assign a copy number of 2 to every variant?
• Is it a good idea to infer subclonality from only the novel variants of a cell line (after removing all variants found in public databases, viz. dbSNP, 1KG, COSMIC)?
5.6 years ago
• If you don't provide copy number information, sciClone has no way of knowing whether variants are in copy-number-altered regions (and, as a result, have altered VAFs). For best results, provide CN information. For WGS, you can use copyCat; for exomes, VarScan2 works well for CN calling.

• If you're calling somatic variants from a tumor sample with a matched normal, then there's no real reason to exclude dbSNP variants (unless you suspect that they're false positives or artifacts). If you do not have a matched normal, you're going to be unable to determine which variants are somatic and which are rare germline events, and you will likely be unable to do good clonal inference.
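To make the first point concrete, here is a minimal Python sketch of how copy number shifts the expected VAF. It is not part of sciClone; the function name `expected_vaf` and its parameters are hypothetical, and it assumes the commonly used mixture model in which normal cells carry two copies and no mutant alleles.

```python
def expected_vaf(purity, tumor_cn, mutated_copies, ccf=1.0, normal_cn=2):
    """Expected VAF of a somatic variant under a simple mixture model.

    Assumptions (illustrative, not taken from sciClone):
      purity         - fraction of cells in the sample that are tumor cells
      tumor_cn       - total copy number at the locus in tumor cells
      mutated_copies - number of those copies carrying the variant
      ccf            - fraction of tumor cells that carry the variant
      normal_cn      - copy number in contaminating normal cells
    """
    if not 0.0 <= purity <= 1.0 or not 0.0 <= ccf <= 1.0:
        raise ValueError("purity and ccf must be between 0 and 1")
    if mutated_copies > tumor_cn:
        raise ValueError("mutated_copies cannot exceed tumor_cn")
    # Mutant alleles come only from tumor cells that carry the variant.
    mutant_alleles = purity * ccf * mutated_copies
    # All alleles at the locus, from tumor and normal cells together.
    total_alleles = purity * tumor_cn + (1.0 - purity) * normal_cn
    if total_alleles == 0:
        raise ValueError("no alleles at this locus")
    return mutant_alleles / total_alleles


if __name__ == "__main__":
    # Same clonal variant, pure sample, different copy-number states.
    for cn, m in [(2, 1), (3, 1), (3, 2), (1, 1)]:
        print(f"CN={cn}, mutated copies={m}: VAF={expected_vaf(1.0, cn, m):.3f}")
```

Under these assumptions a clonal variant sits at 50% in a copy-neutral region but at 33% or 67% in a three-copy region, so treating every site as CN 2 can make copy-number effects look like separate clusters.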

Thanks Chris. I do not have a matched normal for the cell line (a big obstacle). I've taken all the precautions recommended by the forum and other resources for cases without a matched normal, so I've removed common variants found in the databases mentioned above. However, the sciClone results correlate with the experimental results from the lab (this time I'll provide CN info and confirm). I'm stuck on what you said about 'which variants are somatic and which are rare germline events'. Is there any way to resolve this problem?

Nope. If you've filtered for population variants and such, then you've done about all you can do.

This is a rare case where a very impure tumor might actually help. With sufficient depth of sequencing, you might be able to distinguish the peak of germline variants at 50% from the somatic variants at much lower frequencies. It's going to be hit or miss on whether that will work, though.
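As a rough illustration of that idea, the Python sketch below asks whether the germline peak and a clonal somatic peak would be far enough apart at a given depth. Everything here is an assumption made for illustration: copy-neutral diploid regions, heterozygous variants, pure binomial sampling noise, and an arbitrary three-standard-deviation rule. The function names are hypothetical and this is not how sciClone clusters variants.

```python
import math

# Heterozygous germline variants are present in every cell, so their
# expected VAF stays at 0.5 regardless of tumor purity (copy-neutral sites).
GERMLINE_HET_VAF = 0.5


def somatic_het_vaf(purity):
    """Expected VAF of a clonal heterozygous somatic variant (copy-neutral)."""
    return purity / 2.0


def binomial_sd(vaf, depth):
    """Standard deviation of an observed VAF under binomial read sampling."""
    return math.sqrt(vaf * (1.0 - vaf) / depth)


def peaks_separable(purity, depth, n_sd=3.0):
    """True if the two peaks are more than n_sd pooled spreads apart.

    The n_sd threshold is an arbitrary illustrative choice.
    """
    somatic = somatic_het_vaf(purity)
    gap = GERMLINE_HET_VAF - somatic
    spread = n_sd * (binomial_sd(GERMLINE_HET_VAF, depth)
                     + binomial_sd(somatic, depth))
    return gap > spread


if __name__ == "__main__":
    for purity, depth in [(0.3, 200), (0.3, 20), (0.9, 200), (1.0, 200)]:
        print(f"purity={purity}, depth={depth}: "
              f"separable={peaks_separable(purity, depth)}")
```

In this toy model a 30% pure sample at 200x keeps the two peaks apart, while low depth or high purity merges them. At a purity of 1.0 the expected somatic and germline peaks coincide, so the trick gives no separation there.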