I need to find subclonality information for cancer cell line exome data, and I am using sciClone for this. From previous threads I learned that this tool can work with VAF information alone (without copy number information). However, in the intermediate file created during the workflow, I can see that it has assigned 'copy number 2' to every variant. I would like to know:
- Based on what information did it assign a copy number of 2 to every variant?
- Is it a good idea to infer subclonality from only the novel variants of a cell line (after removing all variants found in public databases, viz. dbSNP, 1KG, COSMIC)?
Thanks Chris. I do not have a matched normal for the cell line (a big obstacle). I've taken all the precautions that the forum and other resources suggest when there is no matched normal, so I've removed the common variants found in the above-mentioned databases. However, the sciClone results do correlate with the experimental results from the lab (this time I'll provide CN info and confirm). I'm stuck on what you said about 'which variants are somatic and which are rare germline events'. Is there any way to resolve this problem?
Nope. If you've filtered for population variants and such, then you've done about all you can do.
This is a rare case where a very impure tumor might actually help. With sufficient sequencing depth, you might be able to distinguish the peak of heterozygous germline variants at 50% VAF from the somatic variants at much lower frequencies. It's going to be hit or miss whether that will work, though.
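To make the idea concrete, here is a minimal Python sketch, not part of sciClone. It assumes copy-neutral diploid regions, where a heterozygous germline variant is expected at a VAF of 0.5 regardless of purity, while a clonal heterozygous somatic variant is expected at roughly purity / 2. The function names, the `alpha` cutoff, and the read counts are all hypothetical and chosen only for illustration.

```python
from math import exp, lgamma, log


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k alt reads out of n total reads.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Probabilities are computed in log space so that high read
    depths do not overflow.
    """
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))

    pmf = [exp(log_pmf(i)) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def expected_somatic_vaf(purity):
    """Expected VAF of a clonal heterozygous somatic variant (copy number 2)."""
    return purity / 2.0


def label_variants(variants, alpha=0.01):
    """Label (alt_reads, depth) pairs by consistency with a 50% germline peak.

    alpha is an arbitrary illustrative cutoff, not a recommended value.
    """
    labels = []
    for alt, depth in variants:
        if binom_two_sided_p(alt, depth, 0.5) >= alpha:
            labels.append("germline-like")   # cannot be told apart from 50%
        elif alt / depth < 0.5:
            labels.append("somatic-like")    # clearly below the germline peak
        else:
            labels.append("other")           # e.g. homozygous or LOH
    return labels


# Made-up read counts at 200x depth, for illustration only.
example = [(98, 200), (30, 200), (195, 200)]
print(label_variants(example))
print(expected_somatic_vaf(0.3))
```

Note that `expected_somatic_vaf(1.0)` is 0.5, so in a pure sample the clonal somatic peak sits on top of the germline peak and this kind of separation has nothing to work with. Subclonal variants, copy number changes, and low depth all blur the picture further, which is why the approach is hit or miss.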