Question

Detecting somatic variants in non-tumor tissue without normals

5 months ago

manfi ▴ 20

I'm new to genome analysis, so apologies if this is a basic question. My hypothesis is that in a particular disease, gene XYZ contains somatic mutations. I have performed targeted sequencing of gene XYZ at high coverage (300x to several thousand x) using SureSelect, which employs UMIs. I have sequenced the gene in brain tissue from diseased and non-diseased individuals. Unfortunately, only brain tissue is available, so a brain vs. non-brain comparison isn't possible.

Now I'm wondering how I should perform the somatic variant analysis. As a start, I wanted to run Mutect2 and VarScan in tumor-only mode (I'm not working with tumors, but the situation is analogous) on individual samples and only "call" a variant if both tools report it, to introduce some stringency. However, I've read in online discussions that tumor-only analyses without normals give poor results and are discouraged.
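The "both tools" rule amounts to intersecting the two call sets. Below is a minimal sketch in plain Python, assuming each caller's output is available as VCF text lines and that both files have been normalized the same way beforehand (for example with `bcftools norm`), so that identical variants have identical CHROM/POS/REF/ALT. The function name `variant_keys` and the toy records are made up for illustration.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF text lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # one key per ALT allele
            keys.add((chrom, int(pos), ref, alt))
    return keys


# Toy records, invented for illustration (not real calls).
mutect2_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t.\tPASS\t.",
    "chr1\t2000\t.\tC\tT\t.\tPASS\t.",
]
varscan_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG,T\t.\tPASS\t.",
    "chr1\t3000\t.\tG\tA\t.\tPASS\t.",
]

# Keep only variants reported by both callers.
consensus = variant_keys(mutect2_lines) & variant_keys(varscan_lines)
print(sorted(consensus))
```

With real data the lines would come from the two VCF files of the same sample; this sketch ignores the FILTER column, which one would probably want to check as well.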

In theory, I could use my non-diseased samples as normals. However, I don't even know whether gene XYZ is normal in the normal samples - maybe it can be randomly mutated in both and thus not associated with my disease. Also, I'm not sure that it's a good approach to match diseased individual A with healthy individual B; my understanding is that one usually matches tumor from individual A with healthy tissue from individual A to take germline mutations into account.

In summary, these are my questions:

Is it advisable to run the tumor-only mode on every sample individually?
Is it admissible to use different individuals as matched normals?
Alternatively, could I build my own panel of normals (PoN) using data from non-diseased individuals? I only have about 10 non-diseased samples, so this might not be a great PoN.

I'm also very open to other ideas. Thanks!

Edit: I've found a paper that lists a few tools for non-cancer data without matched controls (Huang and Lee 2022).

somatic brain tumor • 421 views

3 months ago by manfi ▴ 20

score 3 · Accepted Answer · 2023-11-20

Yes. In general, if you expect variants in a subset (subclone) of the cells you sequenced, you can use somatic variant callers to detect them. However, don't expect to confidently detect variants present in fewer than 1 in 100 cells using standard Illumina DNA-seq unless you tagged your libraries with unique molecular identifiers (UMIs) before sequencing.
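To get a feel for the "1 in 100 cells" limit, here is a rough back-of-the-envelope sketch. It assumes a heterozygous variant at a diploid locus (so VAF is half the mutant cell fraction), treats the depth as the number of independent molecules (e.g. after UMI collapsing), and ignores sequencing error entirely. The requirement of at least 5 supporting reads is an arbitrary illustrative threshold, not a caller default.

```python
from math import comb


def prob_at_least(k, depth, vaf):
    """P(X >= k) for X ~ Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


cell_fraction = 0.01     # 1 in 100 cells carry the variant
vaf = cell_fraction / 2  # assumption: heterozygous, diploid locus
min_alt_reads = 5        # assumption: illustrative evidence threshold

for depth in (300, 1000, 3000):
    expected_alt = vaf * depth
    p = prob_at_least(min_alt_reads, depth, vaf)
    print(f"depth={depth}  expected alt reads={expected_alt:.1f}  P(>= {min_alt_reads})={p:.3f}")
```

At the low end of the coverage range only one or two supporting reads are expected, which is hard to separate from errors without UMIs; at several thousand x the expected count is much more comfortable.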

A matched normal from the same patient only improves specificity (and that extra specificity is what lets you loosen thresholds to gain sensitivity). You only have one gene, so don't worry too much about specificity right now. Try Mutect2 or VarDict in tumor-only mode with default settings first. If you don't detect any variants, make the cutoffs more lenient (e.g. lower the thresholds for VAF, DP, QUAL, etc.). If you find too many, you'll need to read up on "somatic variant filtering strategies" to shortlist the variants in ways that make sense for your hypothesis. This includes building a PoN from your non-diseased specimens.
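As a rough illustration of the filtering idea, the sketch below applies VAF and depth cutoffs and drops sites seen in a simple home-made PoN. It assumes the calls have already been parsed into dictionaries; the field names, thresholds and toy records are all hypothetical. This is not how GATK builds its PoN (it has its own `CreateSomaticPanelOfNormals` tool), just the underlying logic.

```python
from collections import Counter


def build_pon(normal_callsets, min_samples=2):
    """Sites called in at least `min_samples` non-diseased samples."""
    counts = Counter(site for calls in normal_callsets for site in set(calls))
    return {site for site, n in counts.items() if n >= min_samples}


def filter_calls(calls, pon, min_vaf=0.01, min_depth=100, min_alt_reads=5):
    """Keep calls that pass the cutoffs and are absent from the PoN."""
    kept = []
    for c in calls:
        site = (c["chrom"], c["pos"], c["ref"], c["alt"])
        vaf = c["alt_reads"] / c["depth"] if c["depth"] else 0.0
        if site in pon:
            continue  # recurrent in non-diseased samples: likely artifact or germline
        if c["depth"] < min_depth or c["alt_reads"] < min_alt_reads or vaf < min_vaf:
            continue  # too little evidence under the chosen cutoffs
        kept.append(site)
    return kept


# Toy data, invented for illustration.
normals = [
    [("chr1", 1000, "A", "G")],
    [("chr1", 1000, "A", "G"), ("chr1", 1500, "T", "C")],
]
diseased_calls = [
    {"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G", "depth": 800, "alt_reads": 40},
    {"chrom": "chr1", "pos": 2000, "ref": "C", "alt": "T", "depth": 900, "alt_reads": 18},
    {"chrom": "chr1", "pos": 2500, "ref": "G", "alt": "A", "depth": 60, "alt_reads": 6},
]

pon = build_pon(normals)
print(filter_calls(diseased_calls, pon))
```

Loosening or tightening `min_vaf`, `min_depth` and `min_alt_reads` corresponds to the "more lenient cutoffs" step; with only about 10 non-diseased samples, `min_samples` would need to stay small.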