How To Deal With Wild-Type Versus No-Calls In Exomes?
10.2 years ago
Chris Cole ▴ 800

In whole exome variant calls you normally only get records for positions that differ from the reference - invariant (wild-type) positions are only inferred from the absence of a call. This is fine when looking at a single sample. However, when comparing many samples - in a trio study or a larger cohort analysis - it is not obvious whether a missing call means the sample is truly homozygous reference at that position or there was simply no data.

This causes a problem when, say, a patient with a condition has a mutant allele at a position but their parents have no calls there. Either both parents are WT at that position and the patient has a spontaneous (de novo) mutation, or the parents have low/no read coverage there and so no call could be made. In the latter case it still looks like the patient has a spontaneous mutation, but one or both parents may actually carry the allele, which would make it a false-positive candidate mutation in the patient.

The only way I see of resolving this is to go back to the BAM files and, for every position where the patient has a candidate mutation, check the read depth in the samples that have no call there.
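A minimal sketch of that depth check, assuming per-position depths for each parent have already been exported from the BAM as tab-separated `chrom`, `pos`, `depth` lines (the layout `samtools depth` writes for a single BAM). The function names, the example data and the `min_depth=10` cutoff are hypothetical and only there for illustration.

```python
def load_depths(lines):
    """Parse 'chrom<TAB>pos<TAB>depth' lines into a {(chrom, pos): depth} dict."""
    depths = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chrom, pos, depth = line.split("\t")[:3]
        depths[(chrom, int(pos))] = int(depth)
    return depths


def classify_parent_sites(candidates, parent_depths, min_depth=10):
    """Label each candidate site as 'covered' or 'no_data' in one parent.

    Positions missing from the depth table are treated as depth 0.
    min_depth is an assumed cutoff, not a recommended value.
    """
    labels = {}
    for site in candidates:
        depth = parent_depths.get(site, 0)
        labels[site] = "covered" if depth >= min_depth else "no_data"
    return labels


# Hypothetical data: candidate sites from the patient, one parent's depth table.
mother_depth_lines = ["chr1\t1000\t35", "chr1\t2000\t2"]
candidates = [("chr1", 1000), ("chr1", 2000), ("chr2", 500)]
labels = classify_parent_sites(candidates, load_depths(mother_depth_lines))
print(labels)
```

Only sites labelled `covered` in both parents can support a wild-type inference. A `no_data` site leaves the candidate unresolved rather than confirming a spontaneous mutation.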

Is anyone else looking at this? Or are there tools that do this already? Or do people just rely on confirmation with Sanger seq?

I'd be grateful for any comments or suggestions. Thanks.

exome human variant calling • 2.3k views
10.2 years ago
User 59 13k

There are genotypers that do multi-sample SNP calling, GATK being a notable example. If that's not your cup of tea, get your genotyper to emit calls at all sites, not just the variant ones.
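Once every site is emitted, the wild-type versus no-call distinction is visible in the genotype field itself. Here is a small sketch of reading it, assuming standard VCF `GT` strings where `0` is the reference allele and `.` is a missing allele. The function name `call_status` is hypothetical, and treating partially missing genotypes as no-calls is a conservative choice made for this example.

```python
import re


def call_status(gt):
    """Classify a VCF GT string as 'no_call', 'hom_ref' or 'variant'."""
    # Alleles are separated by '/' (unphased) or '|' (phased).
    alleles = re.split(r"[/|]", gt)
    if any(allele == "." for allele in alleles):
        return "no_call"   # includes partially missing genotypes such as './1'
    if all(allele == "0" for allele in alleles):
        return "hom_ref"   # an explicit wild-type call
    return "variant"


for gt in ("0/0", "./.", "0/1", "1|1"):
    print(gt, call_status(gt))
```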

I wouldn't do trio analysis without having well characterised reference positions in all three samples (as well as having a genotype quality and coverage threshold set). And assessing any exome sample for lack of coverage in a targeted capture region is good practice.
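As a rough illustration of that rule, the sketch below only reports a candidate spontaneous mutation when both parents have a confident homozygous-reference call. The `Call` class, the status labels and the thresholds (`min_dp=10`, `min_gq=20`) are assumptions for the example, not recommended values, and the logic ignores which alternate allele is carried at multi-allelic sites.

```python
from dataclasses import dataclass


@dataclass
class Call:
    gt: str  # VCF-style genotype, e.g. "0/0", "0/1", "./."
    dp: int  # read depth at the site
    gq: int  # genotype quality


def is_confident(call, min_dp, min_gq):
    """True if the genotype is fully called and passes both thresholds."""
    return "." not in call.gt and call.dp >= min_dp and call.gq >= min_gq


def is_hom_ref(call):
    """True if every allele in the genotype is the reference allele."""
    return set(call.gt.replace("|", "/").split("/")) == {"0"}


def de_novo_status(child, mother, father, min_dp=10, min_gq=20):
    """Classify one site in a trio; thresholds are assumed example values."""
    if not is_confident(child, min_dp, min_gq) or is_hom_ref(child):
        return "not_candidate"
    parents = (mother, father)
    if not all(is_confident(p, min_dp, min_gq) for p in parents):
        return "unresolved"          # low coverage or quality in a parent
    if all(is_hom_ref(p) for p in parents):
        return "candidate_de_novo"   # both parents confidently wild-type
    return "parent_non_ref"          # at least one parent carries a variant


child = Call("0/1", 40, 99)
mother = Call("0/0", 30, 60)
print(de_novo_status(child, mother, Call("0/0", 3, 10)))
print(de_novo_status(child, mother, Call("0/0", 25, 50)))
```

The first call prints `unresolved` because the father's depth and quality are below the thresholds. The second prints `candidate_de_novo`.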


Thanks for the reply.

Could you elaborate a bit on what you mean by 'well characterised reference positions', please? Do you mean known genotypes at certain loci for all three samples? Or something else?


Known genotypes of good quality and coverage were what I was driving at, regardless of the actual genotype, and yes, in all 3 samples.
