I was comparing the percentage of FLT3 mutations in AML from TCGA data with the mutations found in our exome data. Our exome data shows less than half as many FLT3 mutations as the TCGA samples. Does anyone know what capture method the TCGA consortium used for AML samples? Is there any reason why we would miss half of the FLT3 mutations in our cohort compared to TCGA? Also, what could be the reasons that not all FLT3 mutations can be detected by exome methods?
tl;dr: You should definitely not expect to recover all FLT3 mutations from exome sequencing and stock variant calling.
I was the lead analyst for the genomics in the AML TCGA paper. Finding all of those FLT3 ITDs was a giant pain in the ass. At the very least, you should also use the targeted validation data. Even then, I wouldn't expect to find everything. We did some targeted 454, and even some Sanger, I think. Even if you have good coverage of the region, variant callers generally suck at finding ITDs, because of the short reads and repetitive sequence. Many required manual inspection to resolve.
There are some additional tips in this previous thread: "Identifying FLT3-ITD with Pindel". Specifically, I'd have someone sit down with every sample and pull up that region, looking carefully for soft-clipping, etc., that will help you identify the event. Good luck!
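If you want something to help triage samples before the manual review, here is a rough sketch (not what we ran for the paper) that tallies where soft-clipped reads pile up in a region. It assumes you have already extracted the reads as SAM text; the function name `clip_breakpoints` and the `min_clip` threshold are made up for illustration. A cluster of soft-clip breakpoints at the same position is only a hint that the sample deserves a closer look, not a call.

```python
import re
from collections import Counter

# Matches one CIGAR operation, e.g. "20S" or "30M".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def clip_breakpoints(sam_lines, min_clip=5):
    """Count soft-clip breakpoints per reference position.

    sam_lines: iterable of SAM text records (header lines are skipped).
    min_clip:  hypothetical threshold; shorter soft clips are ignored.
    Returns a Counter mapping 1-based reference position -> number of reads
    whose soft clip starts or ends there.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read or no alignment
        # Hard clips carry no sequence, so drop them before checking the ends.
        ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
        if not ops:
            continue
        # Leading soft clip: breakpoint is the first aligned base (POS).
        if ops[0][1] == "S" and ops[0][0] >= min_clip:
            counts[pos] += 1
        # Trailing soft clip: breakpoint is the last aligned base.
        if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
            ref_len = sum(n for n, op in ops if op in "MDN=X")
            counts[pos + ref_len - 1] += 1
    return counts
```

One way to feed it is to pipe the output of `samtools view` for your FLT3 region into the script and pass `sys.stdin` as `sam_lines`, then check `counts.most_common(10)` for positions supported by many reads. You still need to open those samples in a viewer to confirm what is actually going on.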