Does Whole Exome Sequencing Include the Mitochondrial Genome?
2.6 years ago
DNAngel

I have exome sequencing data and, as far as I know, it should include all exonic regions, including those of the mitochondrial genome. When I BLAST my sequences to recover mitochondrial protein-coding genes, I get a ton of intermittent stop codons and a lot of gaps, which I generally do not find when extracting nuclear genes. The literature tells me exome capture includes all exonic regions, so I am wondering whether this is just a coverage problem or something else. I don't generally work with mitogenomes, but after extracting the popular COI gene exons and aligning everything to the reference sequence, the columns of stop codons make no sense to me.

Why don't you check the coverage of the mitochondrial genes, counting only reads with a reasonable, non-multimapping MAPQ (> 20 or so), to see whether those genes are included?
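A minimal sketch of that coverage check, assuming the alignments are available as SAM text lines and that the mitochondrial contig is named `chrM` (it may be `MT` in your reference). The function name `contig_depth` and the default threshold are illustrative, not part of any tool.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Skip unmapped (0x4), secondary (0x100), duplicate (0x400), supplementary (0x800)
SKIP_FLAGS = 0x4 | 0x100 | 0x400 | 0x800

def contig_depth(sam_lines, contig="chrM", min_mapq=20):
    """Per-position depth (1-based) on `contig` from reads with MAPQ > min_mapq."""
    depth = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag, rname = int(fields[1]), fields[2]
        pos, mapq, cigar = int(fields[3]), int(fields[4]), fields[5]
        if rname != contig or flag & SKIP_FLAGS or mapq <= min_mapq or cigar == "*":
            continue
        ref_pos = pos
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op in "M=X":               # aligned bases: count and advance
                for p in range(ref_pos, ref_pos + length):
                    depth[p] += 1
                ref_pos += length
            elif op in "DN":              # deletion/skip: advance without counting
                ref_pos += length
            # I, S, H, P do not consume reference positions
    return depth
```

The input lines could come from something like `samtools view sample.bam chrM` (with `sample.bam` standing in for your own file); `samtools depth` reports comparable numbers directly if you would rather not parse SAM yourself.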

As you probably know, there are different kits for WES, so please be as specific as possible. For each kit the design files should be available, although some will be easier to find than others. In general I would expect those genes to be included, although targeting them might lead to severely unbalanced coverage.
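Once you have a kit's design file, one way to check it is to filter its target intervals for mitochondrial contigs. The sketch below assumes the design file is in BED format and that the mitochondrial contig uses one of a few common names; `mito_targets`, `total_bases`, and the name set are hypothetical helpers for illustration.

```python
# Common names for the mitochondrial contig; adjust to match your reference.
MITO_NAMES = {"chrM", "chrMT", "MT", "M"}

def mito_targets(bed_lines, mito_names=MITO_NAMES):
    """Return (chrom, start, end, name) for BED intervals on a mitochondrial contig."""
    targets = []
    for line in bed_lines:
        line = line.strip()
        # Skip blanks, comments, and UCSC-style track/browser lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom in mito_names:
            name = fields[3] if len(fields) > 3 else ""
            targets.append((chrom, start, end, name))
    return targets

def total_bases(targets):
    """Total targeted bases; BED intervals are 0-based and half-open."""
    return sum(end - start for _, start, end, _ in targets)
```

An empty result suggests the kit does not deliberately target mitochondrial genes, in which case any mitochondrial reads in the data would be off-target.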

2.6 years ago
h.mon

As noted by WouterDeCoster, you should state the capture kit (and search for its documentation), as different kits will include different capture probes. Illumina exome kits apparently include mitochondrial genes.

What you may be observing is the presence of numts (nuclear copies of mtDNA); see “COI-like” Sequences Are Becoming Problematic in Molecular Systematic and DNA Barcoding Studies for an introduction to numts. Because numts are generally non-functional, they can accumulate stop codons and indels, which would be consistent with what you describe. The 1000 Genomes project (A global reference for human genetic variation) found that, on average, a typical human genome has 4 numts.

2.6 years ago

The short answer is no. I don’t believe most exome kits intentionally target mitochondrial sequence, because if they did there would be mt variants in projects like ExAC, but there aren’t. Maybe that’s a new development for Illumina kits. I looked at the TruSeq Exome Targeted Regions Manifest v1.2 (BED Format), and I did see these regions:

Mitochondrial genomes outnumber nuclear genomes something like 1000 to 1, but the chemistry must be sufficiently different to keep these sequences out of exome kits, because we’re only talking about 13 protein-coding genes anyway.

If you are dead set on exomes, there are off-target reads (accidents) which can tell you a lot.