8.0 years ago by
Vancouver, BC, Canada
I would argue that no, you cannot (reliably) analyze anything that aligns outside the targeted regions in exon capture data. If you see a dense region of coverage outside a targeted region, one of two things has happened:
a) the reads really came from there and they were captured due to off-target binding
b) the reads really came from somewhere else and were misaligned.
The makers of the exon capture kit took great care to reduce the chance of off-target binding, so I would expect that almost everything you see will be the result of misalignment. Because the aligner is just following a set of rules, you will often get large collections of reads that have all been systematically aligned to the wrong place. Furthermore, due to differences between the human reference genome and the actual genome of the individual that was sequenced, it can be impossible to tell whether a read was properly aligned or not.
That being said, I have noticed that the coverage often spills out ~100 bp on either side of the bounds of the officially targeted region, so if your promoter of interest is very close to a targeted region, you may be in luck.
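If you want a quick check of whether your promoter falls inside that spill-over zone, something like the sketch below would do. It is only an illustration: the function name `covered_fraction`, the coordinates, and the 0-based half-open (BED-style) convention are my assumptions, and the 100 bp padding is just the rough figure I mentioned, not a property of any particular kit.

```python
def covered_fraction(region, targets, pad=100):
    """Fraction of `region` lying within `pad` bp of any target.

    `region` and each target are (chrom, start, end) tuples using
    0-based, half-open coordinates (an assumption, as in BED files).
    """
    chrom, start, end = region
    # Pad each target on the same chromosome, clip it to the region,
    # and keep only the pieces that are non-empty.
    pieces = sorted(
        (max(start, t_start - pad), min(end, t_end + pad))
        for t_chrom, t_start, t_end in targets
        if t_chrom == chrom and max(start, t_start - pad) < min(end, t_end + pad)
    )
    covered, cursor = 0, start
    for p_start, p_end in pieces:
        # Skip anything already counted by an earlier, overlapping piece.
        p_start = max(p_start, cursor)
        if p_end > p_start:
            covered += p_end - p_start
            cursor = p_end
    return covered / (end - start)


# Hypothetical example: one targeted exon and two candidate promoters.
targets = [("chr1", 1000, 1200)]
near_promoter = ("chr1", 850, 950)   # ends 50 bp upstream of the target
far_promoter = ("chr1", 500, 600)    # ends 400 bp upstream of the target

print(covered_fraction(near_promoter, targets))
print(covered_fraction(far_promoter, targets))
```

A promoter that scores near 1.0 is probably worth a look; one that scores 0.0 is in the territory where I would not trust the alignments.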
By the way, in case you are not convinced that you can't tell good alignments from bad ones, consider the following case. The aligner says "this read maps perfectly to exactly one location". In fact, the gene it aligned to has a paralog which differs by one base. Furthermore, the aligner can't know this, but in reality the individual you sequenced does not have the SNP that distinguishes the two paralogs in the canonical reference. So in the end the read should really have been mapped ambiguously, because in that individual it matches both locations equally well.
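Here is a toy version of that paralog case, in case it helps to see it concretely. The sequences are made up and the mismatch counting is a stand-in for a real aligner's scoring, so treat it as a sketch of the argument rather than of how any aligner actually works.

```python
def mismatches(read, ref):
    """Count positions where two equal-length sequences differ."""
    return sum(a != b for a, b in zip(read, ref))


# Made-up reference sequences: the paralog differs from the gene at one base.
ref_gene = "ACGTTGCAAGCT"
ref_paralog = "ACGTTGCATGCT"

# In the sequenced individual the distinguishing SNP is absent, so their
# copy of the paralog is identical to the gene.
individual_gene = ref_gene
individual_paralog = ref_gene

# A read that truly came from the individual's paralog.
read = individual_paralog

# What the aligner sees: a perfect hit at the gene, one mismatch at the paralog,
# so it reports a unique perfect alignment to the gene.
vs_reference = (mismatches(read, ref_gene), mismatches(read, ref_paralog))

# What is true in the individual: the read fits both loci equally well.
vs_individual = (mismatches(read, individual_gene), mismatches(read, individual_paralog))

print("against reference:", vs_reference)
print("against individual:", vs_individual)
```

Against the reference the read looks uniquely placed, while against the individual's own genome it is a tie, and nothing in the alignment output tells you which situation you are in.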
8.0 years ago by
Nina • 340