Question: A Question About Annotation To Exome Seq
7.7 years ago by
camelbbs670 wrote:


When I use ANNOVAR to annotate exome-seq data, I get two files: annovar.variant and annovar.exonic_variant.

My question is: if exome sequencing focuses only on exonic regions, why does ANNOVAR also report variants in other regions (intronic, intergenic, etc.)?


exome sequencing • 2.8k views
modified 6.2 years ago by Biostar ♦♦ 20 • written 7.7 years ago by camelbbs (670)
7.7 years ago by
Matt Shirley9.3k
Cambridge, MA
Matt Shirley9.3k wrote:

Most exon capture methods used for enrichment prior to "exome" sequencing actually capture more than what they target. This makes sense when you consider that the oligos targeting specific sequences do not have to be anywhere near as long as the DNA fragments they capture, so captured fragments extend past the target boundaries into flanking intronic and intergenic sequence. There is a great review of three current capture platforms I would urge you to read. Also, "exonic" is defined by the regions you supply. If this is RefSeq genes, then it will be somewhat more conservative than something like UCSC genes.

written 7.7 years ago by Matt Shirley (9.3k)

This paper also has a comparison of WES vs WGS, for those who find such things interesting...

written 7.7 years ago by Alex Paciorkowski (3.4k)

The annovar.variant result has 144k rows, but the annovar.exonic_variant result has only about 15k rows. That is what made me doubt the results.
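To see where the extra rows come from, the region labels can be tallied directly. The following is a minimal sketch, assuming the file is tab-separated with the region label (exonic, intronic, intergenic, ...) in the first column, which is how ANNOVAR's gene-based output is usually laid out. The function name `tally_regions` and the example rows are made up for illustration.

```python
from collections import Counter
import io

def tally_regions(lines):
    """Count variants per region label, read from the first tab-separated column."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        region = line.split("\t", 1)[0]
        counts[region] += 1
    return counts

# Hypothetical rows, for illustration only (not real data)
example = io.StringIO(
    "exonic\tGENE1\tchr1\t100\t100\tA\tG\n"
    "intronic\tGENE1\tchr1\t250\t250\tC\tT\n"
    "intergenic\tGENE1,GENE2\tchr1\t9000\t9000\tG\tA\n"
    "intronic\tGENE2\tchr1\t12000\t12000\tT\tC\n"
)
counts = tally_regions(example)
for region, n in counts.most_common():
    print(region, n)
```

For a real file, pass an open file handle instead of the `io.StringIO` object, for example `tally_regions(open("annovar.variant"))`.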

Thanks. I will look up the papers.

modified 7.7 years ago • written 7.7 years ago by camelbbs (670)

If you haven't done so already, I would use something like Picard to calculate your target region stats:

My guess is that you'll probably see a fair number of off-target reads, especially if you remove duplicates (for example, I think you are doing pretty well if you get 60% on-target unique reads).
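As a rough illustration of what an on-target fraction means, here is a small sketch that counts reads overlapping a set of target intervals. It is not how Picard computes its metrics. It assumes a single chromosome, half-open `(start, end)` coordinates, and reads that are already de-duplicated. All names and coordinates are hypothetical.

```python
from bisect import bisect_right

def merge_targets(targets):
    """Sort and merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(targets):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def on_target_fraction(reads, targets):
    """Fraction of reads overlapping any target by at least one base."""
    merged = merge_targets(targets)
    starts = [s for s, _ in merged]
    hits = 0
    for r_start, r_end in reads:
        # Last merged target that starts at or before the read's final base
        i = bisect_right(starts, r_end - 1) - 1
        if i >= 0 and merged[i][1] > r_start:
            hits += 1
    return hits / len(reads) if reads else 0.0

# Made-up coordinates for illustration
targets = [(100, 200), (150, 300), (1000, 1100)]
reads = [(90, 140), (300, 350), (950, 1000), (1050, 1100), (50, 100)]
frac = on_target_fraction(reads, targets)
print(f"on-target fraction: {frac:.2f}")
```

Real target-region statistics also account for things like per-base coverage and bait versus target intervals, so treat this only as a way to build intuition.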

It is also worth taking into consideration your total number of reads. Let's say one run gives 80x on-target and 10x off-target coverage, while another gives 40x on-target and 5x off-target. Doubling the coverage probably results in the same on-target variants, but you will have much higher power to detect off-target variants.
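The effect of depth on detection power can be sketched with a simple binomial model. This is a back-of-the-envelope calculation, not what any particular variant caller does. It assumes a heterozygous site where each read shows the alternate allele with probability 0.5, and it assumes a variant is "detected" when at least 3 reads support it. Both the threshold and the function name are assumptions made for illustration.

```python
from math import comb

def detection_probability(depth, min_alt_reads=3, alt_fraction=0.5):
    """P(at least min_alt_reads alternate reads) under Binomial(depth, alt_fraction)."""
    # Sum the probabilities of seeing fewer than min_alt_reads alternate reads
    p_below = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_below

for depth in (5, 10, 40, 80):
    print(f"{depth}x: {detection_probability(depth):.3f}")
```

Under these assumptions, going from 5x to 10x raises the detection probability from 0.5 to about 0.945, while 40x and 80x are both essentially 1, which matches the intuition that extra depth mostly helps in the low-coverage off-target regions.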

modified 6.2 years ago • written 6.2 years ago by Charles Warden (7.6k)