Question

Strategy To Set A Cutoff For A Variant Calling Experiment With NGS

13.1 years ago

Jianfeng Mao • 0

We sequenced F1 individuals of a plant species by NGS. Each F1 comes from a cross between the reference-genome plant and an objective (non-reference) plant. Variants (SNPs and indels) were called for each objective plant from the NGS data of these F1 individuals. Our data are haplotype data: a phased haplotype was called for the objective plant (one parent of the F1 individual).

For quality control, we use three values: (1) concordance (the ratio of reads supporting a predicted variant to the total coverage at that site); (2) coverage (the number of reads supporting the variant); (3) base quality (the base quality reported by the sequencing process).
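
To make the three values concrete, here is a minimal sketch of how they could be stored and derived per call. The class and field names (`VariantCall`, `supporting_reads`, `total_reads`, `mean_base_quality`) are hypothetical and not taken from any particular variant caller.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """One predicted variant with the three QC values described above.

    All field names are illustrative assumptions, not a real caller's output.
    """
    supporting_reads: int      # reads that support the predicted variant
    total_reads: int           # all reads covering the site
    mean_base_quality: float   # mean Phred quality of the supporting bases

    @property
    def concordance(self) -> float:
        """Ratio of supporting reads to total coverage (0.0 if uncovered)."""
        if self.total_reads == 0:
            return 0.0
        return self.supporting_reads / self.total_reads

    @property
    def coverage(self) -> int:
        """Coverage as defined in the post: reads supporting the variant."""
        return self.supporting_reads
```

For example, `VariantCall(supporting_reads=12, total_reads=24, mean_base_quality=35.0)` has a concordance of 0.5 and a coverage of 12.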

Concordance may be the most important variable for quality control. The best variant calls by concordance are those with values near 0.5, as expected at a heterozygous site in an F1. Very small values (<0.1) and very large values (>0.9) are clearly not good. Coverage may also play an important role: a call with a concordance of 0.5 but a coverage of only 0.1 may not be a good call. Base quality may be the most intuitive quality control variable: the higher the base quality, the better the call.
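
One statistical way to combine concordance and coverage is an exact binomial test of the supporting-read count against the expected fraction of 0.5. This is only a sketch of that idea, not a method from any specific pipeline. The function names and the `alpha` and `min_reads` defaults are assumptions chosen for illustration.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k supporting reads out of n.

    Sums the probabilities of all outcomes that are no more likely than k.
    Intended for typical coverage (up to a few hundred reads); very deep
    sites would need a log-space or library implementation instead.
    """
    if n == 0:
        return 1.0
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small tolerance so outcomes tied with k are not lost to rounding.
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(x for x in pmf if x <= threshold))


def passes_allele_balance(supporting_reads: int, total_reads: int,
                          alpha: float = 0.01, min_reads: int = 8) -> bool:
    """Keep a call if its read ratio is compatible with 0.5.

    alpha and min_reads are hypothetical defaults, to be tuned on real data.
    Sites with too few reads are rejected because the test has no power there.
    """
    if total_reads < min_reads:
        return False
    return binomial_two_sided_p(supporting_reads, total_reads) >= alpha
```

With these defaults, 10 supporting reads out of 20 pass, 1 out of 20 fails, and 2 out of 4 is rejected for low coverage even though its concordance is exactly 0.5.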

I want to find a good strategy for setting a cutoff on our variant calls based on all three variables, or on concordance and coverage alone. I would prefer a statistical approach.

Could you give me any ideas or directions for this problem? Thanks in advance.

quality next-gen sequencing • 3.7k views

updated 12.7 years ago by Pablo ★ 1.9k • written 13.1 years ago by Jianfeng Mao • 0

Do you want to call variants or haplotypes? Your definition of concordance is a bit confusing; in this context it usually means similarity to known calls.

13.1 years ago by Casbon ★ 3.3k

I want to call both haplotypes and variants. I can get the variants and the haplotype at the same time.

13.0 years ago by Jianfeng Mao • 0

score 2 · Answer 1 · 2011-04-01

13.1 years ago

Pablo ★ 1.9k

You might find this presentation useful (they use the Ti/Tv ratio to estimate the FDR):

http://www.broadinstitute.org/gsa/wiki/images/a/ac/Ngs_tutorial_depristo_1210.pdf
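
As a rough illustration of the Ti/Tv idea, the sketch below computes the transition/transversion ratio of a SNP set and converts it into an FDR estimate. It assumes that false-positive SNPs behave like random substitutions with Ti/Tv of about 0.5, which is a common convention but may not match the slides exactly. The expected Ti/Tv is species-specific, so `expected_titv` is a parameter you would have to supply for your plant; all function names here are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def titv_ratio(snps) -> float:
    """Ti/Tv for a list of (ref, alt) single-base pairs with ref != alt."""
    ti = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ti  # every non-transition SNP is a transversion
    return ti / tv if tv else float("inf")


def estimated_fdr(observed_titv: float, expected_titv: float,
                  random_titv: float = 0.5) -> float:
    """FDR estimate under the assumption that errors have Ti/Tv = random_titv.

    expected_titv must come from trusted calls for the species in question.
    The result is clamped to the range [0, 1].
    """
    true_fraction = (observed_titv - random_titv) / (expected_titv - random_titv)
    return min(1.0, max(0.0, 1.0 - true_fraction))
```

Sweeping a cutoff and recomputing `estimated_fdr` on the calls that pass gives one way to pick a threshold for a target FDR, provided a trustworthy expected Ti/Tv is available.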

13.1 years ago by Pablo ★ 1.9k

That is a really good guide to NGS data processing, not only quality control. Thanks a lot, Pablo.

We intend to sequence a subset of the genome by Sanger sequencing and then use that as quality control.
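
A minimal sketch of how such a Sanger comparison could be scored, assuming both call sets are restricted to the Sanger-sequenced regions and each variant is represented as a `(chrom, pos, ref, alt)` tuple. The function name and the tuple layout are assumptions for illustration.

```python
def validation_summary(ngs_calls: set, sanger_calls: set) -> dict:
    """Compare NGS calls with Sanger calls from the same regions.

    precision   = fraction of NGS calls confirmed by Sanger
    sensitivity = fraction of Sanger variants recovered by NGS
    Returns NaN for a ratio whose denominator is empty.
    """
    confirmed = ngs_calls & sanger_calls
    nan = float("nan")
    return {
        "confirmed": len(confirmed),
        "precision": len(confirmed) / len(ngs_calls) if ngs_calls else nan,
        "sensitivity": len(confirmed) / len(sanger_calls) if sanger_calls else nan,
    }
```

Running this on the NGS calls that survive each candidate cutoff shows how precision and sensitivity trade off as the cutoff is tightened.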

13.0 years ago by Jianfeng Mao • 0