How do you validate called SNPs from NGS data?
9.5 years ago
mattbawn ▴ 60

I am new to bioinformatics in general and have the following situation:

I have just had 4 samples sequenced by whole-exome sequencing at Macrogen. I received called variants from their bioinformatics pipeline, and I also ran the FASTQ data for each sample through my own pipeline. I am looking for a novel disease-causing mutation on chromosome 2.

Both pipelines call a mutation in a potentially interesting gene, but the chromosomal coordinates differ. I understand that this is not an unexpected situation, but is there a way to infer that one location is more likely than the other?

I used GATK for variant calling and was thinking of using its Variant Quality Score Recalibration (VQSR) algorithm, but since this, I believe, depends on previously determined SNPs, I think it would bias against novel mutations.

Any ideas or suggestions would be appreciated.

sequencing SNP • 2.8k views
9.5 years ago

You'll want to use VQSR, since it'll decrease the false-positive rate. It doesn't so much bias against novel calls as use the information gleaned from known sites to better gauge what a real call looks like.

For actual validation, you want to use an orthogonal technology (e.g., Sanger sequencing would suffice for a single gene).
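Before designing primers, it may help to see how far apart the two calls actually are. Below is a minimal Python sketch, not part of either pipeline, that pairs up calls from two VCFs lying within a small window of each other on the same chromosome. The function names, the 50 bp window, and the toy records are all hypothetical, and it assumes both call sets were made against the same reference build (if the builds differ, the coordinates will differ for that reason alone).

```python
def parse_vcf_lines(lines):
    """Yield (chrom, pos, ref, alt) tuples from VCF text lines, skipping headers."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom = fields[0]
        # Assumption: treat "chr2" and "2" as the same contig name.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        yield chrom, int(fields[1]), fields[3], fields[4]


def nearby_pairs(calls_a, calls_b, window=50):
    """Return (call_a, call_b, offset) for calls on the same chromosome within `window` bp."""
    calls_b = list(calls_b)
    pairs = []
    for a in calls_a:
        for b in calls_b:
            if a[0] == b[0] and abs(a[1] - b[1]) <= window:
                pairs.append((a, b, b[1] - a[1]))
    return pairs


# Toy records (made up for illustration, not real calls).
vendor_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr2\t1000\t.\tA\tG\t50\tPASS\t.",
]
my_vcf = [
    "2\t1003\t.\tA\tG\t60\tPASS\t.",
    "2\t9000\t.\tC\tT\t40\tPASS\t.",
]

pairs = nearby_pairs(parse_vcf_lines(vendor_vcf), parse_vcf_lines(my_vcf))
for call_a, call_b, offset in pairs:
    print(call_a, call_b, "offset:", offset)
```

With real files you would pass open file handles instead of the toy lists. A small, consistent offset hints at a representation or alignment difference rather than two separate variants, but only the orthogonal validation settles it.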
