Any recommendations for improving Base Quality Score Recalibration (BQSR) for non-model organisms?
4.6 years ago
elcortegano ▴ 200

I just learned about Base Quality Score Recalibration (BQSR), a step that seems to be really important for variant calling and whose results seem to depend heavily on the size of the variant database used (e.g. see this conference paper).

I'm wondering how I could run BQSR on my data, given that I'm not working with humans or any other organism that has a public database of SNPs or variants. Should I just generate a VCF file with a program such as GATK HaplotypeCaller and use it as the database, or are there any other "best practices" for this?

For example, if different species were sequenced with the same technology, would it be safe to construct a database using data from all of them, assuming that the sequencing errors in these are purely technical in origin?
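To make the first option concrete (calling variants and feeding them back as the known-sites database), here is a minimal sketch of one such round, written as a Python helper that only builds the GATK4 command lines and does not run them. The helper name, the file names, and the filter thresholds are assumptions for illustration; the thresholds in particular would need tuning for a given species and dataset, and I'm not claiming this is an established best practice.

```python
import shlex
from typing import List


def bootstrap_bqsr_commands(ref: str, bam: str, round_id: int) -> List[List[str]]:
    """Build (but do not run) the GATK4 commands for one bootstrap BQSR round.

    Hypothetical helper: output file names are derived from round_id, and the
    hard-filter expression below is an assumed placeholder, not a tuned value.
    """
    prefix = f"round{round_id}"
    raw_vcf = f"{prefix}.raw.vcf.gz"
    flagged_vcf = f"{prefix}.flagged.vcf.gz"
    known_vcf = f"{prefix}.known.vcf.gz"
    table = f"{prefix}.recal.table"
    recal_bam = f"{prefix}.recal.bam"
    return [
        # 1. Call variants on the (not yet recalibrated) reads.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", raw_vcf],
        # 2. Flag low-confidence calls (assumed thresholds).
        ["gatk", "VariantFiltration", "-R", ref, "-V", raw_vcf,
         "--filter-expression", "QD < 2.0 || FS > 60.0",
         "--filter-name", "lowconf", "-O", flagged_vcf],
        # 3. Keep only the calls that passed the filter.
        ["gatk", "SelectVariants", "-R", ref, "-V", flagged_vcf,
         "--exclude-filtered", "-O", known_vcf],
        # 4. Model base-quality errors, masking the self-called sites.
        ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam,
         "--known-sites", known_vcf, "-O", table],
        # 5. Write a BAM with recalibrated base qualities.
        ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
         "--bqsr-recal-file", table, "-O", recal_bam],
    ]


# Print the commands for a first round so they can be inspected before use.
for cmd in bootstrap_bqsr_commands("ref.fasta", "sample.bam", 1):
    print(shlex.join(cmd))
```

The round could be repeated with the recalibrated BAM as input to see whether the recalibration table stops changing, but whether that is worth the effort is exactly the open question here.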

Thank you!

bqsr next-gen variant-calling

To the best of my knowledge, there is not yet a paper demonstrating that this step is truly necessary and/or beneficial, beyond what the Broad Institute recommends. Also, as you correctly note, you need a well-curated reference set of SNPs to run it properly. If you don't have that, you might simply skip it altogether. See e.g. here: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1279-z#Abs1. I suggest you browse the benchmarking literature on this method and then decide whether it is worth the additional effort.
