Question: pre-processing whole genome data
Written 22 months ago by br.tania40:

Hi everyone!

I am new to whole genome analyses. For guidance, I am referring to 'Best Practices for Germline SNP & Indel Discovery in Whole Genome and Exome Sequence'.

At the moment, I need to call variants from rhesus macaque paired-end reads (fasta files).

I mapped the reads to the most recent available reference genome using BWA, then marked duplicates using Picard. The next step is supposed to be base quality score recalibration. According to the linked guide, it consists of four sub-steps. The suggested command for the first sub-step is:

java -jar GenomeAnalysisTK.jar \
    -T BaseRecalibrator \
    -R reference.fa \
    -I input_reads.bam \
    -L 20 \
    -knownSites dbsnp.vcf \
    -knownSites gold_indels.vcf \
    -o recal_data.table
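
For anyone scripting this step, here is a minimal sketch of how the command above could be assembled from Python, with one -knownSites flag per VCF. It only uses the flags quoted in the command above; the helper name base_recalibrator_cmd and all file names are hypothetical placeholders, and the sketch prints the command rather than running GATK.

```python
import shlex


def base_recalibrator_cmd(reference, bam, known_sites, out_table,
                          interval=None, jar="GenomeAnalysisTK.jar"):
    """Build the BaseRecalibrator argument list (hypothetical helper).

    Only the flags from the quoted command are used:
    -T, -R, -I, -L, -knownSites and -o.
    """
    # This sketch expects at least one known-sites VCF, as in the quoted command.
    if not known_sites:
        raise ValueError("this sketch expects at least one known-sites VCF")

    cmd = ["java", "-jar", jar, "-T", "BaseRecalibrator",
           "-R", reference, "-I", bam]

    # -L restricts the run to an interval (chromosome 20 in the quoted command).
    if interval is not None:
        cmd += ["-L", str(interval)]

    # The flag is repeated once per VCF file.
    for vcf in known_sites:
        cmd += ["-knownSites", vcf]

    cmd += ["-o", out_table]
    return cmd


if __name__ == "__main__":
    # Placeholder file names, copied from the quoted command.
    cmd = base_recalibrator_cmd(
        reference="reference.fa",
        bam="input_reads.bam",
        known_sites=["dbsnp.vcf", "gold_indels.vcf"],
        out_table="recal_data.table",
        interval=20,
    )
    # Dry run: print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string means the result can be passed directly to subprocess.run without shell quoting problems once the real paths are filled in.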

My question is about the -knownSites options here.

Is a VCF file of known sites available for all organisms? On the NCBI website, I can see that known SNPs are listed for Macaca mulatta, but I cannot figure out how to obtain them in VCF format.

I would appreciate any input.

Thanks in advance!

rhesus macaque gatk dbsnp.vcf

Take a look at this GATK thread for additional information.

written 22 months ago by genomax64k

Thanks! I will give feedback once I have tried out the suggestions. I later also stumbled upon the NCBI repository of dbSNP entries for macaques.

written 22 months ago by br.tania40

The BBMap package has a faster and easier option for recalibration, which does not need known sites... Usage: in=mapped.bam ref=reference.fa ploidy=2 callvariants in=mapped.bam out=recalibrated.bam recalibrate
written 22 months ago by Brian Bushnell16k