pre-processing whole genome data
7.0 years ago
br.tania ▴ 50

Hi everyone!

I am new to whole genome analyses. For guidance, I am referring to 'Best Practices for Germline SNP & Indel Discovery in Whole Genome and Exome Sequence'.

As of now, I need to call variants in Rhesus Macaque paired-end reads (fasta files).

I mapped the reads to the most recent reference genome available using BWA, then marked duplicates using Picard. The next step is supposed to be: Recalibrate Base Quality Scores. According to this link (https://software.broadinstitute.org/gatk/documentation/article?id=2801), it consists of four sub-steps. The suggested command for the first sub-step is the following:

java -jar GenomeAnalysisTK.jar \
    -T BaseRecalibrator \
    -R reference.fa \
    -I input_reads.bam \
    -L 20 \
    -knownSites dbsnp.vcf \
    -knownSites gold_indels.vcf \
    -o recal_data.table
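As a rough illustration of how the repeated -knownSites flag is used, the sketch below assembles the same command with one -knownSites entry per VCF and only prints it (a dry run; nothing is executed). The function name build_bqsr_cmd is hypothetical, the file names are the placeholders from the command above, and the -L 20 interval is left out on the assumption that it is specific to the tutorial data.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build the BaseRecalibrator command line and print it.
# build_bqsr_cmd is a hypothetical helper; all file names are placeholders.
build_bqsr_cmd() {
  local reference=$1 bam=$2 out=$3
  shift 3   # remaining arguments are the known-sites VCF files

  local cmd=(java -jar GenomeAnalysisTK.jar -T BaseRecalibrator
             -R "$reference" -I "$bam")

  # Add one -knownSites flag per VCF supplied.
  local vcf
  for vcf in "$@"; do
    cmd+=(-knownSites "$vcf")
  done

  cmd+=(-o "$out")
  echo "${cmd[*]}"   # print only; the command is not run
}

build_bqsr_cmd reference.fa input_reads.bam recal_data.table dbsnp.vcf gold_indels.vcf
```

Any number of VCF files can be passed after the output table name, so a single known-sites file for a non-model organism would work the same way.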

My question is about the -knownSites options here.

Is a vcf file listing the known sites available for all organisms? At the NCBI website, I do see that the information (several known SNPs) is there for Macaca mulatta but I am unable to figure out how to obtain it in a vcf format as such.

I would appreciate any sort of enlightening inputs.

Thanks in advance!

dbSNP.vcf gatk rhesus macaque • 1.8k views

Take a look at this GATK thread for additional information.


Thanks! I will report back once I have tried out the suggestions. I also later stumbled upon the NCBI dbSNP data for macaques.


The BBMap package has a faster and easier option for recalibration, which does not need known sites. Usage:

calctruequality.sh in=mapped.bam ref=reference.fa ploidy=2 callvariants
bbduk.sh in=mapped.bam out=recalibrated.bam recalibrate
