Question: Recalibration step from SNP calling
17 months ago, mostafarafiepour wrote:

Hello everyone,

I am trying to run the recalibration stage of SNP calling on whole-genome sequencing data. However, my reference genome does not have a known-sites VCF file, so I removed the -knownSites option from my command line, and I now get the following error (picture attached):

My question is: for reference genomes that do not have a known-sites VCF file, is it necessary to perform the recalibration step at all?

The command I ran:

java -jar /home/m.rafiepour222/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar -R /home/m.rafiepour222/GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa -T BaseRecalibrator -I /home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.sort.rmdup.bam -o /home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.grp

My Error:

As seen in the image, the error is about that same missing known-sites VCF file.


snp • 573 views
modified 17 months ago by Pierre Lindenbaum • written 17 months ago by mostafarafiepour

It can be skipped, as discussed here (http://evodify.com/gatk-the-best-practice-for-genotype-calling-in-a-non-model-organism/). OP raised a similar issue (base recalibration using variant information for a non-model organism) in the GATK forum: https://gatkforums.broadinstitute.org/gatk/discussion/4164/base-re-calibration-when-i-dont-have-a-publicly-available-dbsnp. OP proposed a method; however, it didn't work. The suggestion in OP's post (mentioned above) was to skip base recalibration. I guess the same would hold for variant recalibration, as both of them use variant information.

modified 17 months ago • written 17 months ago by cpad0112

Many thanks for your reply.

Yes, OP raised a similar issue, but I think the proposed method is both difficult and does not work. I have tried many times but could not get any result, and this worries me. I do not know what to do.

written 17 months ago by mostafarafiepour

OP suggested to skip base recalibration.

written 17 months ago by cpad0112

Is this approach approved by the GATK team? If so, can you point me to a document confirming it?

modified 17 months ago • written 17 months ago by mostafarafiepour
17 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_bqsr_BaseRecalibrator.php

Inputs

A BAM file containing data that needs to be recalibrated.

A database of known polymorphic sites to mask out.

You didn't provide "A database of known polymorphic sites to mask out."

written 17 months ago by Pierre Lindenbaum

Many thanks for your reply.

Yes, I didn't provide "A database of known polymorphic sites to mask out", because my reference genome does not have a known-sites VCF file.

written 17 months ago by mostafarafiepour

Then try providing an empty VCF file as the database.

written 17 months ago by Pierre Lindenbaum

An empty VCF file? Sorry, I do not understand what you mean.

written 17 months ago by mostafarafiepour

Just the header of a VCF, with no variants.

written 17 months ago by Pierre Lindenbaum
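
For illustration, a header-only VCF of the kind suggested above might be written as in the sketch below. This is an assumption, not something confirmed in this thread: the file name empty_known_sites.vcf is made up, and the header holds only the two lines the VCF format requires. GATK may additionally expect ##contig lines that match the reference, and it may still reject the file.

```shell
#!/bin/sh
# Sketch: write a VCF that has a header but no variant records.
# "empty_known_sites.vcf" is a hypothetical file name chosen for illustration.
vcf=empty_known_sites.vcf

# First line: the mandatory file-format declaration.
printf '##fileformat=VCFv4.2\n' > "$vcf"
# Second line: the mandatory tab-separated column header (no sample columns).
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"

# The file could then be added to the command from the question, e.g.:
#   java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R <reference.fa> \
#     -I <input.bam> -knownSites empty_known_sites.vcf -o <output.grp>
```

Bear in mind that with no known sites to mask, every true variant in the sample is counted as a sequencing error during recalibration, so the resulting quality scores may be skewed. That is one reason skipping the step is suggested elsewhere in this thread.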

I have tried to create a VCF file, but when I use it as the input file, I get an error again.

If possible, could you send me a VCF file of the kind you have in mind?

written 17 months ago by mostafarafiepour

Excuse me, is the recalibration stage really necessary, or can it be skipped?

written 17 months ago by mostafarafiepour