Question: Recalibration step from SNP calling
2.1 years ago by
mostafarafiepour70 wrote:

Hello everyone,

I am trying to run the recalibration stage of SNP calling on whole genome sequencing data, but my reference genome does not have a known-sites VCF file. I therefore removed the -knownSites option from my command line, and I now get the following error (picture attached):

My question is: for reference genomes that do not have a known-sites VCF file, is it necessary to perform the recalibration step?

Code I ran:

java -jar /home/m.rafiepour222/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar -R /home/m.rafiepour222/GCF_000471725.1_UMD_CASPUR_WB_2.0_genomic.fa -T BaseRecalibrator -I /home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.sort.rmdup.bam -o /home/m.rafiepour222/1_BBKHU01_F/1_BBKHU01_F.grp

My Error:

As seen in the image, the error concerns the missing known-sites VCF file.



Base recalibration can be skipped, as discussed here (http://evodify.com/gatk-the-best-practice-for-genotype-calling-in-a-non-model-organism/). A similar issue (base recalibration using variant information for a non-model organism) was raised in the GATK forum: https://gatkforums.broadinstitute.org/gatk/discussion/4164/base-re-calibration-when-i-dont-have-a-publicly-available-dbsnp. The OP of that thread proposed a method, but it did not work, and the suggestion given there was to skip base recalibration. I guess the same would apply to variant recalibration, as both steps use known variant information.

modified 2.1 years ago • written 2.1 years ago by cpad0112

Many thanks for your reply.

Yes, the OP raised a similar issue, but I think the proposed method is both difficult and does not work. I have tried many times but could not get any result, and this worries me. I do not know what to do.

written 2.1 years ago by mostafarafiepour

The suggestion in the OP's thread was to skip base recalibration.

written 2.1 years ago by cpad0112

Is this approach approved by the GATK team? If so, can you point me to a document that confirms it?

modified 2.1 years ago • written 2.1 years ago by mostafarafiepour
2.1 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum128k wrote:

https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_bqsr_BaseRecalibrator.php

Inputs

A BAM file containing data that needs to be recalibrated.

A database of known polymorphic sites to mask out.

You didn't provide "A database of known polymorphic sites to mask out."


Many thanks for your reply.

Yes, I didn't provide "A database of known polymorphic sites to mask out", because my reference genome does not have a known-sites VCF file.

written 2.1 years ago by mostafarafiepour

Then try providing an empty VCF file as the database.

written 2.1 years ago by Pierre Lindenbaum

An empty VCF file? Sorry, I do not understand what you mean.

written 2.1 years ago by mostafarafiepour

Just the header of a VCF, with no variants.

written 2.1 years ago by Pierre Lindenbaum
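
For readers unsure what a header-only VCF looks like, here is a minimal sketch. The file names reference.fa.fai and empty_known_sites.vcf are hypothetical placeholders, and the contig lines are an assumption: they are added only if a samtools-style .fai index of the reference is present, on the guess that a tool may want the VCF contigs to match the reference. This thread does not confirm whether BaseRecalibrator accepts such a file.

```shell
#!/bin/sh
# Sketch: write a VCF that contains only header lines and no variant records.
# Both file names below are hypothetical placeholders, not from the thread.
ref_fai="reference.fa.fai"        # FASTA index (contig name, length, ...), if available
out_vcf="empty_known_sites.vcf"   # header-only VCF to be written

{
  # Mandatory first line of any VCF file
  printf '##fileformat=VCFv4.2\n'

  # Assumption: contig header lines may help the VCF match the reference.
  # They are emitted only when the .fai index exists.
  if [ -f "$ref_fai" ]; then
    awk -F'\t' '{ printf "##contig=<ID=%s,length=%s>\n", $1, $2 }' "$ref_fai"
  fi

  # Column header line for a sites-only VCF; no records follow it
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
} > "$out_vcf"

# The file would then be passed with the option named in the question,
# for example: -knownSites empty_known_sites.vcf
```

With an empty known-sites set nothing is masked, so every mismatch to the reference, real variants included, would be counted as an error during recalibration. That is worth weighing against simply skipping the step.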

I tried to create such a VCF file, but when I supply it as input I again get an error.

If possible, could you send me a VCF file of the kind you have in mind?

written 2.1 years ago by mostafarafiepour

Excuse me, is the recalibration stage really necessary, or can it be skipped?

written 2.1 years ago by mostafarafiepour