Question: VariantRecalibrator Error message
4.4 years ago by
cvu120
India
cvu120 wrote:

Hi,

I'm using VariantRecalibrator from GATK. I generated my VCF files with mpileup/bcftools.

When I run VariantRecalibrator with these arguments:

 java -Xmx4g -jar GenomeAnalysisTK-3.1-1/GenomeAnalysisTK.jar -T VariantRecalibrator -R GRch38.fasta -input filtered_cano.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 00-All.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ -mode BOTH -recalFile cano.recal -tranchesFile cano.tranches -rscriptFile cano.plots.R

it throws this error message:

##### ERROR A USER ERROR has occurred (version 3.1-1-g07a4bf8):
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR MESSAGE: The provided VCF file is malformed at approximately line number 10161196: unparsable vcf record with allele B

Please let me know if I am missing something in the arguments.

I also suspect that GATK does not accept VCF files generated by samtools.

Thanks!

modified 3.9 years ago by Biostar • written 4.4 years ago by cvu120

GATK can take VCF files; in fact, VCF is probably the only format it accepts for recalibration. The message clearly says that the error is in a VCF file and not in the arguments:

 ##### ERROR MESSAGE: The provided VCF file is malformed at approximately line number 10161196: unparsable vcf record with allele B

Please paste lines 10161195, 10161196 and 10161197 here.
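
If it helps, here is a minimal Python sketch for pulling those lines out of a large VCF without loading the whole file. The function name `lines_around` is made up for illustration, and the path you pass in is whichever VCF the error actually refers to (the message does not say which one).

```python
import gzip
from itertools import islice


def lines_around(path, line_no, context=1):
    """Return (line number, text) pairs for line_no plus `context` lines
    on either side. Line numbers are 1-based; .gz files are handled too."""
    opener = gzip.open if path.endswith(".gz") else open
    first = max(1, line_no - context)
    last = line_no + context
    with opener(path, "rt") as handle:
        # islice skips ahead lazily, so only one line is in memory at a time.
        window = islice(handle, first - 1, last)
        return [(first + i, text.rstrip("\n")) for i, text in enumerate(window)]


# Example call (file name is a placeholder):
# for number, text in lines_around("some.vcf", 10161196):
#     print(number, text, sep="\t")
```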

written 4.4 years ago by Ashutosh Pandey

Hi,

3    16902883    rs56708014    B    BGC    .    .    RS=56708014;RSPOS=16902887;dbSNPBuildID=129;SSR=0;SAO=0;VP=0x050000080005000002000200;WGT=1;VC=DIV;INT;ASP;OTHERKG

I found the error: there was a "B" in the REF and ALT columns of my dbSNP VCF file.

Thanks.

Is it possible to use only dbSNP as a resource, or do I need to provide all three resources (HapMap, Omni and dbSNP)?

written 4.4 years ago by cvu120

Using only dbSNP will work too!

written 4.4 years ago by always_learning

I tried with only dbSNP, but it asks for a dataset marked training=true.
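
That fits the command in the question, where dbSNP is tagged training=false and truth=false. As far as I know, VariantRecalibrator wants at least one resource with training=true and at least one with truth=true. A rough Python sketch for checking a command line before launching a long run; the helper names `resource_tags` and `missing_roles` are made up for illustration, and the parsing only assumes the `-resource:name,key=value,...` form shown in the question.

```python
import shlex


def resource_tags(command):
    """Map each -resource name to its key=value tags."""
    resources = {}
    for token in shlex.split(command):
        if token.startswith("-resource:"):
            name, *pairs = token[len("-resource:"):].split(",")
            resources[name] = dict(pair.split("=", 1) for pair in pairs)
    return resources


def missing_roles(command):
    """Return the roles ('training', 'truth') that no resource sets to true."""
    tags = list(resource_tags(command).values())
    return [role for role in ("training", "truth")
            if not any(t.get(role) == "true" for t in tags)]
```

Run on the command from the question, `missing_roles` reports both roles as missing, which is consistent with the complaint about a training=true dataset.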

written 4.4 years ago by cvu120
4.2 years ago by
Cyriac Kandoth5.2k
Memorial Sloan Kettering, New York, USA
Cyriac Kandoth5.2k wrote:

GATK doesn't play nice with IUPAC codes, so you'll need to change the B to an N in the dbSNP VCF.

B denotes that the base at that locus is either C, G or T, i.e. they've decided the allele is rarely, if ever, an A there. IUPAC codes definitely carry more information than just tagging anything ambiguous as an N.
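
A minimal Python sketch of that replacement, assuming a tab-delimited VCF with REF in column 4 and ALT in column 5. The function names are made up for illustration, and it is only a sketch: it leaves symbolic alleles such as `<DEL>` and breakends alone, and does nothing else to validate the record.

```python
import re

# IUPAC ambiguity codes other than N itself.
AMBIGUOUS = re.compile(r"[RYSWKMBDHV]", re.IGNORECASE)


def mask_allele(allele):
    """Replace ambiguity codes in one allele with N."""
    # Symbolic alleles, breakends and missing values are not plain sequence.
    if allele.startswith("<") or "[" in allele or "]" in allele or allele in (".", "*"):
        return allele
    return AMBIGUOUS.sub("N", allele)


def mask_record(line):
    """Mask REF and ALT of one VCF line; header and short lines pass through."""
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return line
    fields[3] = mask_allele(fields[3])
    fields[4] = ",".join(mask_allele(a) for a in fields[4].split(","))
    return "\t".join(fields) + "\n"
```

Applied line by line to the dbSNP file and written out to a new VCF, this turns the record above from REF `B`, ALT `BGC` into REF `N`, ALT `NGC`. Dropping such records entirely would be another option.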

modified 4.2 years ago • written 4.2 years ago by Cyriac Kandoth