Question: Problem with VCF file produced by GATK
8 months ago by
evelyn90
evelyn90 wrote:

Hello,

I am generating VCF files with GATK using the following commands:

java -jar $PICARD_DIR/picard.jar ValidateSamFile \
I=example.sorted.bam \
MODE=SUMMARY
java -jar $PICARD_DIR/picard.jar AddOrReplaceReadGroups \
I=example.sorted.bam \
O=example.gatk.sorted.bam \
RGID=1 \
RGLB=lib1 \
RGPL=illumina \
RGPU=unit1 \
RGSM=20
java -jar $PICARD_DIR/picard.jar ValidateSamFile \
I=example.gatk.sorted.bam \
MODE=SUMMARY
samtools index example.gatk.sorted.bam
gatk --java-options "-Xmx4G" HaplotypeCaller -R ref.fa -I example.gatk.sorted.bam -O example.gatk.vcf

But the resulting VCF does not show the sample name in the header line. Instead it shows the number 20, and I am not sure why:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  20

I want the sample name instead of 20 because I need to use the VCF files for further analysis, where I get a duplicate-names error because every VCF file has 20 in place of its sample name.

I would appreciate any help. Thank you!

8 months ago by
Satyajeet Khare1.5k
Pune, India
Satyajeet Khare1.5k wrote:

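You are setting the sample name to 20 yourself (RGSM=20). RGSM is written to the SM tag of the read group, and HaplotypeCaller uses that tag as the sample column name in the VCF header. Set RGSM to the actual sample name for each BAM file.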
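
As a rough sketch of that fix, the snippet below derives a distinct RGSM value from each BAM file name and prints the matching AddOrReplaceReadGroups command as a dry run. The file names sampleA.sorted.bam and sampleB.sorted.bam, the `<sample>.sorted.bam` naming convention, and the helper functions `sample_from_bam` and `rg_command` are all assumptions made for illustration; the other read-group values are copied from the question.

```shell
#!/usr/bin/env bash
# Sketch: give every BAM its own RGSM so each VCF gets a distinct sample column.
# Assumption: input files are named <sample>.sorted.bam (hypothetical convention).

# Derive the sample name by stripping the directory and the ".sorted.bam" suffix.
sample_from_bam() {
    basename "$1" .sorted.bam
}

# Print (not run) the AddOrReplaceReadGroups command for one BAM file.
# Only RGSM and the output name change relative to the command in the question.
# $PICARD_DIR is left unexpanded on purpose so it appears literally in the output.
rg_command() {
    local bam="$1" sample
    sample="$(sample_from_bam "$bam")"
    printf 'java -jar $PICARD_DIR/picard.jar AddOrReplaceReadGroups I=%s O=%s.gatk.sorted.bam RGID=1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=%s\n' \
        "$bam" "$sample" "$sample"
}

# Dry run over two hypothetical file names.
for bam in sampleA.sorted.bam sampleB.sorted.bam; do
    rg_command "$bam"
done
```

After checking the printed commands, run them, then index each new BAM and call HaplotypeCaller as in the question. Each VCF should then carry its own sample name in the last header column.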
