Error details: Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT is missing the read group (RG) tag
8.3 years ago
zwang10 ▴ 30

I am new to BAM files and the GATK tools. I want to call variants from a BAM file and write them to a VCF by running:

java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar \
  -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta \
  -T HaplotypeCaller \
  -I _EGAR00001038931_36843.pe.raw.sorted.bam \
  --genotyping_mode DISCOVERY \
  -stand_emit_conf 10 \
  -stand_call_conf 30 \
  -o raw_variants.vcf

But I got the following error:

INFO  19:11:19,792 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  19:11:19,798 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56 
INFO  19:11:19,798 HelpFormatter - Copyright (c) 2010 The Broad Institute 
INFO  19:11:19,799 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
INFO  19:11:19,807 HelpFormatter - Program Args: -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf 
INFO  19:11:19,820 HelpFormatter - Executing as zwang10@zwang10-K55N on Linux 3.13.0-74-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_91-b02. 
INFO  19:11:19,821 HelpFormatter - Date/Time: 2016/01/03 19:11:19 
INFO  19:11:19,822 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  19:11:19,823 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  19:11:20,220 GenomeAnalysisEngine - Strictness is SILENT 
INFO  19:11:20,537 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500 
INFO  19:11:20,554 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
INFO  19:11:20,783 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.23 
INFO  19:11:20,887 HCMappingQualityFilter - Filtering out reads with MAPQ < 20 
INFO  19:11:21,120 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files 
INFO  19:11:22,436 GenomeAnalysisEngine - Done preparing for traversal 
INFO  19:11:22,437 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  19:11:22,439 ProgressMeter -                 |      processed |    time |         per 1M |           |   total | remaining 
INFO  19:11:22,440 ProgressMeter -        Location | active regions | elapsed | active regions | completed | runtime |   runtime 
INFO  19:11:22,441 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output 
INFO  19:11:22,562 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values. 
WARN  19:11:22,563 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples. 
INFO  19:11:22,565 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values. 
INFO  19:11:22,930 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units 
INFO  19:11:27,999 GATKRunReport - Uploaded run statistics report to AWS S3 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.5-0-g36282e4): 
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@51762faf is malformed. Please see http://gatkforums.broadinstitute.org/discussion/1317/collected-faqs-about-input-files-for-sequence-read-data-bam-cramfor more information. Error details: Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT is missing the read group (RG) tag, which is required by the GATK. Please see http://gatkforums.broadinstitute.org/discussion/59/companion-utilities-replacereadgroups to fix this problem
##### ERROR ------------------------------------------------------------------------------------------
zwang10@zwang10-K55N:/media/zwang10/Elements/UK10K$ java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf^C
zwang10@zwang10-K55N:/media/zwang10/Elements/UK10K$ java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf > error
INFO  19:14:00,776 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  19:14:00,783 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56 
INFO  19:14:00,784 HelpFormatter - Copyright (c) 2010 The Broad Institute 
INFO  19:14:00,785 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
INFO  19:14:00,793 HelpFormatter - Program Args: -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf 
INFO  19:14:00,806 HelpFormatter - Executing as zwang10@zwang10-K55N on Linux 3.13.0-74-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_91-b02. 
INFO  19:14:00,807 HelpFormatter - Date/Time: 2016/01/03 19:14:00 
INFO  19:14:00,808 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  19:14:00,808 HelpFormatter - -------------------------------------------------------------------------------- 
INFO  19:14:01,199 GenomeAnalysisEngine - Strictness is SILENT 
INFO  19:14:01,500 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500 
INFO  19:14:01,517 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
INFO  19:14:01,668 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.15 
INFO  19:14:01,739 HCMappingQualityFilter - Filtering out reads with MAPQ < 20 
INFO  19:14:01,982 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files 
INFO  19:14:03,265 GenomeAnalysisEngine - Done preparing for traversal 
INFO  19:14:03,266 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  19:14:03,268 ProgressMeter -                 |      processed |    time |         per 1M |           |   total | remaining 
INFO  19:14:03,269 ProgressMeter -        Location | active regions | elapsed | active regions | completed | runtime |   runtime 
INFO  19:14:03,270 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output 
INFO  19:14:03,390 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values. 
WARN  19:14:03,391 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples. 
INFO  19:14:03,393 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values. 
INFO  19:14:03,675 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units 
INFO  19:14:08,680 GATKRunReport - Uploaded run statistics report to AWS S3 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.5-0-g36282e4): 
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@5c0bb1d5 is malformed. Please see http://gatkforums.broadinstitute.org/discussion/1317/collected-faqs-about-input-files-for-sequence-read-data-bam-cramfor more information. Error details: Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT is missing the read group (RG) tag, which is required by the GATK. Please see http://gatkforums.broadinstitute.org/discussion/59/companion-utilities-replacereadgroups to fix this problem
##### ERROR ------------------------------------------------------------------------------------------
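
To see how widespread the problem is, one option is to count the reads that carry no `RG:Z:` tag. The sketch below is only an illustration: the helper name `count_reads_without_rg` is made up, and it reads SAM text on stdin, so it is fed a two-read toy example here instead of the real file.

```shell
# Count SAM records (read from stdin) that have no RG:Z: optional tag.
# Header lines start with '@' and are skipped; optional tags begin at column 12.
count_reads_without_rg() {
    awk -F '\t' '
        /^@/ { next }
        {
            has_rg = 0
            for (i = 12; i <= NF; i++) if ($i ~ /^RG:Z:/) has_rg = 1
            if (!has_rg) missing++
        }
        END { print missing + 0 }
    '
}

# Toy input: r1 has an RG tag, r2 does not.
printf 'r1\t0\t1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:grp1\nr2\t0\t1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n' \
    | count_reads_without_rg

# Against the real BAM (needs samtools, so left commented out):
# samtools view -H _EGAR00001038931_36843.pe.raw.sorted.bam | grep '^@RG'
# samtools view _EGAR00001038931_36843.pe.raw.sorted.bam | count_reads_without_rg
```

If the `grep '^@RG'` on the header prints nothing, the file has no read groups defined at all, not just a few untagged reads.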

Is there a way to add the missing RG tag?
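
The page linked in the error message points at Picard's read group utility, so a possible fix would be `AddOrReplaceReadGroups`, which assigns every read to one new read group. This is a rough sketch, not a tested recipe: the location of `picard.jar`, the output file name, and all `RG*` values (including `RGPL=illumina`) are placeholders I made up and would have to be replaced with the real library, platform unit, and sample name. The wrapper function and the `DRY_RUN` switch are only there so the command can be inspected before it is run.

```shell
# Sketch: tag every read with one read group using Picard AddOrReplaceReadGroups.
# PICARD_JAR and all RG* values are placeholders (assumptions), not taken from the data.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

add_read_groups() {
    # $1 = input BAM, $2 = output BAM
    set -- java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
        I="$1" O="$2" \
        RGID=rg1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=sample1 \
        CREATE_INDEX=true   # GATK also needs an index for the new BAM
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"           # default: only print the command
    else
        "$@"                # DRY_RUN=0: actually run Picard
    fi
}

add_read_groups _EGAR00001038931_36843.pe.raw.sorted.bam \
                _EGAR00001038931_36843.pe.raw.sorted.rg.bam
```

With `DRY_RUN=0` set, the function runs Picard instead of printing the command. The resulting `.rg.bam` file would then be passed to HaplotypeCaller with `-I` in place of the original BAM.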

bam GATK • 3.2k views