Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:Sample1; File /path/to/BAM/Sample1.bam; Line number 1192
6.5 years ago
valopes ▴ 30

Hi everyone,

I've been struggling with this bioinformatics analysis and now I have a new problem. I checked the post "Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:None", but it didn't work for me.

First I did the mapping:

bwa mem -M -R '@RG\tID:Sample1' -t 6 genome.ref Sample1_R1_forward_paired.trim.fastq.gz Sample1_R2_forward_paired.trim.fastq.gz | samtools view -hSb -o Sample1.bam -

Then I sorted by coordinate:

samtools sort Sample1.bam -o Sample1_sorted.bam

I removed duplicates:

java -Xms4g -jar picard.jar MarkDuplicates INPUT=Sample1_sorted.bam OUTPUT=Sample1_sorted_rmdup.bam METRICS_FILE=Sample1_sorted_rmdup.txt2 REMOVE_DUPLICATES=true VALIDATION_STRINGENCY=LENIENT

Then I filtered by mapping quality:

samtools view -hSbq 30 -o Sample1_sorted_rmdup_qfilter.bam Sample1_sorted_rmdup.bam

So, when I tried to build the BAM index using this command:

java -Xms4g -jar picard.jar BuildBamIndex INPUT=Sample1_sorted_rmdup_qfilter.bam

I got:

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:Sample1; File /path/to/BAM/Sample1.bam; Line number 1192

I checked the file, and it looks like this:

@HD VN:1.5 SO:coordinate
@SQ SN:Chr01 LN:56831624
@SQ SN:Chr02 LN:48577505
@SQ SN:Chr03 LN:45779781
@SQ SN:Chr04 LN:52389146
@SQ SN:Chr05 LN:42234498
@SQ SN:Chr06 LN:51416486
@SQ SN:Chr07 LN:44630646
@SQ SN:Chr08 LN:47837940
@SQ SN:Chr09 LN:50189764
@SQ SN:Chr10 LN:51566898
@SQ SN:Chr11 LN:34766867
@SQ SN:Chr12 LN:40091314
@SQ SN:Chr13 LN:45874162
@SQ SN:Chr14 LN:49042192
@SQ SN:Chr15 LN:51756343
@SQ SN:Chr16 LN:37887014
@SQ SN:Chr17 LN:41641366
@SQ SN:Chr18 LN:58018742
@SQ SN:Chr19 LN:50746916
@SQ SN:Chr20 LN:47904181
@SQ SN:scaffold_21 LN:3565126
@SQ SN:scaffold_22 LN:1240113
@SQ SN:scaffold_23 LN:809636
@SQ SN:scaffold_24 LN:735592
@SQ SN:scaffold_25 LN:750012
@SQ SN:scaffold_26 LN:719293
@SQ SN:scaffold_27 LN:425344
@SQ SN:scaffold_28 LN:367934
@SQ SN:scaffold_30 LN:374509
@SQ SN:scaffold_31 LN:306967
@SQ SN:scaffold_32 LN:273180
@SQ SN:scaffold_33 LN:367064
@SQ SN:scaffold_34 LN:312168
@SQ SN:scaffold_35 LN:412299
@SQ SN:scaffold_36 LN:357887
@SQ SN:scaffold_37 LN:303488
@SQ SN:scaffold_38 LN:280888
@SQ SN:scaffold_39 LN:308105
@SQ SN:scaffold_40 LN:266805
@SQ SN:scaffold_41 LN:255068
@SQ SN:scaffold_43 LN:313007
@SQ SN:scaffold_44 LN:177731
@SQ SN:scaffold_47 LN:277228
@SQ SN:scaffold_48 LN:336578
@SQ SN:scaffold_49 LN:240486
@SQ SN:scaffold_50 LN:189765
@SQ SN:scaffold_51 LN:202321
@SQ SN:scaffold_54 LN:193136
@SQ SN:scaffold_55 LN:182568

Then, I did:

samtools view -H Sample1_sorted_rmdup_qfilter.bam | sed 's,^@RG.*,@RG\tID:Sample1,g' |  samtools reheader - Sample1_sorted_rmdup_qfilter.bam > Sample1_sorted_rmdup_qfilter_reheader.bam

And tried to run:

java -Xms4g -jar picard.jar BuildBamIndex INPUT=Sample1_sorted_rmdup_qfilter_reheader.bam

and I got the same:

Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:Sample1; File /path/to/BAM/Sample1_sorted_rmdup_qfilter_reheader.bam; Line number 1192

Could someone please help me again?

Thanks in advance.

Assembly SNP • 6.4k views
6.5 years ago

You've specified a read group without a sample name (the SM tag):

bwa mem -M -R '@RG\tID:Sample1'

You need to specify a sample name:

bwa mem -M -R '@RG\tID:Sample1\tSM:SAMPLE1'

Or you can always fix the BAM using Picard AddOrReplaceReadGroups: https://broadinstitute.github.io/picard/command-line-overview.html
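
When mapping many samples, the read group can be built per sample so that ID and SM are always set together. A minimal sketch, assuming the FASTQ naming used in the question, that the sample name serves as both ID and SM, and with hypothetical helper names `rg_string` and `map_sample`:

```shell
# Build a read-group string with both ID and SM for "bwa mem -R".
# The output contains literal "\t" sequences; bwa converts them to tabs.
rg_string() {
    printf '@RG\\tID:%s\\tSM:%s' "$1" "$1"
}

# Map one sample (hypothetical helper; file names follow the question).
map_sample() {
    sample="$1"
    bwa mem -M -R "$(rg_string "$sample")" -t 6 genome.ref \
        "${sample}_R1_forward_paired.trim.fastq.gz" \
        "${sample}_R2_forward_paired.trim.fastq.gz" |
        samtools view -b -o "${sample}.bam" -
}

# Usage: map_sample Sample1
```

Keeping the read-group string in one function means every sample gets the same set of tags, and further tags (for example PL or LB) could be added in a single place.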


Thank you, Pierre.

So, I am not sure if I did it right:

java -jar picard.jar AddOrReplaceReadGroups I=Sample1_sorted_rmdup_qfilter.bam O=Sample1_sorted_rmdup_qfilter_fixed.bam RGID=Sample1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=SAMPLE1

But I got an error while trying to run it:

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line:

What do you think I did wrong?


Oh, even Picard doesn't accept your RG. So an ID alone is not enough; you have to set it from the beginning using bwa.
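
To see which BAM files are affected, the header can be checked for `@RG` lines that lack an `SM` tag. A small sketch; the function name `rg_missing_sm` is hypothetical, and it expects header text such as the output of `samtools view -H` on standard input:

```shell
# Print every @RG header line that has no SM tag.
# Exit status is 0 if at least one such line was found, 1 otherwise.
rg_missing_sm() {
    awk -F '\t' '
        /^@RG/ {
            has_sm = 0
            for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) has_sm = 1
            if (!has_sm) { print; found = 1 }
        }
        END { exit(found ? 0 : 1) }'
}

# Usage (assumes samtools is on PATH):
#   samtools view -H Sample1_sorted_rmdup_qfilter.bam | rg_missing_sm
```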


Oh no! I have already mapped about 30 samples. :( If there is no other way, let's do it.

Thank you very much for your help, Pierre!


"If there is no other way"

Thinking about it, you could also use sed:

samtools view -h in.bam | sed '/^@PG/s/ID:sample1/ID:sample1\tSM:sample1/' | samtools view -Sb -o out.bam -

Thanks again, Pierre. But it still didn't work.


Oops, typo:

not /^@PG/ but /^@RG/
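
Because bwa already tagged each read with the read-group ID, only the `@RG` header line needs the `SM` field, so replacing the header should be enough. A hedged sketch, using awk instead of sed because `\t` in a sed replacement is a GNU extension; `add_sm_to_rg` and `fix_bam_header` are hypothetical names, and the sample name is assumed to serve as the SM value. Note also that a sed pattern such as `ID:sample1` is case-sensitive and would not match the `ID:Sample1` written in the question's header.

```shell
# Append "SM:<sample>" to each @RG header line that lacks an SM tag.
# Reads SAM header text on stdin and writes the fixed header to stdout.
add_sm_to_rg() {
    awk -v sm="$1" '
        /^@RG/ && !/\tSM:/ { $0 = $0 "\tSM:" sm }
        { print }'
}

# Replace the header of a BAM without re-encoding its alignments.
fix_bam_header() {
    in_bam="$1"; out_bam="$2"; sample="$3"
    samtools view -H "$in_bam" |
        add_sm_to_rg "$sample" |
        samtools reheader - "$in_bam" > "$out_bam"
}

# Usage:
#   fix_bam_header Sample1_sorted_rmdup_qfilter.bam Sample1_fixed.bam Sample1
```

Header lines that already carry an `SM` tag pass through unchanged, so the function can be run over all samples without checking each one first.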
