Question: Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:Sample1; File /path/to/BAM/Sample1.bam; Line number 1192
7 months ago, valopes wrote:

Hi everyone,

I've been struggling with this bioinformatics pipeline, and now I have a new problem. I checked the post "Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:None", but the solution there didn't work for me.

First I did the mapping:

bwa mem -M -R '@RG\tID:Sample1' -t 6 genome.ref Sample1_R1_forward_paired.trim.fastq.gz Sample1_R2_forward_paired.trim.fastq.gz | samtools view -hSb -o Sample1.bam -

Then I sorted by coordinate:

samtools sort Sample1.bam -o Sample1_sorted.bam

I removed duplicates:

java -Xms4g -jar picard.jar MarkDuplicates INPUT=Sample1_sorted.bam OUTPUT=Sample1_sorted_rmdup.bam METRICS_FILE=Sample1_sorted_rmdup.txt2 REMOVE_DUPLICATES=true VALIDATION_STRINGENCY=LENIENT

Then I filtered by mapping quality:

samtools view -hSbq 30 -o Sample1_sorted_rmdup_qfilter.bam Sample1_sorted_rmdup.bam

Then, when I tried to build the BAM index using this command:

java -Xms4g -jar picard.jar BuildBamIndex INPUT=Sample1_sorted_rmdup_qfilter.bam

I got:

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:Sample1; File /path/to/BAM/Sample1.bam; Line number 1192

I checked the file, and it looks like this:

@HD VN:1.5 SO:coordinate
@SQ SN:Chr01 LN:56831624
@SQ SN:Chr02 LN:48577505
@SQ SN:Chr03 LN:45779781
@SQ SN:Chr04 LN:52389146
@SQ SN:Chr05 LN:42234498
@SQ SN:Chr06 LN:51416486
@SQ SN:Chr07 LN:44630646
@SQ SN:Chr08 LN:47837940
@SQ SN:Chr09 LN:50189764
@SQ SN:Chr10 LN:51566898
@SQ SN:Chr11 LN:34766867
@SQ SN:Chr12 LN:40091314
@SQ SN:Chr13 LN:45874162
@SQ SN:Chr14 LN:49042192
@SQ SN:Chr15 LN:51756343
@SQ SN:Chr16 LN:37887014
@SQ SN:Chr17 LN:41641366
@SQ SN:Chr18 LN:58018742
@SQ SN:Chr19 LN:50746916
@SQ SN:Chr20 LN:47904181
@SQ SN:scaffold_21 LN:3565126
@SQ SN:scaffold_22 LN:1240113
@SQ SN:scaffold_23 LN:809636
@SQ SN:scaffold_24 LN:735592
@SQ SN:scaffold_25 LN:750012
@SQ SN:scaffold_26 LN:719293
@SQ SN:scaffold_27 LN:425344
@SQ SN:scaffold_28 LN:367934
@SQ SN:scaffold_30 LN:374509
@SQ SN:scaffold_31 LN:306967
@SQ SN:scaffold_32 LN:273180
@SQ SN:scaffold_33 LN:367064
@SQ SN:scaffold_34 LN:312168
@SQ SN:scaffold_35 LN:412299
@SQ SN:scaffold_36 LN:357887
@SQ SN:scaffold_37 LN:303488
@SQ SN:scaffold_38 LN:280888
@SQ SN:scaffold_39 LN:308105
@SQ SN:scaffold_40 LN:266805
@SQ SN:scaffold_41 LN:255068
@SQ SN:scaffold_43 LN:313007
@SQ SN:scaffold_44 LN:177731
@SQ SN:scaffold_47 LN:277228
@SQ SN:scaffold_48 LN:336578
@SQ SN:scaffold_49 LN:240486
@SQ SN:scaffold_50 LN:189765
@SQ SN:scaffold_51 LN:202321
@SQ SN:scaffold_54 LN:193136
@SQ SN:scaffold_55 LN:182568
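
The header excerpt stops before the @RG record that the error message points to (line 1192), so it may help to print only the read-group lines. The following is a minimal sketch, assuming the header text arrives on standard input (for example from samtools view -H); the function name rg_missing_sm is hypothetical and not part of samtools or Picard.

```shell
# Hypothetical helper: read SAM header text on stdin and print every
# @RG line that carries no SM tag. The exit status is 0 when every
# @RG line has SM (or there are no @RG lines), and 1 otherwise.
rg_missing_sm() {
    awk -F '\t' '
        /^@RG/ {
            has_sm = 0
            # Fields after the record type are TAG:value pairs.
            for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) has_sm = 1
            if (!has_sm) { print; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Possible use (requires samtools, so it is left as a comment here):
#   samtools view -H Sample1_sorted_rmdup_qfilter.bam | rg_missing_sm
```

Any line it prints is a read group that Picard would be expected to reject with the "missing SM tag" message.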

Then, I did:

samtools view -H Sample1_sorted_rmdup_qfilter.bam | sed 's,^@RG.*,@RG\tID:Sample1,g' |  samtools reheader - Sample1_sorted_rmdup_qfilter.bam > Sample1_sorted_rmdup_qfilter_reheader.bam

And tried to run:

java -Xms4g -jar picard.jar BuildBamIndex INPUT=Sample1_sorted_rmdup_qfilter_reheader.bam

and I got the same:

Error parsing SAM header. @RG line missing SM tag. Line: @RG ID:Sample1; File /path/to/BAM/Sample1_sorted_rmdup_qfilter_reheader.bam; Line number 1192

Could someone please help me again?

Thanks in advance.

7 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

You've specified a read group without a sample name:

bwa mem -M -R '@RG\tID:Sample1'

You need to specify a sample name with the SM tag:

bwa mem -M -R '@RG\tID:Sample1\tSM:SAMPLE1'

Or you can always fix the BAM using Picard AddOrReplaceReadGroups: https://broadinstitute.github.io/picard/command-line-overview.html
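
When many samples are mapped in a loop, the -R argument can be built per sample so that the SM tag is never forgotten. This is only a sketch: the function name make_rg is hypothetical, and reusing the sample name for both ID and SM is an assumption (other tags such as LB or PL are left out).

```shell
# Hypothetical helper: print a read-group string for bwa mem -R.
# Assumption: the sample name is used for both ID and SM.
make_rg() {
    sample=$1
    # The output contains literal backslash-t sequences, as in the
    # commands above; bwa mem turns them into tab characters.
    printf '@RG\\tID:%s\\tSM:%s\n' "$sample" "$sample"
}

# Possible use (requires bwa and samtools, so it is left as a comment):
#   bwa mem -M -R "$(make_rg Sample1)" -t 6 genome.ref R1.fastq.gz R2.fastq.gz \
#       | samtools view -b -o Sample1.bam -
```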

written 7 months ago by Pierre Lindenbaum

Thank you, Pierre.

So, I am not sure if I did it right:

java -jar picard.jar AddOrReplaceReadGroups I=Sample1_sorted_rmdup_qfilter.bam O=Sample1_sorted_rmdup_qfilter_fixed.bam RGID=Sample1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=SAMPLE1

But I got an error while trying to run it:

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line:

What do you think I did wrong?

modified 7 months ago • written 7 months ago by valopes

Oh, even Picard doesn't accept your read group. So an ID alone is not enough; you have to set the sample name from the beginning, when running bwa.

written 7 months ago by Pierre Lindenbaum

Oh no! I have already mapped about 30 samples. :( If there is no other way, let's do it.

Thank you very much for your help, Pierre!

written 7 months ago by valopes

"If there is no other way"

Thinking about it, you could also use sed:

samtools view -h in.bam | sed '/^@PG/s/ID:sample1/ID:sample1\tSM:sample1/' | samtools view -Sb -o out.bam -

written 7 months ago by Pierre Lindenbaum

Thanks again, Pierre. But it still didn't work.

written 7 months ago by valopes

Oops, typo: the pattern should be /^@RG/, not /^@PG/.
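
For the roughly 30 BAM files that are already mapped, the same idea can be written so that it does not depend on the exact ID string or its letter case. What follows is a sketch rather than a tested fix for this data set: it uses awk instead of sed (to avoid relying on sed's handling of \t in the replacement), the function name add_sm_tag is hypothetical, and it assumes every @RG line in the file should get the same sample name.

```shell
# Hypothetical helper: read SAM text (header plus records) on stdin and
# append an SM tag to every @RG header line that does not have one.
# All other lines are passed through unchanged.
add_sm_tag() {
    sm=$1
    awk -v sm="$sm" '
        /^@RG/ && $0 !~ /\tSM:/ { $0 = $0 "\tSM:" sm }
        { print }
    '
}

# Possible use, mirroring the sed pipeline above (requires samtools,
# so it is left as a comment):
#   samtools view -h in.bam | add_sm_tag Sample1 | samtools view -b -o out.bam -
```

Lines that already contain an SM tag are left alone, so running it twice does not add a second tag.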

written 7 months ago by Pierre Lindenbaum