Same sample with separate BAM files
8.9 years ago
nuketbilgen ▴ 40

Hi,

I have two (or more) .bam files for the same sample, and I want to merge them into one BAM file. I tried the samtools merge command below, but the header of the output file still contains both of the different SM codes:

samtools merge out.bam in1.bam in2.bam in3.bam

@RG    ID:N15    PL:ILLUMINA    SM:N15
@RG    ID:N17    PL:ILLUMINA    SM:N17
@PG    ID:bwa    PN:bwa    VN:0.6.1-r104-tpx
@PG    ID:bwa-59218C57    PN:bwa    VN:0.6.1-r104-tpx

When I use this file in my pipeline for generating g.vcf files, the HaplotypeCaller walker fails because of the different sample names. How can I overcome this?

Thank you

merge bam • 7.0k views
8.9 years ago

You should modify the header of the merged BAM. The fastest way is to use the samtools reheader command.

samtools view -H Input.bam > header.sam
sed "s/N17/N15/" header.sam > new_header.sam
# reheader writes the BAM to stdout; reheadered.bam is a placeholder output name
samtools reheader new_header.sam Input.bam > reheadered.bam

Hi,

Thank you. The header now looks like this:

@RG    ID:N15    PL:ILLUMINA    SM:N17
@RG    ID:N15    PL:ILLUMINA    SM:N15

But HaplotypeCaller still rejected it.

ERROR MESSAGE:

Input file: SAMFileHeader{VN=1.4, GO=none, SO=coordinate} contains more than one RG with the same id (N15)

My bad. I didn't notice that your RG IDs are the same as your sample IDs, which is not recommended. RG IDs must be unique: the same RG ID cannot appear on more than one @RG line in the header, or you will get an error. Do you still have the original merged file, i.e. out.bam with this header?

@RG    ID:N15    PL:ILLUMINA    SM:N15
@RG    ID:N17    PL:ILLUMINA    SM:N17
@PG    ID:bwa    PN:bwa    VN:0.6.1-r104-tpx
@PG    ID:bwa-59218C57    PN:bwa    VN:0.6.1-r104-tpx

If yes, then please retry with the modified command below, which changes only the SM tag:

samtools view -H Input.bam > header.sam
sed "s/SM:N17/SM:N15/" header.sam > new_header.sam
# reheader writes the BAM to stdout; reheadered.bam is a placeholder output name
samtools reheader new_header.sam Input.bam > reheadered.bam
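
Before handing the file back to HaplotypeCaller, it may help to sanity-check the new header. The following is only a rough sketch: check_header is a hypothetical helper name (not part of samtools), and it assumes the header is tab-separated text on stdin, as printed by samtools view -H. It reports RG IDs that occur on more than one @RG line and complains unless exactly one sample name is present.

```shell
# Hypothetical helper: reads a SAM header on stdin and checks the @RG lines.
# Exit status is 0 when every RG ID is unique and exactly one SM value exists.
check_header() {
  awk 'BEGIN { FS = "\t" }
    /^@RG/ {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^ID:/) ids[substr($i, 4)]++      # count each read group ID
        if ($i ~ /^SM:/) sms[substr($i, 4)] = 1    # remember each sample name
      }
    }
    END {
      bad = 0
      for (id in ids) if (ids[id] > 1) { print "duplicate RG ID: " id; bad = 1 }
      n = 0
      for (sm in sms) n++
      if (n != 1) { print "expected 1 sample name, found " n; bad = 1 }
      exit bad
    }'
}

# Example use (the BAM file name is a placeholder):
# samtools view -H reheadered.bam | check_header && echo "header looks OK"
```

Reading from stdin keeps the check independent of samtools itself, so it can also be run on header.sam or new_header.sam before reheadering.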

Hi again.

It worked. Thank you. I did not get an error this time :)

Thank you very much!


No problem. Just go through the https://www.broadinstitute.org/gatk/guide/article?id=3059 page to understand more about the RG, LB and SM tags in the BAM format. For simple purposes, an RG can be thought of as a lane.
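
To see how those tags relate in your own file, something like the sketch below could be used. rg_table is a hypothetical helper name, and it assumes a tab-separated header on stdin (as produced by samtools view -H). It prints one line per read group with its ID, SM and LB values, using "-" where a tag is missing.

```shell
# Hypothetical helper: print "ID<TAB>SM<TAB>LB" for every @RG line on stdin.
rg_table() {
  awk 'BEGIN { FS = "\t" }
    /^@RG/ {
      id = sm = lb = "-"                 # placeholder for tags that are absent
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^ID:/) id = substr($i, 4)
        if ($i ~ /^SM:/) sm = substr($i, 4)
        if ($i ~ /^LB:/) lb = substr($i, 4)
      }
      print id "\t" sm "\t" lb
    }'
}

# Example use (file name as in the commands above):
# samtools view -H Input.bam | rg_table
```

For one sample sequenced on several lanes, you would expect several different IDs in the first column and the same sample name in the second.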


Will do that. And one more question: if I have 4 samples to merge, then would I just add the SM:Ns like this?

s/SM:N17/SM:N15/SM:N14/SM:N16/

You mean you need to rename SM:N17, SM:N14 and SM:N16 to SM:N15, right? You can do it one at a time. Using sed -i changes the original header file in place, so there is no need to redirect the output to a new file.

sed -i "s/SM:N17/SM:N15/" header.sam
sed -i "s/SM:N14/SM:N15/" header.sam
sed -i "s/SM:N16/SM:N15/" header.sam
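
If the list of samples gets longer, a single pass may be easier than one sed call per name. This is just a sketch under some assumptions: unify_sm is a hypothetical helper name, the header is tab-separated text on stdin, and every @RG line should end up with the same sample name. It rewrites only the SM field, so the RG IDs stay unique.

```shell
# Hypothetical helper: set the SM tag of every @RG line to the name given
# as the first argument; all other header lines and fields pass through.
unify_sm() {
  awk -v sm="$1" 'BEGIN { FS = OFS = "\t" }
    /^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) $i = "SM:" sm }
    { print }'
}

# Example use (file names are placeholders):
# samtools view -H Input.bam | unify_sm N15 > new_header.sam
# samtools reheader new_header.sam Input.bam > reheadered.bam
```

Because the match is anchored to fields starting with SM:, an ID such as ID:N17 is left alone, which avoids the duplicate RG ID error from earlier in the thread.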

OK. I will do as you say. That is kind of a puzzle though.

Thank you again.
