Question: Same sample with separate BAM files
nuketbilgen (United Kingdom) wrote, 3.8 years ago:

Hi, 

I have two (or more) .bam files for the same sample, and I want to merge them into one BAM file. I tried this with the samtools merge command, but the header of the merged file contains both of the original SM (sample) values. The command and the resulting header:

samtools merge out.bam in1.bam in2.bam in3.bam

@RG    ID:N15    PL:ILLUMINA    SM:N15

@RG    ID:N17    PL:ILLUMINA    SM:N17

@PG    ID:bwa    PN:bwa    VN:0.6.1-r104-tpx

@PG    ID:bwa-59218C57    PN:bwa    VN:0.6.1-r104-tpx

When I use this file in my pipeline for generating g.vcf files, the HaplotypeCaller walker fails because of the different sample names. How can I overcome this?

Thank you...

 

Ashutosh Pandey (Philadelphia) wrote, 3.8 years ago:

You should modify the header of the merged BAM. The fastest way is to use the samtools reheader command.

samtools view -H Input.bam > header.sam

sed "s/N17/N15/" header.sam > new_header.sam

samtools reheader new_header.sam Input.bam > reheadered.bam

Hi,

Thank you. The header now looks like this:

@RG    ID:N15    PL:ILLUMINA    SM:N17

@RG    ID:N15    PL:ILLUMINA    SM:N15

But HaplotypeCaller did not accept it this time either:

ERROR MESSAGE: Input file: SAMFileHeader{VN=1.4, GO=none, SO=coordinate} contains more than one RG with the same id (N15)

Reply written 3.8 years ago by nuketbilgen
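The duplicate-ID problem in the error message can be checked on the header text alone, before running HaplotypeCaller. The following is a minimal sketch, assuming a tab-separated header like the one quoted above; the file names bad_header.sam and dup_ids.txt are made up for illustration.

```shell
# Toy header reproducing the situation from the thread:
# two @RG lines that share the same ID (fields are tab-separated).
printf '@RG\tID:N15\tPL:ILLUMINA\tSM:N17\n@RG\tID:N15\tPL:ILLUMINA\tSM:N15\n' > bad_header.sam

# Count every ID: field on @RG lines and print the ones seen more than once.
awk -F'\t' '/^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^ID:/) seen[$i]++ }
            END    { for (id in seen) if (seen[id] > 1) print id }' bad_header.sam > dup_ids.txt

cat dup_ids.txt
```

For this toy header the check reports ID:N15. With a real file, the header would come from samtools view -H, as in the commands quoted in the thread.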

My bad. I didn't notice that your RG IDs are the same as your sample names, which is not recommended. In any case, RG IDs must be unique: the same RG ID cannot appear on more than one @RG line in the header, otherwise you get the error above. Do you still have the original merged file? I mean out.bam, with this header:

@RG    ID:N15    PL:ILLUMINA    SM:N15

@RG    ID:N17    PL:ILLUMINA    SM:N17

@PG    ID:bwa    PN:bwa    VN:0.6.1-r104-tpx

@PG    ID:bwa-59218C57    PN:bwa    VN:0.6.1-r104-tpx

If so, please retry with the modified commands below:

samtools view -H Input.bam > header.sam
sed "s/SM:N17/SM:N15/" header.sam > new_header.sam
samtools reheader new_header.sam Input.bam > reheadered.bam

 

Reply modified 3.8 years ago • written 3.8 years ago by Ashutosh Pandey
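To see why anchoring the substitution on SM: matters, the corrected sed command can be tried on a copy of the header text without touching any BAM file. This is only a sketch: the header is recreated with printf from the lines quoted in the thread, and restricting the substitution to @RG lines is an extra precaution that was not part of the original answer.

```shell
# Toy header copied from the thread (tab-separated, as in a real SAM header).
printf '@RG\tID:N15\tPL:ILLUMINA\tSM:N15\n@RG\tID:N17\tPL:ILLUMINA\tSM:N17\n' > header.sam

# Rewrite only the SM field, and only on @RG lines, so ID:N17 is left alone.
sed '/^@RG/ s/SM:N17/SM:N15/' header.sam > new_header.sam

# Duplicate RG IDs, if any (prints nothing when all IDs are unique).
tr '\t' '\n' < new_header.sam | grep '^ID:' | sort | uniq -d

# Distinct sample names that remain in the header.
tr '\t' '\n' < new_header.sam | grep '^SM:' | sort -u
```

Here the two read groups keep their distinct IDs (N15 and N17) and share the single sample name SM:N15, which is the state the thread reports as working with HaplotypeCaller.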

Hi again.

It worked. Thank you. I did not get an error this time :) 

Thank you very much!!!!!

Reply written 3.8 years ago by nuketbilgen

No problem. Have a look at https://www.broadinstitute.org/gatk/guide/article?id=3059 to understand more about the RG, LB and SM tags in the BAM format. For simple purposes, an RG (read group) can be thought of as a sequencing lane.

Reply written 3.8 years ago by Ashutosh Pandey

Will do that. One more question: if I have 4 samples to merge, would I just add more SM:N entries like this?

s/SM:N17/SM:N15/SM:N14/SM:N16/

Reply written 3.8 years ago by nuketbilgen

You mean you need to rename SM:N17, SM:N14 and SM:N16 to SM:N15, right? You can do it one substitution at a time.

With the -i option, sed edits header.sam in place, so there is no need to redirect the output to a new file:

sed -i "s/SM:N17/SM:N15/" header.sam

sed -i "s/SM:N14/SM:N15/" header.sam

sed -i "s/SM:N16/SM:N15/" header.sam

 

Reply written 3.8 years ago by Ashutosh Pandey
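The three separate substitutions above could also be folded into one sed call using an alternation. This is a sketch under assumptions: the four-line header and the file names header4.sam and merged_header.sam are invented for illustration, and the output is written to a new file instead of using -i.

```shell
# Hypothetical header with four read groups, one per original sample name.
printf '@RG\tID:N14\tPL:ILLUMINA\tSM:N14\n'  > header4.sam
printf '@RG\tID:N15\tPL:ILLUMINA\tSM:N15\n' >> header4.sam
printf '@RG\tID:N16\tPL:ILLUMINA\tSM:N16\n' >> header4.sam
printf '@RG\tID:N17\tPL:ILLUMINA\tSM:N17\n' >> header4.sam

# One pass: on @RG lines, map SM:N14, SM:N16 and SM:N17 to SM:N15.
# -E enables extended regular expressions, needed for the (a|b|c) alternation.
sed -E '/^@RG/ s/SM:(N14|N16|N17)/SM:N15/' header4.sam > merged_header.sam

# Number of @RG lines that now carry the common sample name.
grep -c 'SM:N15' merged_header.sam
```

All four read groups end up with SM:N15 while their IDs stay distinct. Note that the pattern is not anchored at the end of the name, so it would also match a longer name such as SM:N170. If names like that exist, the pattern needs tightening. As in the thread, the edited header would then be applied with samtools reheader, redirecting its output to a new BAM file.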

OK. I will do as you say. That is kind of a puzzle though. 

Thank you again.

Reply written 3.8 years ago by nuketbilgen