5.6 years ago by
Hi Nathan, you won't find concrete examples of that, since it won't work with the current implementation of samtools. In brief, the header contains a chromosome/contig list, and individual reads simply store a numeric index into this list, which is used for viewing and sorting. In the current implementation, samtools grabs the header from one file and uses it for all subsequent files, so the numeric index -> chromosome/contig name pairing will be off for the remainder of your files.

The same issue holds for the reheader command. It simply changes the header without touching the reads. This is convenient if you need to change "chr1" into "1", but it will completely ruin a file if you rearrange the order at all, since you might end up with "chr2" turning into "17".

Someone could implement a merge to handle your case, but I'm not aware of anyone having done so. What you might do is create a header that will work for all of your files and then append the reads from every file to it with "samtools view file.bam >>header.sam". This works because the SAM text output spells out the contig name for each read rather than the numeric index, so the reads get matched to the new header by name. You can then convert back to BAM and re-sort. This is pretty far from ideal, but it's the only way I'm aware of that would work without reimplementing the samtools merge command, which would have some implementation ambiguities.
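To make the index problem concrete, here is a small toy sketch in plain Python. It is not samtools code and does not read real BAM files; the headers, the read lists, and the helper names `names_for` and `remap` are all made up for illustration. It only models the idea that a read stores an integer index into the header's contig list, and shows what a name-based remap (roughly what going through SAM text achieves) would do.

```python
# Toy model (hypothetical data, not the samtools implementation):
# a header is an ordered list of contig names, and each read stores
# only an integer index into that list.

def names_for(reads, header):
    """Resolve each read's numeric index against the given header."""
    return [header[tid] for tid in reads]

def remap(reads, old_header, new_header):
    """Translate indices from old_header to new_header by contig name."""
    position = {name: i for i, name in enumerate(new_header)}
    return [position[old_header[tid]] for tid in reads]

header_a = ["chr1", "chr2", "chr17"]
header_b = ["chr17", "chr1", "chr2"]  # same contigs, different order

# Reads from file B, stored as indices into header_b:
# chr17, chr17, chr2
reads_b = [0, 0, 2]

# Interpreting file B's reads with file A's header silently mislabels them.
wrong = names_for(reads_b, header_a)

# Remapping by name first keeps every read on its original contig.
fixed = names_for(remap(reads_b, header_b, header_a), header_a)

print("with the wrong header:", wrong)
print("after remapping by name:", fixed)
```

In this toy case the reads that belong to chr17 come out labelled chr1 when the wrong header is applied, which is the same kind of silent corruption described above for merge and reheader.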