There is no samtools command to do what you want to achieve.
First of all, "adjusting" the header so that it lists only references with mapped reads is not required to process the BAM file. If you have reads mapped on chr1 and chr2, it does no harm that chr3 and chr4 are still in the header.
Nevertheless, if you still want to "adjust" the header, you will need to write a small script. There are several options:
- Using HTSlib, the bam_hdr_t structure (named sam_hdr_t in recent versions) lets you modify what you need. From this, you can write a C program that writes a new BAM containing ONLY the mapped reads and their "adjusted" references.
- If you are not familiar with C, you can use Python or another language to parse the BAM and check which references have no reads (using Pysam, for instance).
- Using Bash commands, based on @Pierre Lindenbaum's solution. The command he provided gives you a list of UNIQUE references with at least 1 mapped read. You can then remove from the header the references that do not appear in that list.
samtools view -F 4 in.bam | cut -f 3 | sort | uniq <- gives you the list of references with at least 1 mapped read.
samtools view -H in.bam > adj.sam <- writes the header in SAM format; you then just have to remove the @SQ lines of the unwanted references.
samtools view -F 4 in.bam >> adj.sam <- appends all the mapped reads in SAM format.
samtools view -h -b -S adj.sam > adj.bam <- converts the SAM back to BAM.
This "hand-fix" solution is fine if you only have 1 BAM to process. Otherwise, I recommend writing a script or small program to do this in an automated way. ;)
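As a rough sketch of what such a script could look like, the Python below wraps the same samtools commands and removes the unused @SQ lines automatically. The function names (mapped_references, prune_header, adjust_bam) and the temporary file name adj.sam are my own choices for illustration, and it assumes samtools is on your PATH. Treat it as a starting point, not a tested tool.

```python
import subprocess


def mapped_references(sam_records):
    """Collect the reference names (column 3, RNAME) seen in SAM record lines."""
    refs = set()
    for line in sam_records:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":
            refs.add(fields[2])
    return refs


def prune_header(header_lines, keep_refs):
    """Drop @SQ lines whose SN: name is not in keep_refs; keep everything else."""
    kept = []
    for line in header_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@SQ":
            name = next((f[3:] for f in fields[1:] if f.startswith("SN:")), None)
            if name not in keep_refs:
                continue
        kept.append(line)
    return kept


def adjust_bam(in_bam, out_bam, tmp_sam="adj.sam"):
    """Write out_bam with only mapped reads and only the references they use.

    tmp_sam is a hypothetical intermediate file name, as in the manual steps.
    """
    # Equivalent of: samtools view -H in.bam
    header = subprocess.run(
        ["samtools", "view", "-H", in_bam],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines(keepends=True)

    # Equivalent of: samtools view -F 4 in.bam | cut -f 3 | sort | uniq
    with subprocess.Popen(
        ["samtools", "view", "-F", "4", in_bam],
        stdout=subprocess.PIPE, text=True,
    ) as proc:
        refs = mapped_references(proc.stdout)

    # Pruned header first, then the mapped reads appended to the same SAM file.
    with open(tmp_sam, "w") as out:
        out.writelines(prune_header(header, refs))
        out.flush()  # make sure the header is on disk before samtools appends
        subprocess.run(["samtools", "view", "-F", "4", in_bam], stdout=out, check=True)

    # Equivalent of: samtools view -b adj.sam > adj.bam
    subprocess.run(["samtools", "view", "-b", "-o", out_bam, tmp_sam], check=True)
```

You would then call adjust_bam("in.bam", "adj.bam") for each file. It reads the input twice (once to find the references, once to copy the reads), which keeps memory use low at the cost of some extra run time.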