As we all know, the BAM files required by GATK need a header. We can add it explicitly via the -R parameter. But if you didn't use -R, you can also use samtools -T genome.fa to add the header. What is the difference between the two?
OK, now we can talk about this ;)
I would recommend you also have a look at the SAM specs (page 3 ff.) to understand what types of header information exist in a BAM/SAM file.
The -R parameter during alignment adds a ReadGroup (@RG) to the BAM file. This lets you store information such as the sample name or the sequencing platform in the BAM file. This is useful, and often necessary, for downstream analyses like variant calling. It is best practice to always include a ReadGroup with at least the sample name.
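To make the ReadGroup part concrete, here is a minimal sketch of how such a string could be assembled. It assumes the aligner is bwa mem, whose -R option takes the read group header line with a literal `\t` between fields. The function name `build_read_group` and the values `rg1`, `sample1` and `ILLUMINA` are made up for illustration.

```python
def build_read_group(rg_id, sample, platform="ILLUMINA"):
    """Build a read group line with literal backslash-t separators.

    Assumption: the result is passed to `bwa mem -R`, which converts
    the two-character sequence backslash + t into a real tab itself.
    """
    fields = [("ID", rg_id), ("SM", sample), ("PL", platform)]
    return "@RG" + "".join(f"\\t{tag}:{value}" for tag, value in fields)


if __name__ == "__main__":
    # Hypothetical IDs; replace with your own sample information.
    print(build_read_group("rg1", "sample1"))
```

The printed string would then be quoted on the command line, roughly `bwa mem -R '<string>' genome.fa reads.fq`, where the file names are placeholders.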
What every aligner should automatically store in the BAM header is information about the contigs (name and length) used during alignment. This information is introduced by @SQ lines. It is not mandatory, but just like with the ReadGroup, there are tools that need it, for example samtools view. If this information is missing, you can use the -T parameter to provide the reference genome file used for alignment, and samtools will extract the information it needs from there instead of using the header.
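If you want to check which of the two kinds of header lines a file already has, something like the following sketch may help. It works on the header as plain text (for example the output of `samtools view -H`). The function name `parse_sam_header` and the toy header below are assumptions for illustration, not real data.

```python
def parse_sam_header(header_text):
    """Collect contigs from @SQ lines and read groups from @RG lines."""
    contigs = {}      # contig name -> length
    read_groups = {}  # read group ID -> dict of all its tags
    for line in header_text.splitlines():
        if not line.startswith("@"):
            break  # header lines come first; stop at the first other line
        record_type, *fields = line.split("\t")
        tags = dict(field.split(":", 1) for field in fields if ":" in field)
        if record_type == "@SQ":
            contigs[tags["SN"]] = int(tags["LN"])
        elif record_type == "@RG":
            read_groups[tags["ID"]] = tags
    return contigs, read_groups


# Toy header with made-up values, only to show the expected layout.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@RG\tID:rg1\tSM:sample1\tPL:ILLUMINA\n"
)

if __name__ == "__main__":
    contigs, read_groups = parse_sam_header(example_header)
    print("contigs:", contigs)
    print("read groups:", read_groups)
```

An empty `contigs` result would point to missing @SQ lines (the case where -T helps), while an empty `read_groups` result would point to a missing ReadGroup.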
fin swimmer
Hello flystar233,
which exact samtools command do you mean? Depending on that, the -T parameter means very different things.
fin swimmer
Oh, sorry, please see the code:
-