I just noticed something while building bai files with Picard tools' BuildBamIndex. The command I normally use is:
java -jar -Xmx4g BuildBamIndex.jar I=foo.bam O=foo.bam.bai
I did the same for a bam file from a recently sequenced library, mapped with bwa 0.5.9, and got this error:
SAM validation error: ERROR: Record 48168030, Read name FOO, MAPQ should be 0 for unmapped read
From the seqanswers and biostars forums, I found that this can happen with BWA output and that one should run Picard with VALIDATION_STRINGENCY=LENIENT.
With that setting, Picard only issued a warning for those reads and created the bai files successfully. What concerns me is the size of the index files.
For the libraries I sequenced before, the bam files were around 4.5GB and the bai files were about 370KB. For this library, however, the bam files are around 5.5GB and the bai files are only about 206KB. Both data sets come from Arabidopsis thaliana, though the experiments are different. I should mention that both were indexed with picard-1.65 using the same parameters, including the default COMPRESSION_LEVEL=5. Among the files from the recently sequenced library (all mapped with bwa), the bai files are all around the same size as each other.
I am wondering whether the smaller index has anything to do with the error/warning. Is this something to worry about? How can I verify that this index file is correct?
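
One sanity check I am considering is to read the bai file directly and count what is in it. Below is a minimal Python sketch, assuming the file follows the BAI layout described in the SAM specification (magic, then per-reference bins, chunks and linear-index intervals, with an optional trailing count of reads without coordinates). The function name summarize_bai is my own and not part of Picard or samtools. As far as I understand, the index size depends on how many bins and chunks the reads fall into rather than on the bam size, so comparing these counts between an old and a new library might explain the difference.

```python
import struct


def summarize_bai(path):
    """Return (per-reference counts, n_no_coor) parsed from a .bai file.

    Assumes the BAI layout from the SAM specification. A truncated or
    malformed file raises ValueError or struct.error.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    if data[:4] != b"BAI\x01":
        raise ValueError("not a BAI file (bad magic)")
    pos = 4
    (n_ref,) = struct.unpack_from("<i", data, pos)
    pos += 4
    refs = []
    for _ in range(n_ref):
        (n_bin,) = struct.unpack_from("<i", data, pos)
        pos += 4
        n_chunks = 0
        for _ in range(n_bin):
            # Each bin: uint32 bin id, int32 chunk count, then 16 bytes per chunk.
            # Note: an optional metadata pseudo-bin (id 37450) is counted here too.
            _bin_id, n_chunk = struct.unpack_from("<Ii", data, pos)
            pos += 8 + 16 * n_chunk
            n_chunks += n_chunk
        # Linear index: int32 interval count, then one uint64 offset per interval.
        (n_intv,) = struct.unpack_from("<i", data, pos)
        pos += 4 + 8 * n_intv
        refs.append({"bins": n_bin, "chunks": n_chunks, "intervals": n_intv})
    # Optional trailing uint64: number of reads without coordinates.
    n_no_coor = None
    if len(data) - pos >= 8:
        (n_no_coor,) = struct.unpack_from("<Q", data, pos)
    return refs, n_no_coor
```

Calling refs, n_no_coor = summarize_bai("foo.bam.bai") should give one entry per reference sequence, so for a correct index the length of refs ought to match the number of @SQ lines in the bam header. Running samtools idxstats on the bam, or rebuilding the index with samtools index and comparing, would presumably be an independent cross-check.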