steve, 4.9 years ago
    Preferably, using either standard GNU tools, or perhaps something in the Python standard library. Any ideas?
A BAM file is a binary file storing alignment information, so you need something like the htslib API (htslib.h) to interpret it. If you use GNU tools, you need htslib.h; if you use Python, pysam is probably the best choice for you.
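For anyone curious about the pure standard-library route from the question, here is a rough sketch, not a substitute for samtools or pysam. It assumes the layout described in the SAM/BAM specification: BGZF blocks that are valid gzip members, a header starting with the magic bytes BAM\1, then records each prefixed by a 4-byte little-endian length. The function name count_bam_records is made up for this example.

```python
import gzip
import struct


def count_bam_records(stream):
    """Count alignment records in a BAM stream opened in binary mode."""
    # BGZF is a series of gzip members, which GzipFile reads back to back.
    with gzip.GzipFile(fileobj=stream, mode="rb") as bam:
        if bam.read(4) != b"BAM\x01":
            raise ValueError("not a BAM stream")
        (l_text,) = struct.unpack("<i", bam.read(4))
        bam.read(l_text)                      # skip the SAM header text
        (n_ref,) = struct.unpack("<i", bam.read(4))
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<i", bam.read(4))
            bam.read(l_name + 4)              # skip reference name and its length field
        count = 0
        while True:
            size_bytes = bam.read(4)
            if len(size_bytes) < 4:           # no more records
                break
            (block_size,) = struct.unpack("<i", size_bytes)
            bam.read(block_size)              # skip the record body
            count += 1
        return count
```

Called as count_bam_records(open("input.bam", "rb")), it counts every record, mapped or not, which should correspond to samtools view -c with no filters. It is likely to be slower than samtools on large files.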
If the BAM is uncompressed, you could just use wc -l, which will return the number of lines (= the number of alignments, assuming there is no unaligned entry in the file). But samtools is much better.
Edit: I got it wrong, see comments below.
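A small illustration of why counting newlines cannot work on binary records. This is a toy sketch, not real BAM data: the records here are hypothetical 4-byte little-endian integers, standing in for the binary fields a BAM record contains. Any field whose bytes happen to include 0x0A looks like a line break to wc -l.

```python
import struct

# Hypothetical fixed-size binary records: one little-endian int32 each.
positions = [10, 2570, 500, 7]
blob = b"".join(struct.pack("<i", p) for p in positions)

records = len(blob) // 4            # true number of records
newline_bytes = blob.count(b"\n")   # what a newline counter such as wc -l would see

print(records, newline_bytes)
```

The two numbers differ because 10 and 2570 contain 0x0A bytes while 500 and 7 do not, so the newline count depends on the data values and not on how many records there are.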
I don't think I have ever seen a case where people actually use an uncompressed BAM file; at that point you might as well be using SAM format.
It is used when one needs to pipe a BAM file into another process (it saves compression time, as Jorge says below).
I think by definition, a bam is compressed.
Not with samtools view -bu, I think.
Even if uncompressed, which is recommended for piping purposes in order to save time compressing and decompressing output and input respectively, a BAM file is still binary, so wc -l will not work.
You are both right. I tested it to make sure:
Just a couple of comments:
-u implies -b, therefore samtools view -u input.bam is enough to get an uncompressed BAM file.
Also, if you only need to know the number of reads, generating all flagstat metrics is not that efficient, and samtools view -c input.bam would be sufficient. It won't be much faster, since most of the time is spent decompressing and reading the BAM file, but it won't be slower either, and it's simpler to write.
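If the count is needed from a Python script, one option is to shell out to the same command. A minimal sketch, assuming samtools is installed and on PATH; count_command and count_reads are hypothetical helper names, not part of any library.

```python
import shutil
import subprocess


def count_command(bam_path):
    """Build the samtools invocation that prints only the record count."""
    return ["samtools", "view", "-c", bam_path]


def count_reads(bam_path):
    """Run samtools view -c and return the count as an int."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    result = subprocess.run(count_command(bam_path), check=True,
                            capture_output=True, text=True)
    return int(result.stdout.strip())
```

For example, count_reads("input.bam") would return the same number that samtools view -c input.bam prints in the shell.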