from .BAM to .BAI using samtools
5
14
Entering edit mode
8.5 years ago
Kizuna ▴ 860

Hi !

I have many .bam that I want to get their .bai using samtools in the terminal.

I tried the following command :

samtools index *.bam


However, I did not get any .bai file.

Regards

sequencing samtools next-gen • 105k views
20
Entering edit mode
8.5 years ago

using GNU parallel:

parallel  samtools index ::: *.bam

0
Entering edit mode

thanks this is useful. Do you know what is the optimal way of setting -j ? Without setting it does it just default to max cores? Or should I set it to something total core minus 1. That is if I have 8 then I will just set it to 7? thanks.

0
Entering edit mode

From the manual:

0 means as many as possible. Default is 100% which will run one job per CPU on each machine.

So you shouldn't normally have to worry about it unless, for example, you want to constrain the number of jobs to be less than your number of CPU cores, or you want two jobs per CPU core.

6
Entering edit mode
8.5 years ago
donfreed ★ 1.6k

Samtools index only accepts a single input file, so using a shell metacharacter to specify multiple files will not work. I usually use a shell wrapper to run samtools index on a single file at a time.

#!/usr/bin/env bash

# index_all.sh

for INFILE in "$@" do samtools index$INFILE
done


Then it is simple to run:

./index_all.sh /directory/*.bam

8
Entering edit mode
8.5 years ago

You need to point the results to a file to create this:

So for one file it would be

samtools index file.bam  file.bam.bai


See this link for a great description.

In your case with many bam files I would do it in a shell script as follows:

#!/bin/bash
for I in *.bam
do
echo "Indexing: "$i samtools index$i $i".bai" done  ADD COMMENT 3 Entering edit mode The samtools index foo.bam foo.bam.bai syntax won't work with the two most recent versions of samtools. The way to do this is simply samtools index foo.bam. Enabling people to specify alternate index filenames is low on the priority list. ADD REPLY 0 Entering edit mode @Devon Ryan: Yes indeed, with my samtools version I need to write samtools index foo.bam So if I understood well , You are saying that the *.bam won't work with my samtools version? ADD REPLY 4 Entering edit mode *.bam won't work with any samtools version. I should note that understanding why this is the case will be helpful for you in general (this will be the case for many tools), so I'll explain that. Assume you're in a directory with three BAM files: A.bam, B.bam and C.bam. When you type samtools index *.bam, your shell sees *.bam and expands it. Consequently, what samtools sees you as running is samtools index A.bam B.bam C.bam. That'd be fine if samtools index could accept more than one input file at a time, but it can't. Further, it may either then see you as using the alternate syntax that Jonathan Crowther mentioned or simply die due to not knowing what to do (I'd have to check the source code, though I expect the former would happen. So, what you want to do is simply iterate over the files with a for loop: for f in *.bam do samtools index$f
done
0
Entering edit mode

0
Entering edit mode

@Jonathan Crowther : Tx! I will give it a try

4
Entering edit mode
3.8 years ago
eidriangm ▴ 40

bash one-liner loop:

# Considering that you are inside the folder where the bams are: for bam in $(ls *.bam); do samtools index$bam; done

1
Entering edit mode

This was the only solution working for me on WSL Ubuntu working on files that are stored in Windows folder! Donfreeds solution gave me "samtools-command not found error" and Pierre Lindenbaum's solution didn't end in my case... doing nothing. Thanks for posting!

0
Entering edit mode

this works fine given that parallel installed " parallel samtools index ::: *.bam" as @Pierre Lindenbaum answered

1
Entering edit mode
3.2 years ago

Convert and sort sam to bam

samtools view -bS file.sam | samtools sort -o file.bam


Create an index for the sorted bam

samtools index file.bam file.bai


samtools view -bS file.sam | cut -f1 | sort | uniq | wc -l

2
Entering edit mode

thanks for contributing. Please be sure though that you do not repost answers.

This answer has already been given 5 times in this thread including the accepted one. Also note that current samtools versions do not require a view command prior to sort. sort will happily read SAM and output BAM, please check the manual for details (-O BAM option and automatic format detection based on output file suffix with -o).

Also the read counting is not correct. First the view will output a BAM file so a binary format that is not compatible with cut etc. commands. Also even if it was a SAM file it would count the header (if you print it via samtools view -h) but in any case it counts all reads (= also unmapped ones) so the result is not reliable. Use samtools flagstat instead which is specialized code for exactly what you want to do. This works both on SAM/BAM/CRAM format.