Question: Aligning, sorting and converting to BAM in the same command - possible?
Sam wrote (10 months ago):

Is it possible to align, sort and convert to BAM in a single pipeline?

i.e.

bowtie .... | samtools sort | samtools view -bS -o sorted_output.bam

If that is possible, is this approach very inefficient in terms of demands on the PC?

bam sorting alignment • 434 views

I'm not too familiar with bowtie, but yes, provided it produces SAM with a header, or BAM. There is no need for a samtools view afterwards; samtools sort handles BAM files directly. Why would it be inefficient? Is waiting for the entire file to be written before sorting chunks, and doubling your storage, more efficient?

written 10 months ago by Gabriel R.
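
As a minimal sketch of the comment above: the aligner's SAM output is piped straight into samtools sort, which writes BAM itself, so no samtools view step is needed. The helper name `align_sort` and the example index and read file names are hypothetical; `-S` asks bowtie for SAM output, and the remaining bowtie arguments depend on your data and bowtie version.

```shell
#!/usr/bin/env bash
# Make the pipeline fail if any stage (e.g. the aligner) fails.
set -o pipefail

# align_sort OUT.bam ALIGNER [ARGS...]
# Hypothetical helper: runs the given aligner command, which must write SAM
# (with a header) to stdout, and coordinate-sorts it into OUT.bam.
align_sort() {
    local out="$1"
    shift
    # "-" tells samtools sort to read from stdin; -o names the output file.
    "$@" | samtools sort -o "$out" -
}

# Example call (index prefix and read file are placeholders):
#   align_sort sorted_output.bam bowtie -S ref_index reads.fq
```

Nothing is written to disk between the aligner and the sorter except the temporary chunks samtools sort creates itself when its memory buffer fills.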

If you think you'll be running markdup at some point, you may also want to add a "samtools fixmate -m" after the bowtie command, as that way no additional sort is required later on. Also, when piping it's often best to pipe uncompressed BAM. Some samtools commands have a "-u" option, others need "-l 0", and others have no dedicated option, so the compression level has to be passed via -O instead (e.g. "-O bam,level=0"). A rather unfortunate lack of consistency.

Eg:

bowtie ... | samtools fixmate -m -O bam,level=0 - - | samtools sort -l 0 | samtools markdup - sort_markdup.bam

Depending on speeds, you may want to add threading ("-@ 8" etc.) to specific commands. Some scale better than others, but note that as of samtools 1.10 the SAM parsing can be multi-threaded too; in the past this could sometimes be a bottleneck when paired with a high-thread-count aligner.

written 10 months ago by jkbonfield
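
The threading advice in the comment above could look roughly like this. It is only a sketch: the helper name `align_markdup` is hypothetical, the thread count is whatever suits your machine, and the aligner command is passed in unchanged. Threads are added only to sort and markdup here; which stages actually benefit is something to measure.

```shell
#!/usr/bin/env bash
# Make the pipeline fail if any stage fails.
set -o pipefail

# align_markdup OUT.bam THREADS ALIGNER [ARGS...]
# Hypothetical helper; the aligner must emit name-grouped SAM on stdout.
align_markdup() {
    local out="$1" threads="$2"
    shift 2
    # fixmate -m adds the mate score tags that markdup uses.
    # Intermediate stages write uncompressed BAM (level 0) to save CPU;
    # only the final markdup output is compressed.
    "$@" \
      | samtools fixmate -m -O bam,level=0 - - \
      | samtools sort -@ "$threads" -l 0 - \
      | samtools markdup -@ "$threads" - "$out"
}

# Example call (index prefix and read file are placeholders):
#   align_markdup sort_markdup.bam 8 bowtie -S ref_index reads.fq
```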
swbarnes2 (United States) wrote (10 months ago):

Most people pipe everything through like that. You probably don't need the view command; I'm pretty sure newer versions of samtools sort will take .sam as input and output .bam by default.


Samtools writes whatever format you specify, either via the output file suffix (such as .sam or .bam) or via the -O parameter (SAM/BAM/CRAM), so yes, the view step is not necessary. Using pipes is a good approach: it saves time by avoiding intermediate files that would have to be written to disk. The more memory you have, the more efficient it is. samtools sort has an option -m to specify how much RAM to use (per thread) for sorting before it spills data to disk as intermediate files once the allocated memory is full. Your pipes can be arbitrarily long in theory; I have commands in some specialized pipelines that go through 10 tools without producing a single intermediate file.

written 10 months ago (modified 10 months ago) by ATpoint
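
To illustrate the -m option and suffix-based output format mentioned above, here is a small sketch. The helper name `sort_stream`, the temporary-file prefix and the example values are assumptions for illustration, not recommendations.

```shell
#!/usr/bin/env bash
# Make the pipeline fail if any stage fails.
set -o pipefail

# sort_stream OUT MEM_PER_THREAD THREADS
# Hypothetical helper: coordinate-sorts SAM/BAM arriving on stdin.
sort_stream() {
    local out="$1" mem="$2" threads="$3"
    # -m is the approximate memory per sorting thread (e.g. 2G). Once that
    # buffer is full, samtools sort spills sorted chunks to temporary files
    # named after the -T prefix and merges them at the end.
    # The output format is chosen from the suffix of the -o file name.
    samtools sort -m "$mem" -@ "$threads" -T "${out%.*}.tmp" -o "$out" -
}

# Example call, placed at the end of a pipe:
#   bowtie ... | sort_stream sorted_output.bam 2G 4
```

Because -m applies per thread, total memory use grows with the thread count, so the two values are best chosen together.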