Hi all,
We are thinking about ways to make our production pipeline run faster. We're settled on an aligner, but everything that gets us from SAM/BAM creation to a sorted, merged, duplicate-marked BAM could be updated.
We'll need tools to help us:
- sort
- merge
- mark duplicates
- generate flagstat summaries
- build BAM indices (.bai)
On my list of tools to evaluate, I have some combination of the following:
- samtools
- Picard
- sambamba
- samblaster
Are there any tools that I am missing? Are there any combinations of tools that people find particularly effective?
Our current workflow is:
- Align with BWA
- Convert to BAM and sort (samtools)
- Mark duplicates in each lane-level BAM (Picard)
- Merge all the lanes for a sample and mark duplicates again (Picard)
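To make the workflow above concrete, here is a rough sketch of it as a small Python helper that only builds the command strings and runs nothing. The file names, thread count, read-group fields, and `picard.jar` path are placeholders I made up for illustration, and the Picard arguments use the legacy `KEY=value` style, so check the exact options against whatever versions you have installed.

```python
import shlex


def current_workflow(sample, lanes, ref="ref.fa", threads=4, picard="picard.jar"):
    """Build (but do not run) shell commands for the four workflow steps."""
    q = shlex.quote
    cmds, lane_bams = [], []
    for lane in lanes:
        prefix = f"{sample}.{lane}"  # hypothetical naming scheme
        sorted_bam = f"{prefix}.sorted.bam"
        dedup_bam = f"{prefix}.dedup.bam"
        # Read group so lanes stay distinguishable after merging (assumed fields).
        read_group = f"@RG\\tID:{lane}\\tSM:{sample}\\tLB:{sample}\\tPL:ILLUMINA"
        # Steps 1 and 2: align with BWA, pipe straight into a coordinate sort.
        cmds.append(
            f"bwa mem -t {threads} -R {q(read_group)} {q(ref)} "
            f"{q(prefix + '.R1.fastq.gz')} {q(prefix + '.R2.fastq.gz')} "
            f"| samtools sort -@ {threads} -o {q(sorted_bam)} -"
        )
        # Step 3: mark duplicates within the lane.
        cmds.append(
            f"java -jar {q(picard)} MarkDuplicates I={q(sorted_bam)} "
            f"O={q(dedup_bam)} M={q(prefix + '.dedup.metrics.txt')}"
        )
        lane_bams.append(dedup_bam)
    # Step 4: merge all lanes for the sample, then mark duplicates again.
    merged_bam = f"{sample}.merged.bam"
    inputs = " ".join(f"I={q(bam)}" for bam in lane_bams)
    cmds.append(f"java -jar {q(picard)} MergeSamFiles {inputs} O={q(merged_bam)}")
    cmds.append(
        f"java -jar {q(picard)} MarkDuplicates I={q(merged_bam)} "
        f"O={q(sample + '.bam')} M={q(sample + '.metrics.txt')}"
    )
    return cmds


if __name__ == "__main__":
    for cmd in current_workflow("S1", ["L1", "L2"]):
        print(cmd)
```

Printing the commands rather than running them makes it easy to swap a single step for another tool and compare the resulting command lists side by side before timing anything.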
Looking forward to your suggestions!
+1 for GNU parallel. Great tool.
I'll check out elPrep and SpeedSeq.