What is the best practice for BAM sorting?
6.5 years ago

Hi guys,

I am aligning RNA-seq data with STAR and will need sorted BAM files in later steps. What do people suggest for BAM sorting? The options I have considered or used include:

  • STAR --outSAMtype BAM SortedByCoordinate - but this crashes because it runs out of memory (our server has only 128 GB of RAM). It happens when I merge lanes of paired-end reads in STAR,
  • e.g. STAR --readFilesIn read1_lane1_r1.fq,read1_lane2_r1 read1_lane1_r2,read1_lane2_r2. This is fixable by limiting the RAM used for sorting with STAR --limitBAMsortRAM
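For reference, a minimal sketch of the first option with the sorting RAM capped. STAR's --limitBAMsortRAM takes a value in bytes, so the helper converts from gigabytes. The index path, read file names, thread count, output prefix, and the 30 GB figure are all placeholders chosen for illustration, not values from this post.

```shell
#!/usr/bin/env bash
# Sketch: build a STAR command that caps the RAM used for BAM sorting.
# All paths, file names and numbers below are hypothetical placeholders.

# --limitBAMsortRAM expects bytes, so convert from (decimal) gigabytes.
gb_to_bytes() {
  echo $(( $1 * 1000 * 1000 * 1000 ))
}

# Print the full STAR command for a given sorting-RAM budget in GB.
star_sorted_cmd() {
  local sort_ram_gb=$1
  echo "STAR --runThreadN 8" \
       "--genomeDir /path/to/star_index" \
       "--readFilesIn lane1_R1.fq,lane2_R1.fq lane1_R2.fq,lane2_R2.fq" \
       "--outSAMtype BAM SortedByCoordinate" \
       "--limitBAMsortRAM $(gb_to_bytes "$sort_ram_gb")" \
       "--outFileNamePrefix sample1_"
}

# Print the command for review; pipe the output to bash to actually run it.
star_sorted_cmd 30
```

The lanes are comma-separated within each mate, with a single space between the read 1 list and the read 2 list.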

OR

  • simply output unsorted BAM files from STAR (which I am doing now) and use samtools sort -o outSorted.bam inFromSTAR.bam
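A sketch of that second option with threads and memory set explicitly. In samtools sort, -m is a per-thread limit, so the helper splits an overall budget across threads. The 8 threads and 32 GB budget are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: coordinate-sort STAR's unsorted BAM with samtools, then index it.
# Thread count and memory budget are hypothetical; adjust to your server.

# samtools sort -m is per thread, so divide the total budget (GB) by threads.
per_thread_mem() {
  local total_gb=$1 threads=$2
  echo "$(( total_gb * 1024 / threads ))M"
}

# Print the sort and index commands for the given input and output files.
sort_cmds() {
  local in_bam=$1 out_bam=$2 threads=$3 total_gb=$4
  echo "samtools sort -@ $threads -m $(per_thread_mem "$total_gb" "$threads") -T ${out_bam%.bam}.tmp -o $out_bam $in_bam"
  echo "samtools index $out_bam"
}

# Print the commands for review; pipe the output to bash to run them.
sort_cmds inFromSTAR.bam outSorted.bam 8 32
```

The -T prefix controls where temporary chunks are written, which matters when the default location is short on disk space.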

OR

  • Are there any other suggestions for the best practice to get sorted BAM files?

Are there performance differences between STAR and samtools, for example? And are the sorting algorithms the same, or at least similar, across the different tools?

Cheers,

Forum STAR samtools RNA-Seq • 6.6k views
6.5 years ago
poisonAlien ★ 3.1k

The sorting tool is a matter of preference. STAR, samtools sort, Picard SortSam and sambamba sort all do the same thing. (samtools sort and sambamba are multithreaded and work much faster. Plus, sambamba indexing works at lightning speed.) However, an unsorted BAM file is processed much faster by counting tools such as featureCounts: it takes about 5 minutes to assign 60 million reads to genes from an unsorted BAM file, against about 60 minutes for a position-sorted one, and most of that time is spent on format conversion.
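As a rough illustration of how interchangeable these tools are, here is a sketch that prints a coordinate-sort command for each of the three external sorters. The file names, the thread count, and the picard.jar location are placeholders, and the Picard line uses the classic I=/O= argument style.

```shell
#!/usr/bin/env bash
# Sketch: equivalent coordinate-sort commands for three tools.
# File names, thread count and picard.jar path are hypothetical.

sort_cmd() {
  local tool=$1 in_bam=$2 out_bam=$3
  case "$tool" in
    samtools) echo "samtools sort -@ 8 -o $out_bam $in_bam" ;;
    sambamba) echo "sambamba sort -t 8 -o $out_bam $in_bam" ;;
    picard)   echo "java -jar picard.jar SortSam I=$in_bam O=$out_bam SORT_ORDER=coordinate" ;;
    *)        echo "unknown tool: $tool" >&2; return 1 ;;
  esac
}

# Print one command per tool for comparison.
for tool in samtools sambamba picard; do
  sort_cmd "$tool" in.bam out.bam
done
```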

Maybe go with your second option: output unsorted BAM files, get counts with featureCounts against your reference GTF (if that is what you are up to), then sort the BAM with any of the above tools and remove the unsorted one if you want to save disk space.
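The workflow suggested here could be sketched roughly as follows. It defaults to a dry run that only prints the commands. Aligned.out.bam is the name STAR gives its unsorted output when no prefix is set; the GTF name, sample prefix, and thread counts are placeholders. The -p flag assumes paired-end data, and recent featureCounts releases may additionally need --countReadPairs to count fragments, so check your version.

```shell
#!/usr/bin/env bash
# Sketch: count on the unsorted BAM first, then sort, index and clean up.
# File names and thread counts are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

# Either echo a command or execute it, depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

count_then_sort() {
  local unsorted=$1 gtf=$2 prefix=$3
  # Each step runs only if the previous one succeeded, so the unsorted
  # BAM is deleted only after counting, sorting and indexing all worked.
  run featureCounts -T 8 -p -a "$gtf" -o "${prefix}.counts.txt" "$unsorted" &&
  run samtools sort -@ 8 -o "${prefix}.sorted.bam" "$unsorted" &&
  run samtools index "${prefix}.sorted.bam" &&
  run rm "$unsorted"
}

count_then_sort Aligned.out.bam genes.gtf sample1
```

Run it with DRY_RUN=0 once the printed commands look right.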


Great, thanks for the answer. I know that featureCounts is faster than htseq-count. Would you say that htseq-count also works faster on an unsorted BAM? Thanks.
